broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

Request to update in silico tool - FATHMM #2027

Closed lynnpais closed 1 year ago

lynnpais commented 3 years ago

Is your feature request related to a problem? Please describe.
NA

Describe the solution you'd like
Update to FATHMM_MKL which includes annotations for noncoding variants - http://fathmm.biocompute.org.uk/fathmmMKL.htm

Describe alternatives you've considered
NA

Additional context
Discussed with Mike and the analysis team and thought it best to update.

hanars commented 3 years ago

Created https://github.com/broadinstitute/hail-elasticsearch-pipelines/issues/264 to track the work to get the data into the pipeline. Showing it in seqr is blocked on the pipeline work.

mike-w-wilson commented 3 years ago

Downloading v4.2a from https://sites.google.com/site/jpopgen/dbNSFP, which is where we pull FATHMM from.

hanars commented 1 year ago

@lynnpais just to clarify: should we replace the current fathmm with the new fathmm MKL, or should we show both? This data is available in the v3 pipeline, so it will be added to seqr when the new backend goes live.

lynnpais commented 1 year ago

Yes, replace the current fathmm with the new fathmm MKL which is more comprehensive.

On Wed, Jul 12, 2023 at 11:54 AM hanars wrote:

@lynnpais just to clarify we should replace the current fathmm with the new fathmm MKL, or we should show both?


hanars commented 1 year ago

Note to self: possible values have changed, from ['D', 'T'] to ['D', 'N']

hanars commented 1 year ago

We also get VEST4 and MutPred as part of the corresponding dbnsfp update, so we will add those to the UI as part of this work. @lynnpais or @anneodonnell, is there any guidance on how we want to color code these scores (i.e. what the red/yellow/green cutoffs are)?

These are all available in ES as well as the v3 pipeline, so add support for both. In the v2 pipeline MutPred is in the form "0.387" and VEST4 is ".;0.015;0.015;.;0.014" or "0.491". In v3 these are single floats; VEST4 is parsed with field.split(';').find(lambda p: p != '.').
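For reference, a minimal plain-Python sketch of the same "first non-missing value" logic for the v2 string form. The function name `first_present_score` is hypothetical, and this is not the pipeline's actual code: the expression quoted above appears to be a Hail expression, which is an assumption on my part.

```python
from typing import Optional


def first_present_score(field: str) -> Optional[float]:
    """Return the first non-missing score in a ';'-separated string, or None.

    Hypothetical helper: '.' is treated as the missing-value marker,
    matching the example strings in this thread.
    """
    # Take the first entry that is not the '.' placeholder, if any.
    first = next((p for p in field.split(';') if p != '.'), None)
    return float(first) if first is not None else None


print(first_present_score('.;0.015;0.015;.;0.014'))
print(first_present_score('0.491'))
print(first_present_score('.;.'))
```

With the example strings from this thread, the multi-valued VEST4 field resolves to its first populated entry, and a field made up only of '.' placeholders yields None.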

lynnpais commented 1 year ago

VEST4
Green: ≤0.449
Yellow: 0.5-0.763
Red: ≥0.764

MutPred
Green: ≤0.391
Yellow: 0.392-0.736
Red: ≥0.737

Additional info: these values are from Table 2 in a new paper from the ClinGen team - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9748256/. We'll need to update ranges for existing predictors in seqr. I'll submit a separate ticket for this once I discuss with the analyst team.
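A rough sketch of how these cutoffs could map a score to a color, in case it helps the discussion. The names `CUTOFFS` and `score_color` are hypothetical and this is not seqr's implementation. One assumption to flag: the listed VEST4 yellow range starts at 0.5, so scores between 0.449 and 0.5 are not covered by the listed ranges; the sketch treats anything strictly between the green and red cutoffs as yellow.

```python
# (green_max, red_min) per predictor, copied from the cutoffs listed above.
CUTOFFS = {
    'VEST4': (0.449, 0.764),
    'MutPred': (0.391, 0.737),
}


def score_color(predictor: str, score: float) -> str:
    """Map a predictor score to 'green', 'yellow', or 'red'."""
    green_max, red_min = CUTOFFS[predictor]
    if score <= green_max:
        return 'green'
    if score >= red_min:
        return 'red'
    # Assumption: everything strictly between the two cutoffs is yellow,
    # including VEST4 scores in the unlisted 0.449-0.5 interval.
    return 'yellow'


for predictor, score in [('VEST4', 0.449), ('VEST4', 0.6), ('MutPred', 0.737)]:
    print(predictor, score, score_color(predictor, score))
```

Keeping only the green and red boundaries per predictor means the yellow band never needs its own entry, so any later change to the ranges touches one tuple.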

lynnpais commented 1 year ago

Analysts would like to discuss this further. Will try to share an update later this week.

hanars commented 1 year ago

FATHMM's source has been updated, and VEST4 and MutPred have been added.