SACGF / variantgrid

VariantGrid public repo

ClinGenAllele - MT not in GRCh37 #1018

Open davmlaw opened 6 months ago

davmlaw commented 6 months ago

CA658659094 is on MT (which is contig NC_012920.1 - same on both builds)

However, it only lists it as being for GRCh38 - so 37 fails with:

CA658659094/GRCh37 not in ClinGenAllele genomicAlleles response 

I should contact ClinGen about returning a GRCh37 entry for it as well

And we could handle this case as the contigs are the same

davmlaw commented 6 months ago

Made it work via contigs (so that 37 will look for the shared contig in GRCh38)

This should only take a few extra string comparisons per lookup so not too bad
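A minimal sketch of the contig fallback described above, not the actual VariantGrid code. The `referenceGenome` and `hgvs` keys are assumed to mirror entries in the ClinGen `genomicAlleles` list; `BUILD_CONTIGS`, `find_genomic_allele` and the sample HGVS values are hypothetical names and data used only for illustration.

```python
# Hypothetical (and heavily truncated) table of RefSeq contigs per build.
# NC_012920.1 (MT) is the same contig in both builds.
BUILD_CONTIGS = {
    "GRCh37": {"NC_000001.10", "NC_012920.1"},
    "GRCh38": {"NC_000001.11", "NC_012920.1"},
}


def find_genomic_allele(genomic_alleles, build):
    """Return the genomicAlleles entry usable for `build`, or None."""
    # First pass: an entry explicitly labelled with the requested build.
    for allele in genomic_alleles:
        if allele.get("referenceGenome") == build:
            return allele

    # Fallback: an entry for another build whose contig also exists in the
    # requested build (a few extra string comparisons per lookup).
    wanted_contigs = BUILD_CONTIGS[build]
    for allele in genomic_alleles:
        for hgvs in allele.get("hgvs", []):
            contig = hgvs.split(":", 1)[0]
            if contig in wanted_contigs:
                return allele
    return None


# Assumed shape: an MT allele that is only listed under GRCh38.
mt_only_38 = [{"referenceGenome": "GRCh38", "hgvs": ["NC_012920.1:m.1A>C"]}]
print(find_genomic_allele(mt_only_38, "GRCh37") is mt_only_38[0])  # True
```

An allele on a build-specific contig (e.g. `NC_000001.11`) still returns `None` for GRCh37, so the fallback only applies where the contig really is shared.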

EmmaTudini commented 3 months ago

Testing:

Actual output: Failed. 4, 5: both went to the allele page when the genome build was set to GRCh38, without any warning that the alt was different, BUT the correct warning was shown when the preferred build was GRCh37 (even though the variant was imported as GRCh38)

7 – showed no results (in both genome builds)

@davmlaw – not sure whether this is an MT contig issue or an ENST search issue

@davmlaw Slightly separate issues:

  1. On the allele page, the MT variants are shown with a g. prefix rather than m. (which is what the standards call for). Can this be changed as a separate issue? See doc with standards from the UK here - https://www.acgs.uk.com/media/11935/bpg-for-the-molecular-diagnosis-of-mitochondrial-disease_ratified-november-2020.pdf
  2. Clicking on the g.HGVS takes you to the variant in the wrong genome build. I think it might default to your internal genome build setting. Screenshots: Screenshot 2024-06-27 at 2.05.20 pm.png, Screenshot 2024-06-27 at 2.05.27 pm.png

Also a note that we don’t allow for import of these variants in Shariant at the moment (as far as I can tell) – have raised https://github.com/SACGF/variantgrid_shariant/issues/169 to test
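For the g. versus m. point in the list above, a rough sketch of the display change being asked for, assuming the HGVS is available as a plain string. `MITO_CONTIG` and `format_genomic_hgvs` are hypothetical names, not existing VariantGrid functions.

```python
# Hypothetical helper: show the m. prefix for variants on the mitochondrial
# contig and leave every other HGVS string untouched.
MITO_CONTIG = "NC_012920.1"


def format_genomic_hgvs(hgvs):
    contig, sep, change = hgvs.partition(":")
    if sep and contig == MITO_CONTIG and change.startswith("g."):
        return f"{contig}:m.{change[2:]}"
    return hgvs


print(format_genomic_hgvs("NC_012920.1:g.11037_11038insT"))
# NC_012920.1:m.11037_11038insT
```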

EmmaTudini commented 3 months ago

@davmlaw I'm moving this to another release, when we turn on MT variants. @TheMadBug Can you confirm that we don't currently allow MT variants, please?

TheMadBug commented 3 months ago

Right now the changes for ImportedAlleleInfo restrict users to importing only via c.HGVS and g.HGVS (not variant coordinate, even if the rest of the system is technically capable of importing via that).

Attempting to put MT:11037 A>AC into c_hgvs just results in a c.HGVS parsing error.

In future we will want to be able to import via variant_coordinate again (if just for communicating between systems), but I can confirm that right now we don't have to worry about users inserting new variants using it. We also control the importers for all but 1 of the systems at this point anyway.

EmmaTudini commented 3 months ago

@TheMadBug But you can import a c.HGVS that resolves to MT - ENST00000361381.2(MT-ND4):c.1A>C.

TheMadBug commented 3 months ago

Yes. Currently, whether we accept a c.HGVS or not is based only on the transcript prefix (which we limit to NM, NR, ENST and XR) and on it being a c. or g.

From your comment 2 weeks ago it sounds like you've already imported one, but here's the example you just provided https://test.shariant.org.au/classification/imported_allele_info/35539

As for all our importers, there's nothing that would cause them to reject such a value, as from a programming point of view it appears to be a pretty standard c.HGVS. The main thing is that the last time we even got an Ensembl transcript was in 2022, from CHW, for a record that was made in 2020.
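To illustrate why the ENST example passes while `MT:11037 A>AC` does not, here is a small sketch of a prefix-based acceptance check. It is a literal reading of the rule described in this comment, not the real importer code; `ALLOWED_TRANSCRIPT_PREFIXES`, `HGVS_PATTERN` and `is_accepted_hgvs` are hypothetical names, and the regex is a simplification.

```python
import re

# Prefixes named in the comment above.
ALLOWED_TRANSCRIPT_PREFIXES = ("NM", "NR", "ENST", "XR")

# Simplified shape: accession, optional (gene), ":", one-letter kind, ".", change
HGVS_PATTERN = re.compile(
    r"^(?P<accession>[^\s:(]+)(?:\((?P<gene>[^)]+)\))?:(?P<kind>[a-z])\.(?P<change>.+)$"
)


def is_accepted_hgvs(hgvs):
    match = HGVS_PATTERN.match(hgvs.strip())
    if match is None:
        return False  # e.g. a variant coordinate, which is not HGVS at all
    if match["kind"] not in ("c", "g"):
        return False
    return match["accession"].startswith(ALLOWED_TRANSCRIPT_PREFIXES)


print(is_accepted_hgvs("ENST00000361381.2(MT-ND4):c.1A>C"))  # True
print(is_accepted_hgvs("MT:11037 A>AC"))                     # False
```

Nothing in a check like this looks at which contig the transcript sits on, which is why a c.HGVS on an MT transcript goes through like any other.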

EmmaTudini commented 3 months ago

Could this be prioritised for the next deploy? I might start tagging issues. There were a few searches that failed/returned false positives

davmlaw commented 3 months ago

7 ... should show result found with alternative alt and link to allele page... without any warning that the alt was different

This was reported in "outstanding search issues" over a year ago but not done. I have split it into its own issue: SACGF/variantgrid#1106

EmmaTudini commented 3 months ago

@davmlaw There are other issues from this issue that need to be addressed outside of the one above. See my testing comment from June 27th

davmlaw commented 3 months ago

Added "not reporting reference base" to the "not reporting alt" issue - #1106

Separate issues:

  1. Mito g. instead of m. - raised as SACGF/variantgrid#1107
  2. MT build specific variant links - raised as #1108

davmlaw commented 3 months ago

Sorry, I went through and looked for bold fails the first time. I think I got them all. If I missed anything else, can you please either raise a new issue or explicitly tell me what I missed? Thanks

EmmaTudini commented 3 months ago

@davmlaw Would searching for the below, and getting different results depending on the preferred genome build, fit into #1106?

ENST00000361381.2(MT-ND4):c.278_279insT
NC_012920.1:m.11037_11038insT

Both went straight to the allele page when genome build set to GRCh38, without any warning that the alt was different BUT correct warning was shown when preferred build was GRCh37 (even though variant was imported as GRCh38)

davmlaw commented 3 months ago

I think that's a new issue: it's HGVS (those other ones are for non-HGVS searches), and it looks to be MT and build specific

davmlaw commented 3 months ago

Split off into own issue - #1115 - Search for MT HGVS behaves differently across genome builds

This issue should now hopefully only be about being able to link ClinGenAlleles w/MT variants in GRCh37 (used to only work in 38)