mutalyzer / mutalyzer2

HGVS variant nomenclature checker
https://mutalyzer.nl

Undocumented: Notations supported in addition to the HGVS #478

Closed ifokkema closed 5 years ago

ifokkema commented 5 years ago

I know Mutalyzer's name checker supports some notations in addition to the HGVS nomenclature. Examples are NC_000020.10(NM_000214.2):c.IVS3+4del and NM_000214.2:c.EX3del. I was looking for the latter, but I couldn't get it to work quickly and couldn't find it documented anywhere. Perhaps it would be good to add these descriptions here: https://github.com/mutalyzer/mutalyzer/wiki/HGVS-Mutalyzer-Differences

P.S. When I tried related descriptions, like NM_000214.2:c.E3del, I got the error message Expected "IVS" (at char 14), (line:1, col:15), which is of course quite confusing.

jfjlaros commented 5 years ago

I am not sure we should document this, as it would encourage using non-HGVS descriptions.

Since we use a generic parser for parsing the input, we have little control over the error messages it produces. The parser tries to parse the input until it has no valid way of continuing; at that point it reports the rule it was trying to follow. I do agree that it is unfortunate that this particular rule is not part of the HGVS specification.
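To make the behaviour described above concrete, here is a minimal sketch of a parser step that tries a list of alternatives and, when none fits, reports the last rule it attempted. It is not Mutalyzer's grammar: the `ALTERNATIVES` list, the function names, and the choice to report the last alternative are assumptions made only to show how a message like the one in this issue can arise.

```python
import re

# Hypothetical, simplified alternatives for what may follow "c."
# (an assumption for illustration, not Mutalyzer's real grammar).
ALTERNATIVES = [
    ("integer", re.compile(r"\d+")),
    ('"EX"', re.compile(r"EX")),
    ('"IVS"', re.compile(r"IVS")),
]


class ParseError(Exception):
    pass


def parse_position_start(text, loc):
    """Try each alternative at `loc`; if all fail, report the last rule tried."""
    expected = None
    for name, pattern in ALTERNATIVES:
        match = pattern.match(text, loc)
        if match:
            return name, match.end()
        expected = name
    raise ParseError(
        f"Expected {expected} (at char {loc}), (line:1, col:{loc + 1})"
    )


def check(description):
    """Return 'ok' or the error message for the part right after ':c.'."""
    loc = description.index(":c.") + 3
    try:
        parse_position_start(description, loc)
        return "ok"
    except ParseError as error:
        return str(error)


print(check("NM_000214.2:c.E3del"))
print(check("NM_000214.2:c.del123"))
print(check("NM_000214.2:c.EX3del"))
```

In this sketch both `c.E3del` and `c.del123` fail at the same character and therefore produce the same message naming the last alternative, even though the underlying mistakes are different.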

ifokkema commented 5 years ago

> I am not sure we should document this, as it would encourage using non-HGVS descriptions.

I would think the reverse; documenting this may stimulate others to finally convert their non-HGVS descriptions into HGVS (that is what I was using it for). It doesn't work with the position converter, just with the name checker that people can use to correct their errors.

> (...) at this point it will report the rule it is trying to follow.

I'm pretty sure the code would be too complex for me, but I suppose the EX rule would be closer to E than IVS is? Well, if it's not easily fixed, it doesn't matter too much. If a workaround is possible, that would be nice, as users putting c.del123 instead of c.123del get the same IVS message, even though a human would quickly realize what they meant. Your logs could show whether this is an error that people often make, and whether it would be worth the time.
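As a rough illustration of the kind of workaround meant here, a small pre-check could run before the real parser and catch the swapped form c.del123. This is only a sketch, not Mutalyzer code: the function name `suggest_reordered` and the regular expression are assumptions, and it only covers simple del/dup descriptions.

```python
import re

# Hypothetical pattern: reference, coordinate type, then the operation written
# before the position (e.g. "c.del123"), which HGVS writes as "c.123del".
SWAPPED = re.compile(
    r"^(?P<ref>[^:]+):(?P<kind>[cgnm])\.(?P<op>del|dup)(?P<pos>\d+(?:_\d+)?)$"
)


def suggest_reordered(description):
    """Return a reordered description if the operation precedes the position."""
    match = SWAPPED.match(description)
    if match is None:
        return None  # nothing to suggest; leave it to the real parser
    return "{ref}:{kind}.{pos}{op}".format(**match.groupdict())


print(suggest_reordered("NM_000214.2:c.del123"))  # NM_000214.2:c.123del
print(suggest_reordered("NM_000214.2:c.123del"))  # None
```

A check like this would only turn the generic parser message into a hint for one known mistake; anything it does not recognise would still fall through to the normal error.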

jfjlaros commented 5 years ago

> I would think the reverse, ...

You have a point there. Perhaps I am biased because I abuse it myself to figure out what the coordinates of an exon are.

> ... I suppose the EX rule would be closer ...

Parsers of this type work with tokens and perform no approximate matching within them, so the tokens EX and IVS are equally distant from the string E. I do not think there is a solution for this problem in general (look at some compiler or interpreter errors, for example, that totally miss the obvious cause of a malformed piece of code).
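The point about tokens can be sketched as follows. An exact token match is a yes/no test, so E fails against EX and IVS in the same way. For contrast, the sketch also shows a similarity lookup with Python's standard `difflib`; that lookup is a hypothetical extra layer, not something the parser does, and the helper names are assumptions.

```python
import difflib

TOKENS = ["EX", "IVS"]  # the two literals discussed in this thread


def matches_token(text, loc, token):
    """Exact token match: the input starts with the literal at loc, or it does not."""
    return text.startswith(token, loc)


text = "NM_000214.2:c.E3del"
loc = 14  # position of the "E" right after "c."

# Both tokens fail identically; there is no notion of "almost EX".
exact = {token: matches_token(text, loc, token) for token in TOKENS}
print(exact)


def closest_token(fragment):
    """Hypothetical suggestion layer based on string similarity."""
    hits = difflib.get_close_matches(fragment, TOKENS, n=1)
    return hits[0] if hits else None


print(closest_token("E"))    # similar enough to "EX" to be suggested
print(closest_token("del"))  # nothing similar, so no suggestion
```

The similarity lookup happens to help for E, but it has nothing to offer for c.del123, which is in line with there being no general solution.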

ifokkema commented 5 years ago

> You have a point there.

If you would reconsider, it would be great! I have a user who sent me a bunch of "Deletion of exons ..." variants. This time I couldn't point to any documentation, but having a URL to point to would save me the time of figuring it out again and explaining it :)

> I do not think there is a solution for this problem in general

I understand; thanks for explaining it!

jfjlaros commented 5 years ago

Sure, I will give it a try. Here I have written down what I remember.

ifokkema commented 5 years ago

Looks great, thank you! Will it be linked to from somewhere? For now, I'll use the URL to forward it to users.

jfjlaros commented 5 years ago

It is listed as one of the pages in the documentation, I guess that will suffice?

ifokkema commented 5 years ago

Since the page starts with a "D", it shows, so it could suffice. But if users just read the page and not the side bar, and navigate from there, they might appreciate a link from the Variant Descriptions page maybe?

I'll leave that decision to you, closing the issue regardless. Thanks again!

jfjlaros commented 5 years ago

I put a link at the bottom of this page.