zwdzwd / transvar

TransVar - multiway annotator for precision medicine

Querying upstream and downstream (including 5'UTR and 3'UTR) #9

Open chelseaju opened 7 years ago

chelseaju commented 7 years ago

Hi, thanks for making this software available on GitHub. It is a great tool that makes it much easier for me to retrieve genomic positions when looking at variants.

I have a few questions about retrieving positions upstream and downstream of a gene. If the mutation is upstream of the start codon (in either the 5' flanking region or the 5'UTR), the HGVS position usually starts with a minus sign, for example "KCNJ11:c.-134G>T" (http://www.ncbi.nlm.nih.gov/clinvar/variation/8674/). TransVar rejects this notation with the error "invalid position string". A workaround is to replace the leading "-" with "1-": the modified query KCNJ11:c.1-134G>T works.
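The workaround described above can be sketched as a small string rewrite applied before the query is passed to the tool. This is only an illustration: the function name `apply_one_minus_workaround` is hypothetical and not part of TransVar, and the sketch assumes queries of the form `GENE:c.-N...`.

```python
import re

# Matches a cDNA position that starts with a bare minus sign, e.g. "c.-134G>T".
_NEGATIVE_CDNA = re.compile(r"(c\.)-(\d+)")


def apply_one_minus_workaround(query: str) -> str:
    """Rewrite 'c.-N...' as 'c.1-N...'; leave any other query unchanged.

    Hypothetical helper for illustration only, not a TransVar function.
    """
    return _NEGATIVE_CDNA.sub(r"\g<1>1-\g<2>", query, count=1)


print(apply_one_minus_workaround("KCNJ11:c.-134G>T"))
```

Queries that already use the "1-" form, or that use the "*" form for downstream positions, pass through unchanged.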

Downstream mutations, such as c.*143A>T, are trickier. Since it is not trivial to get the position of the stop codon, I am wondering whether there is a way to handle this case. Thanks!!
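To show why the stop codon position matters here: in HGVS numbering, c.*N counts N bases past the last base of the stop codon, so converting it to an offset from c.1 requires the length of the coding sequence. The sketch below assumes that length is already known. The function `cdna_offset` and its `cds_length` parameter are hypothetical, and the value 300 in the usage line is a made-up length, not that of any real transcript.

```python
import re


def cdna_offset(position: str, cds_length: int) -> int:
    """Return a 1-based offset from the first coding base (c.1).

    cds_length is the coding sequence length in bases, stop codon included.
    It is a hypothetical input that the caller must look up per transcript.
    """
    match = re.fullmatch(r"\*(\d+)", position)
    if match:
        # c.*N lies N bases past the last base of the stop codon.
        return cds_length + int(match.group(1))
    if position.isdigit():
        # Plain coding position such as "76".
        return int(position)
    raise ValueError(f"unsupported position string: {position!r}")


# 300 is an invented CDS length used only to exercise the function.
print(cdna_offset("*143", cds_length=300))
```

Upstream positions such as "-134" are deliberately rejected in this sketch, since they need no CDS length and are handled by sign alone.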

zwdzwd commented 7 years ago

Thanks for the suggestion. I have modified the code to support SNV annotation in the 5' UTR (without adding "1-") and in the 3' UTR. This is reflected in the documentation as well, where I used KCNJ11:c.-134G>T and MSH2:c.*95C>T as examples (http://transvar.readthedocs.io/en/latest/annotation_from_cdna_level.html#cdna-variant). Region annotation should work the same way. It is possible that deletions and insertions still have some glitches; I am testing them against ClinVar. If you find something, please let me know! Thanks.