zwdzwd / transvar

TransVar - multiway annotator for precision medicine

c.-22A>G works, c.-19-3A>G doesn't #21

Open mjafin opened 6 years ago

mjafin commented 6 years ago

Hi there,

Thanks for such a great tool; I've found transvar to be a massive time saver. I spotted an issue that would be great to have addressed at some point. For intronic variants there are (at least) two notations:

NM_007294.3:c.-19-3A>G
NM_007298.3:c.-22A>G

The second one works, whereas the first one produces empty fields.
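To make the difference between the two notations concrete, here is a minimal Python sketch that splits a simple cDNA substitution into an exonic anchor position and an optional intronic offset. It is not TransVar's parser: the names `CdnaPos` and `parse_cdna_snv` and the regular expression are hypothetical, and the pattern only covers single-base substitutions (no `*` 3'-UTR positions, no ranges, no indels).

```python
import re
from typing import NamedTuple, Tuple


class CdnaPos(NamedTuple):
    anchor: int  # cDNA position named in the string (negative = upstream of c.1)
    offset: int  # intronic offset relative to the anchor (0 = no offset given)


# Hypothetical pattern for illustration: "c." + anchor + optional offset + REF>ALT
_SNV = re.compile(
    r"^c\.(?P<anchor>-?\d+)(?P<offset>[+-]\d+)?(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_cdna_snv(hgvs: str) -> Tuple[CdnaPos, str, str]:
    """Split e.g. 'NM_007294.3:c.-19-3A>G' into (position, ref, alt)."""
    # Drop an optional 'ACCESSION:' prefix and keep only the 'c.' part.
    change = hgvs.split(":", 1)[-1]
    m = _SNV.match(change)
    if m is None:
        raise ValueError(f"unsupported cDNA substitution: {hgvs!r}")
    pos = CdnaPos(int(m["anchor"]), int(m["offset"] or 0))
    return pos, m["ref"], m["alt"]


print(parse_cdna_snv("NM_007294.3:c.-19-3A>G"))
print(parse_cdna_snv("NM_007298.3:c.-22A>G"))
```

The first string carries an explicit offset (anchor -19, offset -3), while the second has only an anchor (-22). The split is purely syntactic: mapping either form to a genomic coordinate still requires the transcript's exon structure.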

See https://www.ncbi.nlm.nih.gov/clinvar/variation/125471/

Is this something you could look at?

zwdzwd commented 6 years ago

Thanks for bringing up this possibility.

I might have fixed it now. With the fix:

$ transvar canno -i 'NM_007294:c.-22A>G' --refseq
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
NM_007294:c.-22A>G      NM_007294 (protein_coding)      BRCA1   -       chr17:g.41276135T>C/c.1-22A>G/. inside_[5-UTR;intron_between_exon_1_and_2]      CSQN=IntronicSNV;dbsnp=rs273898669(chr17:41276135T>C);dbxref=GeneID:672,HGNC:1100,MIM:113705;aliases=NP_009225;source=RefSeq

$ transvar canno -i 'NM_007294:c.-19-3A>G' --refseq
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
NM_007294:c.-19-3A>G    NM_007294 (protein_coding)      BRCA1   -       chr17:g.41276135T>C/c.1-22A>G/. inside_[5-UTR;intron_between_exon_1_and_2]      CSQN=IntronicSNV;dbsnp=rs273898669(chr17:41276135T>C);dbxref=GeneID:672,HGNC:1100,MIM:113705;aliases=NP_009225;source=RefSeq

Before the fix:

$ transvar canno -i 'NM_007294:c.-22A>G' --refseq
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
NM_007294:c.-22A>G      NM_007294 (protein_coding)      BRCA1   -       chr17:g.41276135T>C/c.1-22A>G/. inside_[5-UTR;intron_between_exon_1_and_2]      CSQN=IntronicSNV;dbsnp=rs273898669(chr17:41276135T>C);dbxref=GeneID:672,HGNC:1100,MIM:113705;aliases=NP_009225;source=RefSeq
$ transvar canno -i 'NM_007294:c.-19-3A>G' --refseq
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
NM_007294:c.-19-3A>G    .       .       .       ././.   .       no_valid_transcript_found

Let me know if it still fails in other cases.
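For anyone who wants to check other notation pairs the same way, a rough Python sketch follows. It compares the gDNA part of the coordinates column from two result rows. It assumes the rows are tab-delimited with the column order shown in the header above; the helper names `gdna_coordinate` and `same_locus` are hypothetical, and the sample rows are rebuilt from the output in this thread with the info column abbreviated.

```python
def gdna_coordinate(row: str) -> str:
    """Return the gDNA part of the coordinates column of one result row."""
    fields = row.rstrip("\n").split("\t")
    # Column 5 (index 4) is "coordinates(gDNA/cDNA/protein)" in the header above.
    return fields[4].split("/")[0]


def same_locus(row_a: str, row_b: str) -> bool:
    """True if both rows resolved to the same genomic coordinate."""
    a, b = gdna_coordinate(row_a), gdna_coordinate(row_b)
    # "." is what the coordinates column holds when no transcript was found.
    return a != "." and a == b


# Rows rebuilt from the thread's output (assumed tab-delimited, info shortened).
fixed_a = "\t".join([
    "NM_007294:c.-22A>G", "NM_007294 (protein_coding)", "BRCA1", "-",
    "chr17:g.41276135T>C/c.1-22A>G/.",
    "inside_[5-UTR;intron_between_exon_1_and_2]", "CSQN=IntronicSNV",
])
fixed_b = "\t".join([
    "NM_007294:c.-19-3A>G", "NM_007294 (protein_coding)", "BRCA1", "-",
    "chr17:g.41276135T>C/c.1-22A>G/.",
    "inside_[5-UTR;intron_between_exon_1_and_2]", "CSQN=IntronicSNV",
])
failed = "\t".join([
    "NM_007294:c.-19-3A>G", ".", ".", ".", "././.", ".",
    "no_valid_transcript_found",
])

print(same_locus(fixed_a, fixed_b))  # rows from the "with the fix" run
print(same_locus(fixed_a, failed))   # rows from the "before" run
```

Treating "." as a non-match means a row with no_valid_transcript_found never counts as agreeing with anything, which is the failure this issue was about.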