zwdzwd / transvar

TransVar - multiway annotator for precision medicine

three different results for the same canno input - which one is correct? #30

Open orthoceros opened 5 years ago

orthoceros commented 5 years ago

Dear TransVar developers, I have found a canno annotation problem with 2.4.8.20190122 that did not occur with 2.4.1.20180815 (same reference FASTA and annotation files). After downgrading, it works again.

With the old version, the variants 'NM_001760.3:c.786_796dup' and 'NM_001760.3:c.758_759dupAG' from a SangerSeq experiment were annotated without any errors. For example:

columns = input, transcript, gene, strand, coordinates (gDNA/cDNA/protein), region, info

values = 'NM_001760.3:c.758_759dupAG' 'NM_001760 (protein_coding)' 'CCND3' '-' 'chr6:g.41936061_41936062dupTC/c.758_759dupAG/p.S254Rfs*103' 'inside_[cds_in_exon_5]' 'CSQN=Frameshift;left_align_gDNA=g.41936057_41936058insCT;unalign_gDNA=g.41936058_41936059dupCT;left_align_cDNA=c.756_757insGA;unalign_cDNA=c.758_759dupAG;dbxref=GeneID:896,HGNC:HGNC:1585,MIM:123834;aliases=NP_001751;source=RefSeq'
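To compare such result rows between versions field by field, the columns listed above can be split into a dictionary. This is only a minimal sketch: `parse_transvar_row` is a hypothetical helper (not part of TransVar), and it assumes the row is tab-separated with exactly the seven columns named above and that the info column is a `;`-separated list of `key=value` pairs.

```python
# Hypothetical helper, not part of TransVar. Assumes a tab-separated row
# with the seven columns listed in this report.
COLUMNS = ["input", "transcript", "gene", "strand",
           "coordinates", "region", "info"]


def parse_transvar_row(row):
    """Split one result row into named fields and a parsed info dict."""
    fields = row.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    record = dict(zip(COLUMNS, fields))

    # The coordinates column packs gDNA/cDNA/protein into one string.
    gdna, cdna, protein = record["coordinates"].split("/")
    record.update(gDNA=gdna, cDNA=cdna, protein=protein)

    # Split info on ';' and then on the first '=' only, so values such as
    # 'GeneID:896,HGNC:HGNC:1585' stay intact.
    info = {}
    for item in record["info"].split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    record["info"] = info
    return record


if __name__ == "__main__":
    row = "\t".join([
        "NM_001760.3:c.758_759dupAG",
        "NM_001760 (protein_coding)",
        "CCND3",
        "-",
        "chr6:g.41936061_41936062dupTC/c.758_759dupAG/p.S254Rfs*103",
        "inside_[cds_in_exon_5]",
        "CSQN=Frameshift;left_align_cDNA=c.756_757insGA;"
        "unalign_cDNA=c.758_759dupAG;source=RefSeq",
    ])
    rec = parse_transvar_row(row)
    print(rec["cDNA"], rec["info"]["left_align_cDNA"])
```

Parsing both versions' rows this way makes it easy to see which of the cDNA, left-aligned and unaligned representations actually differ.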

With the new version, both dup variants result in no_valid_transcript_found. Querying the SNVs 'NM_001760.3:c.758A>C' and 'NM_001760.3:c.759G>T' works with the new version, so the individual reference bases seem to be correct (while, e.g., 'NM_001760.3:c.759C>G' or 'NM_001760.3:c.759A>G' fail as expected).

Interestingly, querying the dup equivalent 'NM_001760.3:c.758_759AG>AGAG' still works with the new version:

'NM_001760.3:c.758_759AG>AGAG' 'NM_001760.4 (protein_coding)' 'CCND3' '-' 'chr6:g.41936061_41936062dupTC/c.760_761dupAG/p.S254Rfs*103' 'inside_[cds_in_exon_5]' 'CSQN=Frameshift;left_align_gDNA=g.41936057_41936058insCT;unalign_gDNA=g.41936058_41936059dupCT;left_align_cDNA=c.756_757insGA;unalign_cDNA=c.758_759dupAG;dbxref=GeneID:896,HGNC:HGNC:1585,MIM:123834;aliases=NP_001751;source=RefSeq'

This output suggests that after the transcript update from NM_001760.3 to NM_001760.4, this variant should now be named 'NM_001760.4:c.760_761dupAG'. But using this TransVar output as input (same new version 2.4.8.20190122) again results in no_valid_transcript_found.
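To illustrate why 'c.758_759dupAG' and 'c.758_759AG>AGAG' describe the same change, here is a small sketch on a made-up sequence. The toy sequence and both helper functions are hypothetical; this is not the real CCND3 transcript and not TransVar code. It also shows that inside a tandem repeat, a dup at shifted coordinates yields the identical sequence. Whether that is what happens between c.758_759 and c.760_761 here is an assumption I have not verified.

```python
# Toy illustration only: made-up sequence, hypothetical helpers.
# Coordinates are 1-based and inclusive, as in HGVS c. notation.

def apply_dup(seq, start, end):
    """Duplicate seq[start..end] directly after the original copy."""
    unit = seq[start - 1:end]
    return seq[:end] + unit + seq[end:]


def apply_delins(seq, start, end, ref, alt):
    """Replace seq[start..end] with alt, after checking the reference bases."""
    if seq[start - 1:end] != ref:
        raise ValueError(f"reference mismatch at {start}_{end}: "
                         f"expected {ref}, found {seq[start - 1:end]}")
    return seq[:start - 1] + alt + seq[end:]


if __name__ == "__main__":
    toy = "CCAGAGTT"  # 'AG' at positions 3_4, repeated at 5_6

    as_dup = apply_dup(toy, 3, 4)                      # like c.3_4dupAG
    as_delins = apply_delins(toy, 3, 4, "AG", "AGAG")  # like c.3_4AG>AGAG
    shifted_dup = apply_dup(toy, 5, 6)                 # like c.5_6dupAG

    print(as_dup, as_delins, shifted_dup)
    # All three spellings produce the same mutated sequence.
    assert as_dup == as_delins == shifted_dup
```

The reference check in `apply_delins` plays the same role as the failing 'c.759C>G' query above: a wrong reference base is rejected instead of being silently applied.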

Which of the three results is correct? Thanks.