zwdzwd / transvar

TransVar - multiway annotator for precision medicine

Inconsistency #19

Open hh1985 opened 6 years ago

hh1985 commented 6 years ago

Hi,

I am working on converting variants in different formats to gDNA variants, and I found that some items give different results:

Command line (version 2.3.4):

hanh@cpuserver:/data/home$ transvar panno -i "EGFR:p.D770_N771insNPG" --ccds
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
EGFR:p.D770_N771insNPG  CCDS5514 (protein_coding)       EGFR    +       chr7:g.55249017_55249018insTGGTAACCC/c.2315_2316insTGGTAACCC/p.P772_H773insGNP  inside_[cds_in_exon_20] CSQN=InFrameInsertion;left_align_protein=p.D770_N771insNPG;unalign_protein=p.D770_N771insNPG;left_align_gDNA=g.55249012_55249013insAACCCTGGT;unalign_gDNA=g.55249012_55249013insAACCCTGGT;left_align_cDNA=c.2310_2311insAACCCTGGT;unalign_cDNA=c.2310_2311insAACCCTGGT;32_CandidatesOmitted;source=CCDS
hanh@cpuserver:/data/home$ transvar panno -i "EGFR:p.D770_N771insSVQ" --ccds
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
EGFR:p.D770_N771insSVQ  CCDS5514 (protein_coding)       EGFR    +       chr7:g.55249013_55249014insGCGTACAAA/c.2311_2312insGCGTACAAA/p.D770_N771insSVQ  inside_[cds_in_exon_20] CSQN=InFrameInsertion;left_align_protein=p.D770_N771insSVQ;unalign_protein=p.D770_N771insSVQ;left_align_gDNA=g.55249012_55249013insAGCGTACAA;unalign_gDNA=g.55249012_55249013insAGCGTACAA;left_align_cDNA=c.2310_2311insAGCGTACAA;unalign_cDNA=c.2310_2311insAGCGTACAA;48_CandidatesOmitted;source=CCDS

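So here the same protein position, p.D770_N771insXXX, gives me different genome coordinates depending on the inserted residues: g.55249017_55249018 for insNPG and g.55249013_55249014 for insSVQ.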

The website returns:

EGFR:p.D770_N771insSVQ  CCDS5514 (protein_coding)   EGFR    +   chr7:g.(55249009ins9)/c.(2307_2308ins9)/p.D770_N771insSVQ   cds_in_exon_20  left_align_protein=p.D770_N771insSVQ;unalign_protein=p.D770_N771insSVQ;insertion_cDNA=AGCGTACAA;insertion_gDNA=AGCGTACAA;imprecise;source=CCDS
EGFR:p.D770_N771insNPG  CCDS5514 (protein_coding)   EGFR    +   chr7:g.(55249009ins9)/c.(2307_2308ins9)/p.P772_H773insGNP   cds_in_exon_20  left_align_protein=p.D770_N771insNPG;unalign_protein=p.D770_N771insNPG;insertion_cDNA=AACCCTGGT;insertion_gDNA=AACCCTGGT;imprecise;source=CCDS

Which one is the recommended one to follow?

Thanks,

-Han

zwdzwd commented 6 years ago

Hi Han,

The website version is behind the command line version by many major releases. I hope to update the website (which was hosted by MDA) in the future.

The command line version tries harder to back-infer the genomic variant. The website only gives an "imprecise" location, indicated by the parentheses. I would suggest going with the command line result.
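
For anyone comparing such results programmatically, here is a minimal Python sketch (not part of TransVar itself) that splits the info column into fields and compares the left_align_gDNA entries. The helper names parse_info and insertion_site are hypothetical, and the two info strings are shortened copies of the command line output quoted in this issue. In that output, left_align_gDNA is g.55249012_55249013 for both inputs, even though the primary gDNA coordinates differ.

```python
import re


def parse_info(info):
    """Split a TransVar info column into a dict; flag-only entries map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def insertion_site(gdna):
    """Return (start, end, inserted_bases) for 'g.START_ENDinsSEQ', else None."""
    m = re.fullmatch(r"g\.(\d+)_(\d+)ins([ACGT]+)", gdna)
    if m is None:
        return None  # e.g. the website's imprecise form 'g.(55249009ins9)'
    return int(m.group(1)), int(m.group(2)), m.group(3)


# Shortened copies of the info columns from the command line output above.
npg_info = ("CSQN=InFrameInsertion;"
            "left_align_gDNA=g.55249012_55249013insAACCCTGGT;"
            "32_CandidatesOmitted;source=CCDS")
svq_info = ("CSQN=InFrameInsertion;"
            "left_align_gDNA=g.55249012_55249013insAGCGTACAA;"
            "48_CandidatesOmitted;source=CCDS")

npg = insertion_site(parse_info(npg_info)["left_align_gDNA"])
svq = insertion_site(parse_info(svq_info)["left_align_gDNA"])

# Compare only the flanking positions, not the inserted bases.
print(npg[:2] == svq[:2])
```

Comparing the left-aligned fields rather than the primary coordinate column is only one possible choice; which representation to store depends on what the downstream tool expects.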

Wanding

hh1985 commented 6 years ago

@zwdzwd Thanks for the suggestion!