zwdzwd / transvar

TransVar - multiway annotator for precision medicine

RefSeq hg19 #7

Closed sarmadym closed 8 years ago

sarmadym commented 8 years ago

The --refseq switch for hg19 annotation uses annotation release 105, which is a bit old. I understand that NCBI switched to the GRCh38 reference for the RefSeq GFF files it provides in subsequent releases. Is there a way to have TransVar work with RefSeq files downloaded from UCSC?

zwdzwd commented 8 years ago

Sorry, just saw this. Reposting my answers here in case others are curious:

1) In the UCSC Table Browser, choose hg19, gene annotation, RefSeq, and "all fields from selected table" as the output; save the result as ucsc.refgene.hg19.txt.gz

2) Run transvar index --ucsc ~/Downloads/ucsc.refgene.hg19.txt.gz; this will create a bunch of transvardb* files

Now we can use the latest RefSeq with hg19:

3) transvar ganno -i "chr4:g.144918713T>C" --ucsc ~/Downloads/ucsc.refgene.hg19.txt.gz.transvardb

input transcript gene strand coordinates(gDNA/cDNA/protein) region info
chr4:g.144918713T>C NM_001304382 (proteincoding) GYPB - chr4:g.144918713T>C/c.172A>G/p.S58G inside[cds_in_exon_5] CSQN=Missense;codon_pos=144918711-144918712-144918713;ref_codon_seq=AGT;source=UCSCRefGene
chr4:g.144918713T>C NM_002100 (proteincoding) GYPB - chr4:g.144918713T>C/c.250A>G/p.S84G inside[cds_in_exon_4] CSQN=Missense;codon_pos=144918711-144918712-144918713;ref_codon_seq=AGT;source=UCSCRefGene
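For anyone who wants to post-process result rows like the ones above, here is a minimal Python sketch. It assumes the ganno output is tab-separated with the seven columns named in the header line, and that the info column is a ";"-separated list of key=value pairs. The helper names parse_row and codon_number are made up for illustration and are not part of TransVar.

```python
import re

# Column names taken from the header line of the output above.
COLUMNS = ["input", "transcript", "gene", "strand", "coordinates", "region", "info"]


def parse_row(line):
    """Split one result row into a dict (assumes tab-separated fields)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    # The coordinates column packs gDNA/cDNA/protein into one "/"-separated field.
    row["gdna"], row["cdna"], row["protein"] = row["coordinates"].split("/")
    # Turn "CSQN=Missense;source=UCSCRefGene" into a dict.
    row["info"] = dict(
        item.split("=", 1) for item in row["info"].split(";") if "=" in item
    )
    return row


def codon_number(cdna):
    """Return the 1-based codon index for a coding position such as 'c.172A>G'."""
    match = re.match(r"c\.(\d+)", cdna)
    if match is None:
        raise ValueError(f"not a simple coding position: {cdna!r}")
    return (int(match.group(1)) - 1) // 3 + 1


# First result row from the example above, rebuilt with tab separators.
EXAMPLE = "\t".join([
    "chr4:g.144918713T>C",
    "NM_001304382 (proteincoding)",
    "GYPB",
    "-",
    "chr4:g.144918713T>C/c.172A>G/p.S58G",
    "inside[cds_in_exon_5]",
    "CSQN=Missense;codon_pos=144918711-144918712-144918713;"
    "ref_codon_seq=AGT;source=UCSCRefGene",
])

row = parse_row(EXAMPLE)
print(row["transcript"], row["protein"], row["info"]["CSQN"], codon_number(row["cdna"]))
```

The codon index gives a quick consistency check: c.172 falls in codon 58 and c.250 in codon 84, which matches the p.S58G and p.S84G calls reported for the two transcripts.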