zwdzwd / transvar

TransVar - multiway annotator for precision medicine

panno mode couldn't find coordinate Ensembl transcript ID #49

Open zhouzhendiao opened 2 years ago

zhouzhendiao commented 2 years ago

I am using panno mode to map protein regions to genomic regions on the hg19 reference genome. For example, I want to get information for transcript ENST00000222270 of the gene KMT2B.

When I ran:

transvar panno -i 'KMT2B' --ensembl

I got:

input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
KMT2B   .       .       .       ././.   .       no_valid_transcript_found

I downloaded the GTF file from https://zhouserver.research.chop.edu/TransVar/annotations/hg19.ensembl.gtf.gz.

ENST00000222270 is present in the GTF file:

zcat hg19.ensembl.gtf.gz | grep ENST00000222270  |head
19      protein_coding  transcript      36208921        36229779        .       +       .       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055";
19      protein_coding  exon    36208921        36209283        .       +       .       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "1"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; exon_id "ENSE00001625606";
19      protein_coding  CDS     36208921        36209283        .       +       0       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "1"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; protein_id "ENSP00000222270";
19      protein_coding  start_codon     36208921        36208923        .       +       0       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "1"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055";
19      protein_coding  exon    36210371        36210443        .       +       .       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "2"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; exon_id "ENSE00000699758";
19      protein_coding  CDS     36210371        36210443        .       +       0       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "2"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; protein_id "ENSP00000222270";
19      protein_coding  exon    36210686        36212706        .       +       .       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "3"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; exon_id "ENSE00001388562";
19      protein_coding  CDS     36210686        36212706        .       +       2       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "3"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; protein_id "ENSP00000222270";
19      protein_coding  exon    36213261        36213374        .       +       .       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "4"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; exon_id "ENSE00000699705";
19      protein_coding  CDS     36213261        36213374        .       +       0       gene_id "ENSG00000272333"; transcript_id "ENST00000222270"; exon_number "4"; gene_name "KMT2B"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "KMT2B-201"; transcript_source "ensembl"; tag "CCDS"; ccds_id "CCDS46055"; protein_id "ENSP00000222270";
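
The same check can be made per gene rather than per transcript ID. This is a minimal sketch, not part of TransVar: the function names are hypothetical, and it assumes the standard tab-separated GTF layout with `key "value";` attributes in column 9, as in the rows above.

```python
import gzip
import re

# Matches key "value"; pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(field):
    """Return the attribute column as a dict (last value wins for repeated keys)."""
    return dict(ATTR_RE.findall(field))


def transcripts_for_gene(lines, gene_name):
    """Collect transcript IDs from 'transcript' rows that carry the given gene_name."""
    found = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue  # only look at transcript-level rows
        attrs = parse_attributes(cols[8])
        if attrs.get("gene_name") == gene_name and "transcript_id" in attrs:
            found.add(attrs["transcript_id"])
    return sorted(found)


def transcripts_in_gtf(path, gene_name):
    """Hypothetical helper: run the lookup over a gzipped GTF file."""
    with gzip.open(path, "rt") as handle:
        return transcripts_for_gene(handle, gene_name)
```

Calling `transcripts_in_gtf("hg19.ensembl.gtf.gz", "KMT2B")` should list every transcript ID that the GTF assigns to KMT2B, which shows whether the gene name itself is annotated there.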

But the transcript is not in the transvardb file; this command returns nothing:

 grep ENST00000222270 hg19.ensembl.gtf.gz.transvardb

Can somebody tell me why? Thanks!