Closed esamorodnitsky closed 4 years ago
Hi,
I also performed "grep" for BRAF in the GTF. It is in there.
can you please show me the lines for:
grep BRAF hg19.refGene.gtf
There is no line where the 3rd column is "gene":
awk '$3=="gene"' BRAF.txt
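To see which feature types the GTF does carry for the gene, column 3 can be tabulated. This is only a sketch: it assumes a tab-separated GTF, and `feature_types` is a made-up helper name, not something shipped with jvarkit.

```shell
# Tabulate the feature types (column 3) of GTF records read from stdin.
# Hypothetical helper for illustration; not part of jvarkit.
feature_types() {
  # Skip comment lines, count each distinct value of column 3,
  # then print "count type" sorted by type name.
  awk -F '\t' '!/^#/ { n[$3]++ } END { for (t in n) print n[t], t }' | sort -k2,2
}

# Usage (file name taken from this thread):
# grep -w BRAF hg19.refGene.gtf | feature_types
```

If no row labelled "gene" shows up in that tally, the `awk '$3=="gene"'` filter above has nothing to print.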
You'd better use a GTF from Ensembl:
$ wget -O - -q "ftp://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.chr.gtf.gz" | gunzip -c | grep -w BRAF | awk '$3=="gene"'
7 ensembl_havana gene 140419127 140624564 . - . gene_id "ENSG00000157764"; gene_version "8"; gene_name "BRAF"; gene_source "ensembl_havana"; gene_biotype "protein_coding";
Excellent, downloading the new GTF file worked. The GTF that I was using came from UCSC. Another thing, is Backlocate 0-based? I ran it on BRAF V600E and got these numbers:
BRAF Val 600 Glu BRAF ENST00000288602 - V 1797 GTG GAG G chr7 140453136 ENST00000288602.Exon15 . .
BRAF Val 600 Glu BRAF ENST00000288602 - V 1798 GTG GAG T chr7 140453135 ENST00000288602.Exon15 . .
BRAF Val 600 Glu BRAF ENST00000288602 - V 1799 GTG GAG G chr7 140453134 ENST00000288602.Exon15 . .
But, on UCSC, the positions are between chr7:140,453,135-140,453,137.
Another thing, is Backlocate 0-based?
Yes
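The index0.in.genomic column is 0-based, while the positions displayed by the UCSC browser are 1-based, so adding 1 to each value should line the two up. A minimal sketch, where `to_one_based` is a hypothetical helper name used only for illustration:

```shell
# Convert 0-based genomic indexes (one per line on stdin) to the
# 1-based coordinates shown by genome browsers such as UCSC.
# Hypothetical helper for illustration; not part of jvarkit.
to_one_based() {
  awk '{ print $1 + 1 }'
}

# The three index0.in.genomic values reported above for BRAF V600E:
printf '%s\n' 140453136 140453135 140453134 | to_one_based
```

This prints 140453137, 140453136 and 140453135, which is the chr7:140,453,135-140,453,137 range seen on UCSC.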
Ok, great! Thanks a lot!
Verify
Subject of the issue
Backlocate can't seem to find any of the gene names in the GTF I give it.
Your environment
${JAVA_HOME}: I don't know
Steps to reproduce
echo -e 'BRAF\tV600E\n' | java -jar backlocate.jar --gtf hg19.refGene.gtf -R hg19_chr.fasta
Expected behaviour
I expect a table with the possible list of mutations for BRAF V600E.
Actual behaviour
User.Gene AA1 petide.pos.1 AA2 transcript.name transcript.id transcript.strand transcript.AA index0.in.rna wild.codon potential.var.codons base.in.rna chromosome index0.in.genomic exon messages extra.user.data
[WARN][BackLocate]no transcript found for BRAF
I also performed "grep" for BRAF in the GTF. It is in there.