velocyto-team / velocyto.R

RNA velocity estimation in R
http://velocyto.org

Error parsing GTF #5

Closed jw156605 closed 7 years ago

jw156605 commented 7 years ago

I get the following error message when I try to read in smart-seq2 bam files. It seems that velocyto doesn't like the GTF, but I can't figure out why. I obtained the GTF by downloading the hg19 "ucsc genes" track directly from the UCSC table browser. I also tried with the GTF that comes with the 10X Genomics annotation set and got a similar error message. The output is below. Any suggestions would be much appreciated. Thanks!

library(velocyto.R)
dat <- read.smartseq2.bams(c("scRNA-seq/D0HCF_A01/alignments.sort.bam","scRNA-seq/D0HCF_A02/alignments.sort.bam"),"velocyto_annotations/hg19_genes.gtf",n.cores=1)

reading gene annotation ... done ( 1461416 genes)
parsing exon information ... Error in `[.data.frame`(x, i, c(10, 11)) : undefined columns selected
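For context, "undefined columns selected" is the message base R gives when a data frame is indexed by column positions it does not have. The following is a minimal sketch of that failure, assuming (from the error text only, not from the velocyto.R source) that the parser selects columns 10 and 11 of the annotation table. The toy table is made up and only mimics the nine tab-separated columns of a GTF file.

```r
# Toy stand-in for a GTF read as a table (assumption: GTF has 9 columns).
gtf_like <- data.frame(matrix("x", nrow = 2, ncol = 9), stringsAsFactors = FALSE)

# Selecting columns 10 and 11 from a 9-column data frame raises an error;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch(
  {
    gtf_like[1, c(10, 11)]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the annotation file has fewer than 11 columns, the same kind of indexing would fail regardless of which BAM files are supplied.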

pkharchenko commented 7 years ago

The current code doesn’t read a GTF file; it expects a refFlat file. You can convert GTF to refFlat using the UCSC gtfToGenePred tool:

gtfToGenePred genes.gtf genes.refFlat

Changing it to read GTF directly should be fairly straightforward, but I haven’t gotten around to that, unfortunately.

Best, -peter.
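One way to catch this kind of format mismatch before running the import is to count the tab-separated columns in the annotation file. The sketch below is illustrative only: `count_annotation_columns` is a hypothetical helper, not part of velocyto.R, and the expected counts assume the standard UCSC layouts (9 columns for GTF, 10 for basic genePred, 11 for refFlat, where a gene name column precedes the genePred fields). The demo record is made up.

```r
# Hypothetical helper (not part of velocyto.R): count the tab-separated
# fields on the first non-empty, non-comment line of an annotation file.
count_annotation_columns <- function(path) {
  lines <- readLines(path, n = 100)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  if (length(lines) == 0) stop("no data lines found in ", path)
  length(strsplit(lines[1], "\t", fixed = TRUE)[[1]])
}

# Self-contained demo on a made-up refFlat-style record with 11 columns:
# geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd,
# exonCount, exonStarts, exonEnds.
demo_file <- tempfile(fileext = ".refFlat")
writeLines(
  paste("GENE1", "TX1", "chr1", "+", "100", "900", "150", "850", "2",
        "100,500,", "300,900,", sep = "\t"),
  demo_file
)

n_cols <- count_annotation_columns(demo_file)
if (n_cols < 11) {
  warning("annotation has ", n_cols, " columns; refFlat should have 11")
}
print(n_cols)
```

If a converted file reports 10 columns rather than 11, it may be in plain genePred layout, in which case a gene name column would still be needed in front. That is an assumption about the file formats, not something confirmed in this thread.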


jw156605 commented 7 years ago

Ah, of course. I hadn't noticed that the R tutorial used refFlat rather than genes.gtf. Thanks!