jianhong / ribosomeProfilingQC

An R package for quality assessment of ribosome profiling
GNU General Public License v3.0
3 stars 0 forks source link

GTF construction for readsEndPlot() #6

Closed paolabc closed 6 months ago

paolabc commented 6 months ago

Hi

Thank you for this amazing package and all the details given in the ribosomeprofilingQC guide. I just would like to clarify the construct of makeTxDbFromGFF to provide to readsEndPlot().

I used genome and gtf annoation from ensembl Gz11 for Danio Rerio . The guide provides refgene from danRer10 UCSC (TxDb.Drerio.UCSC.danRer10.refGene) to pass as genome. If I would like to set the genome for ensembl, should I set the genome as org.Dr.eg.db. Does the readsEndPlot() identify it?

Thank you in advance for your help.

Code

txdb <- makeTxDbFromGFF( "/REFERENCES/drerio_chr1_to_25.gtf",organism = "Danio rerio",
         chrominfo = Seqinfo(seqnames = unique_chr, seqlengths = seqlengths,isCircular=c(FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE),genome="org.Dr.eg.db"),
         taxonomyId=7955)
CDS <- prepareCDS(txdb)
readsEndPlot(bamfile, CDS, toStartCodon=TRUE)
jianhong commented 6 months ago

genome should be BSgenome.Drerio.UCSC.danRer11 . Let me know if you have any issues when running the pipeline. I don't thinks using ensembl annotation will affect the pipeline. In background, the program will try to fix the chromosome name style.

paolabc commented 6 months ago

I ran the code I sent,and there is no error. Is it right? I do not know how it the behaviour of readsEndPlot() using org.Dr.eg.db. Does it uses or does not?

I also tried the code

library(ribosomeProfilingQC)
library(BSgenome.Drerio.UCSC.danRer11)

txdb <- makeTxDbFromGFF( "REFERENCES/drerio_chr1_to_25.gtf", organism = "Danio rerio",
         chrominfo = Seqinfo(seqnames = unique_chr, seqlengths = seqlengths,isCircular=c(FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE,FALSE, FALSE, FALSE, FALSE,FALSE),genome="danRer11"),
         taxonomyId=7955)
CDS <- prepareCDS(txdb)
readsEndPlot(bamfile, CDS, toStartCodon=TRUE) 

Error in readsEndPlot(bamfile, CDS, toStartCodon = TRUE) : Not enough data available. In addition: Warning message: In (function (seqlevels, genome, new_style) : cannot switch danRer11's seqlevels from UCSC to NCBI style

jianhong commented 6 months ago

try to add parameter ignore.seqlevelsStyle=TRUE when you run readsEndPlot and make sure that your seqname style of the txdb are keeping same as it in the bam file.

paolabc commented 6 months ago

Hi

Thank you for your answer! I was using another version that did not provide ignore.seqlevelsStyle=TRUE for any function. Now it works for me!