BIMSBbioinfo / RCAS

R package for the RNA Centric Annotation System (RCAS)
6 stars 3 forks source link

Error with `runReport` #12

Closed JoKerDii closed 3 years ago

JoKerDii commented 3 years ago

Hi!

I'm trying to get annotation summary for m6A peak calling result. My bed file is named "Mod.bed". I tried to run this but got an error, which I don't know why

library(RCAS)
hg19_mod_path <- "/path/to/Mod.bed"
gff_path <- "/path/to/GRCh37_RefSeq_24.gff"
runReport( queryFilePath = hg19_mod_path,
           gffFilePath = gff_path,
           motifAnalysis = FALSE,
           goAnalysis = FALSE )

However, there is no problem to get enriched motif by running the following code

queryRegions <- importBed(filePath = hg19_mod_path, sampleN = 10000)
gff <- importGtf(filePath = "/path/to/GRCh37_RefSeq_24.gff")
motifResults <- runMotifDiscovery(queryRegions = queryRegions, 
                                  resizeN = 15, sampleN = 10000,
                                  genomeVersion = 'hg19', motifWidth = 5,
                                  motifN = 3, nCores = 5)
ggseqlogo::ggseqlogo(motifResults$matches_query)
summary <- getMotifSummaryTable(motifResults)
knitr::kable(summary)

Did I do anything wrong with the code usage? Is there any way to avoid error in runReport and got the annotation summary?

borauyar commented 3 years ago

Hi @JoKerDii, Could you maybe tell us what the error message you get is? The commands look good to me, but it is hard to pinpoint the problem without seeing the error message.

JoKerDii commented 3 years ago

I forgot it, sorry about that. Here are the running message and error message

Thu Mar  4 17:38:44 2021 => Returning BSgenome.Hsapiens.UCSC.hg19 as BSgenome object

processing file: report.Rmd
  |.                                                                        |   2%
   inline R code fragments

  |...                                                                      |   4%
label: global_options (with options) 
List of 1
 $ include: logi FALSE

  |....                                                                     |   6%
  |......                                                                   |   8%
  |.......                                                                  |  10%
  |........                                                                 |  12%
  |..........                                                               |  13%
  |...........                                                              |  15%
  |.............                                                            |  17%
  |..............                                                           |  19%
  |...............                                                          |  21%
  |.................                                                        |  23%
Processing /data/zhendi/protocol/hisat_alignment/homo_re/hg19/exomePeak2_output_mod/Mod.bed
Keeping standard chromosomes only
Reading existing granges.rds object from /data/zhendi/wei/star-genome-ref/g1/Homo_sapiens/UCSC/hg19/Annotation/Archives/archive-2015-07-17-14-32-32/Genes/genes.gtf.granges.rds
Keeping standard chromosomes only
File /data/zhendi/wei/star-genome-ref/g1/Homo_sapiens/UCSC/hg19/Annotation/Archives/archive-2015-07-17-14-32-32/Genes/genes.gtf.granges.rds already exists.
                  Use overwriteObjectAsRds = TRUE to overwrite the file
  |..................                                                       |  25%
  |....................                                                     |  27%
  |.....................                                                    |  29%
  |......................                                                   |  31%
  |........................                                                 |  33%
  |.........................                                                |  35%
  |...........................                                              |  37%
  |............................                                             |  38%
  |.............................                                            |  40%
  |...............................                                          |  42%
  |................................                                         |  44%
  |..................................                                       |  46%
  |...................................                                      |  48%
  |....................................                                     |  50%
  |......................................                                   |  52%
  |.......................................                                  |  54%
Quitting from lines 200-220 (report.Rmd) 
Error in setnames(x, value) : 
  Can't assign 2 names to a 1 column data.table
borauyar commented 3 years ago

It fails at the step where gene biotypes are parsed from the gtf file to make a plot about which kind of genes are targeted.

The reason it fails must be that the GTF file you have maybe doesn't contain a 'gene_biotype' field.

Could you also maybe test the function using a GTF file from ENSEMBL? You can download the annotation based on hg19 (GRCh37 build) here: ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens

borauyar commented 3 years ago

https://github.com/BIMSBbioinfo/RCAS/commit/3cc9d41474cdf0073a3ab0f29403fdd226fa3631

JoKerDii commented 3 years ago

Yes you are right! I tested with the GTF file from ENSEMBL and there is no error. Thanks!!!

borauyar commented 3 years ago

That's great. I updated the code in any case to check for this field. It is available in the master branch now. It should be available on bioconductor in the next release. Thanks for the feedback!