JEFworks-Lab / HoneyBADGER

HMM-integrated Bayesian approach for detecting CNV and LOH events from single-cell RNA-seq data
http://jef.works/HoneyBADGER/
GNU General Public License v3.0

Error: dim(X) must have a positive length #12

Closed by romanhaa 5 years ago

romanhaa commented 5 years ago

Hi! I'm currently trying to use HoneyBADGER but run into an error when executing hb$plotGexpProfile().

Here is the set of commands I ran:

mart.obj <- useMart(biomart="ENSEMBL_MART_ENSEMBL", dataset='mmusculus_gene_ensembl', host='jul2015.archive.ensembl.org')
hb <- new('HoneyBADGER', name='project')
hb$setGexpMats(matrix.sample, matrix.ref, mart.obj, filter=FALSE, scale=FALSE, verbose=TRUE)
# Initializing expression matrices ...
# Normalizing gene expression for 36544 genes and 2551 cells ...
# Done setting initial expression matrices!
hb$plotGexpProfile()
# Error in apply(d, 2, caTools::runmean, k = window.size, align = "center") :
#   dim(X) must have a positive length

The matrices matrix.sample and matrix.ref contain 2551 and 2456 cells, respectively, with 36544 genes for each cell.

Do you have an idea what the problem could be? I would greatly appreciate your help.

Thanks!

JEFworks commented 5 years ago

Hi Roman,

Thanks for trying out HoneyBADGER!

I believe the error is due to the data being from mouse. A number of default options have been set under the assumption that the data is from human. For example, hb$setGexpMats has an id parameter that is by default 'hgnc_symbol'.
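For context, the message itself comes from base R's apply(), which requires an input that has a dim attribute. The sketch below reproduces it outside HoneyBADGER. It assumes, without having checked the package internals, that the failed gene lookup left plotGexpProfile() with something that was no longer a matrix:

```r
# apply() needs a matrix-like input; a plain vector has no dim attribute.
x <- c(1, 2, 3)
msg <- tryCatch(
  apply(x, 2, mean),
  error = function(e) conditionMessage(e)
)
print(msg)

# One common way to lose dim: subsetting a single row without drop = FALSE.
m <- matrix(1:6, nrow = 2)
lost_dim <- is.null(dim(m[1, ]))                  # vector, dim dropped
kept_dim <- !is.null(dim(m[1, , drop = FALSE]))   # still a 1 x 3 matrix
```

So the error is a symptom of the input to apply() being empty or malformed, not of the smoothing step itself.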

You will want to specify 'mgi_symbol' in your case, assuming that the rownames of your matrix.sample are indeed MGI symbols. If your rownames are Ensembl IDs or something else, you can specify the corresponding biomaRt identifier instead:

hb$setGexpMats(matrix.sample, matrix.ref, mart.obj, 
    filter=FALSE, scale=FALSE, verbose=TRUE, id='mgi_symbol')
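Before choosing id, it can help to look at what the rownames actually are. Here is a rough base-R sketch. guess_id_type is a hypothetical helper, not part of HoneyBADGER or biomaRt; the pattern only recognizes mouse Ensembl gene IDs, and falling back to 'mgi_symbol' is an assumption that should be checked by eye:

```r
# Hypothetical helper: guess which biomaRt attribute matches the gene names.
# Mouse Ensembl gene IDs look like "ENSMUSG" + 11 digits, optionally versioned.
guess_id_type <- function(gene_names) {
  stopifnot(length(gene_names) > 0)
  is_ensembl <- grepl("^ENSMUSG[0-9]{11}(\\.[0-9]+)?$", gene_names)
  if (mean(is_ensembl) > 0.9) {
    "ensembl_gene_id"
  } else {
    # Assumption: anything that is not an Ensembl ID is treated as an MGI symbol.
    "mgi_symbol"
  }
}

id_from_ensembl <- guess_id_type(c("ENSMUSG00000051951", "ENSMUSG00000025900"))
id_from_symbols <- guess_id_type(c("Xkr4", "Rp1", "Sox17"))
```

With real data this would be called as guess_id_type(rownames(matrix.sample)), and the result passed as the id argument.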

setGexpMats then uses this id and your mart.obj to map genes to their genomic positions on the chromosomes. You can double check that this is being done properly by looking at hb@gene; it should contain a GenomicRanges representation of genomic positions for all your genes.

Now, hb$plotGexpProfile() should be able to organize the genes by their genomic positions and make the appropriate visualization.

You can learn more about each function and its default parameters from the R help pages:

?setGexpMats

Let me know if this fixes the issue.

Best, Jean

romanhaa commented 5 years ago

That worked, thanks a lot! I guess I should've checked the additional parameters :)