Closed danshu closed 5 years ago
I noticed that there are less expressed genes in Batch 2 than Batch 1. sum(nexprs(sce1, byrow=TRUE) > 0)=27746 sum(nexprs(sce2, byrow=TRUE)>0)=21184 sum(nexprs(sce1, byrow=TRUE)>3)=22003 sum(nexprs(sce2, byrow=TRUE)>3)=16851
Is this this causing the problem? Should I only use variable genes expressed in both batches for mnnCorrect?
Hmm, everything looks fine to me. The only cause for NA
s would be if you were unfortunate enough to have a cell that was all-zero in the chosen
subset; this would result in a L2 norm of zero and NA
s after cosine normalization. But this doesn't seem to be the case, as you've turned off cosine norm.
I am a bit bemused, but perhaps you could see if the problem persists in the batchelor version of this function? The scran version will be deprecated in the next release as all batch-correlation-related functions are moving over to batchelor. (Otherwise scran gets too bloated.)
This may have been fixed by the patch described in MarioniLab/FurtherMNN2018#11.
In any case, I am closing this unless further information is provided.
Hi, I found that in corrected batch2 data, most of values are NaN. Here are my codes for running mnnCorrect. In logcounts(sce1) and logcounts(sce2), there are no NAs or infinite values.
fit1 <- trendVar(sce1, parametric=TRUE,use.spikes=FALSE) dec1 <- decomposeVar(sce1, fit1) fit2 <- trendVar(sce2, parametric=TRUE,use.spikes=FALSE) dec2 <- decomposeVar(sce2, fit2) universe <- intersect(rownames(dec1), rownames(dec2)) mean.bio <- (dec1[universe,"bio"] + dec2[universe,"bio"])/2 chosen <- universe[mean.bio > 0] original <- list(batch1=logcounts(sce1)[chosen,],batch2=logcounts(sce2)[chosen,]) mnn.out <- do.call(mnnCorrect, c(original, list(k=20,cos.norm.in=FALSE,cos.norm.out=FALSE))) rmat <- do.call(cbind, list(counts1=counts(sce1)[chosen,],counts2=counts(sce2)[chosen,])) mnndat <- do.call(cbind, list(mnn1=mnn.out$corrected[[1]],mnn2=mnn.out$corrected[[2]])) sce <- SingleCellExperiment(list(counts=rmat,logcounts=mnndat))
Best, danshu