Open kehuangke opened 3 years ago
I also saw the same question and got the same error, but I don't know how to solve it. Could you help us solve this problem?
I converted logcounts into counts, but the error still exists.
a <- assay(sce, "logcounts")
b <- expm1(a)
assays(sce)$counts <- b
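A possible reason this conversion does not change anything, as a minimal sketch on a made-up matrix (not the real reference data): expm1() is strictly increasing and maps 0 to 0, so an entry is zero after the transform only if it was zero before. If the logcounts contain no zeros, the converted counts contain none either. Note also that expm1() assumes natural-log values; which log base the reference uses is not stated in this thread.

```r
# Toy log-expression matrix with no zeros (hypothetical values, not celldex data)
log_vals <- matrix(c(2.1, 3.4, 0.7, 5.0, 1.2, 4.4), nrow = 3,
                   dimnames = list(paste0("gene", 1:3), paste0("cell", 1:2)))

# Back-transform in the same way as the snippet above
back <- expm1(log_vals)

# Count zero entries before and after the transform
zeros_before <- sum(log_vals == 0)
zeros_after  <- sum(back == 0)
print(c(before = zeros_before, after = zeros_after))
```

Both counts are the same, so a dropout rate computed as the share of exact zeros is unaffected by the conversion.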
I'm getting the same error when trying to use celldex's MouseRNAseq reference with scmap. Is there any update on this issue?
Same error. Any update?
@kehuangke @Cristinex @pcantalupo
After looking into the code of selectFeatures(), I think the linearModel() function might be the cause.
In linearModel(), the dropout rate of a feature is defined as the percentage of cells whose log_count is 0 for that feature.
However, this does not work if the reference contains no 0 log_count values at all, because the dropout rate is then 0 for every feature.
linearModel <- function(object, n_features) {
    log_count <- as.matrix(logcounts(object))
    cols <- ncol(log_count)
    if (!"counts" %in% assayNames(object)) {
        warning("Your object does not contain counts() slot. Dropouts were calculated using logcounts() slot...")
        dropouts <- rowSums(log_count == 0)/cols * 100
    } else {
        count <- as.matrix(counts(object))
        dropouts <- rowSums(count == 0)/cols * 100
    }
    # do not consider genes with 0 and 100 dropout rate
    dropouts_filter <- dropouts != 0 & dropouts != 100
    dropouts_filter <- which(dropouts_filter)
    dropouts <- log2(dropouts[dropouts_filter])
    expression <- rowSums(log_count[dropouts_filter, ])/cols
    fit <- lm(dropouts ~ expression)
If the dropout rates are all 0%, fit <- lm(dropouts ~ expression) in linearModel() cannot work: genes with a 0% dropout rate are filtered out, so no features are left to fit.
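The failure can be reproduced without scmap. The following is a minimal sketch that repeats the same steps in base R on a simulated matrix; the matrix size and values are assumptions chosen only so that no entry is exactly 0, and dropouts_kept is an illustrative name.

```r
set.seed(1)
# Hypothetical reference: 50 genes x 20 samples, all values strictly positive
log_count <- matrix(runif(50 * 20, min = 1, max = 10), nrow = 50)
cols <- ncol(log_count)

# Same dropout definition as in linearModel(): percentage of exact zeros per gene
dropouts <- rowSums(log_count == 0) / cols * 100

# Same filter: drop genes with 0% or 100% dropout
dropouts_filter <- which(dropouts != 0 & dropouts != 100)
n_kept <- length(dropouts_filter)
print(n_kept)

# With nothing left, the regression inputs are empty and lm() fails
dropouts_kept <- log2(dropouts[dropouts_filter])
expression <- rowSums(log_count[dropouts_filter, , drop = FALSE]) / cols
fit_result <- tryCatch(lm(dropouts_kept ~ expression),
                       error = function(e) conditionMessage(e))
print(fit_result)
```

Here every gene has a 0% dropout rate, the filter keeps no genes, and lm() is called on empty vectors, which is the situation described above.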
In conclusion, copying the function from the source code and modifying this line
dropouts <- rowSums(log_count == 0)/cols * 100
to something like
dropouts <- rowSums(log_count <= 3)/cols * 100
should work. (The value defining the dropout cutoff depends on your reference.)
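To get a feel for how the cutoff affects the number of usable features, here is a rough sketch on simulated data. The helper kept_at_cutoff() and the candidate cutoffs are hypothetical, and the simulated values stand in for a real reference; in practice the same count would be computed on your own logcounts matrix.

```r
set.seed(42)
# Hypothetical reference: 200 genes x 30 samples, no exact zeros
log_count <- matrix(runif(200 * 30, min = 0.5, max = 12), nrow = 200)
cols <- ncol(log_count)

# Hypothetical helper: number of genes that survive the 0%/100% dropout filter
# when "dropout" means a value at or below the given cutoff
kept_at_cutoff <- function(cutoff) {
  dropouts <- rowSums(log_count <= cutoff) / cols * 100
  sum(dropouts != 0 & dropouts != 100)
}

cutoffs <- c(0, 1, 3, 6)
kept <- vapply(cutoffs, kept_at_cutoff, integer(1))
print(data.frame(cutoff = cutoffs, genes_kept = kept))
```

A cutoff of 0 keeps no genes on this matrix, while higher cutoffs let genes pass the filter. Which cutoff is sensible depends on the distribution of values in the reference, as noted above.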
Alternatively, one can modify this line to keep a proper number of filtered features:
dropouts_filter <- dropouts != 0 & dropouts != 100
I hope the authors can explain more about this issue.
Hello,
I use scmap to annotate cell types based on a reference annotation dataset downloaded from celldex. However, I encounter an error when I choose HumanPrimaryCellAtlasData. The selectFeatures function runs properly when I choose DatabaseImmuneCellExpressionData. Unfortunately, I want to use HumanPrimaryCellAtlasData for future analysis.
Here is the code that does not work properly (HumanPrimaryCellAtlasData):
The error is
The same code runs correctly if I use DatabaseImmuneCellExpressionData
I have changed logcounts to counts and back-transformed the values as powers of 10, but that did not work.
I suspect the error is due to the logcounts values. These are the logcounts values for HumanPrimaryCellAtlasData, which reports an error
These are the logcounts values for DatabaseImmuneCellExpressionData, which runs properly.
Could you please help me solve this problem?
Thanks