plger / scDblFinder

Methods for detecting doublets in single-cell sequencing data
https://plger.github.io/scDblFinder/
GNU General Public License v3.0

Does ‘Size factors should be positive’ error matter? #96

Closed YiweiNiu closed 7 months ago

YiweiNiu commented 7 months ago

Hi, I found several samples with the error "Error in calculating norm factors:Error in .local(x, ...): size factors should be positive", but scDblFinder still successfully output the results. I was wondering whether this error affects the results. Thanks!

plger commented 7 months ago

Can you provide a traceback() of the error?

YiweiNiu commented 7 months ago

Hey, thanks for your quick reply.

Here are the warning messages. They are only warnings, and scDblFinder ran successfully.

Warning messages:
1: In .checkSCE(sce) :
  Some cells in `sce` have an extremely low read counts; note that these could trigger errors and might best be filtered out
2: In value[[3L]](cond) :
  Error in calculating norm factors:Error in .local(x, ...): size factors should be positive

I converted the warnings to errors, and here is the output of traceback():

> traceback()
8: doWithOneRestart(return(expr), restart)
7: withOneRestart(expr, restarts[[1L]])
6: withRestarts({
       .Internal(.signalCondition(simpleWarning(msg, call), msg, 
           call))
       .Internal(.dfltWarn(msg, call))
   }, muffleWarning = function() NULL)
5: .signalSimpleWarning("Some cells in `sce` have an extremely low read counts; note that these could trigger errors and might best be filtered out", 
       base::quote(.checkSCE(sce)))
4: warning("Some cells in `sce` have an extremely low read counts; note ", 
       "that these could trigger errors and might best be filtered out")
3: .checkSCE(sce)
2: scDblFinder(as.SingleCellExperiment(srat))
1: as.Seurat(scDblFinder(as.SingleCellExperiment(srat)))

plger commented 7 months ago

Thanks for the details. Yes, it matters. The normalization error was turned into a warning for cases where it's unavoidable, but it means that the data will not be normalized for the purpose of finding doublets. Typically this works surprisingly well (for reasons I won't get into), but it's not optimal. In general, I would strongly recommend filtering out cells that have too few counts; they are clearly more trouble than they're worth.

Pierre-Luc
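
The filtering recommended above can be sketched in base R. This is a minimal illustration, not scDblFinder's own code: the toy matrix, the `min_counts` threshold of 100, and the variable names are all assumptions made for the example, and a sensible threshold depends on the dataset. It also shows one plausible way a non-positive size factor arises: a cell with zero total counts gets a library-size factor of zero.

```r
# Toy genes-x-cells count matrix standing in for counts(sce); values are made up.
counts <- matrix(rep(1:5, length.out = 50 * 8), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:8)))
counts[, 7] <- 0L                  # an empty cell
counts[, 8] <- c(1L, rep(0L, 49))  # a cell with a single count

# Library-size factors: total counts per cell, scaled to a mean of 1.
lib_size <- colSums(counts)
size_factors <- lib_size / mean(lib_size)
any(size_factors <= 0)             # the empty cell has a size factor of 0

# Drop cells below an illustrative (hypothetical) minimum total count.
min_counts <- 100
keep <- lib_size >= min_counts
counts_filtered <- counts[, keep, drop = FALSE]

# On a real SingleCellExperiment the same filter would look like this
# (not run here; requires the SingleCellExperiment and scDblFinder packages):
#   sce <- sce[, colSums(counts(sce)) >= min_counts]
#   sce <- scDblFinder(sce)
```

Filtering before the call, rather than after, matters here because the low-count cells are what interfere with normalization inside scDblFinder.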

YiweiNiu commented 7 months ago

Thanks very much for the explanations! It helps a lot.