FelixTheStudent / cellpypes

Cell type pipes for R
GNU General Public License v3.0

cellpypes does not handle objects processed with SoupX #18

Closed kaizen89 closed 2 years ago

kaizen89 commented 2 years ago

I am starting with Seurat objects whose raw matrices were processed with SoupX, which gives counts with decimals.

Here's my script:
```r
library(Seurat)
library(cellpypes)
library(magrittr)  # provides the %>% pipe

obj <- list(
  raw       = SeuratObject::GetAssayData(lympho_NKT, "counts"),
  neighbors = as(lympho_NKT@graphs[["CSS_PCA"]], "dgCMatrix") > .1,
  embed     = FetchData(lympho_NKT, vars = c("umapCSS_1", "umapCSS_2")),
  totalUMI  = lympho_NKT$nCount_RNA
)
pype <- obj %>%
  rule("T",      "CD3E", ">", 2) %>%
  rule("CD8+ T", "CD8A", ">", 1, parent = "T")
```

I get this error:

Error in check_obj(obj) : s and as.integer(s) are not equal
Mean relative difference: 0.0001429285

FelixTheStudent commented 2 years ago

Thanks for posting! Welcome to cellpypes.

The error message suggests that your counts are not integers (1, 2, 3, …) but decimals (1.2, 1.9, …). I recommend using uncorrected raw UMI counts instead, i.e. read in the Cell Ranger output again and use that. I think uncorrected counts should work well: ambient RNA contamination can be expected to affect all cell types the same, and cellpypes' neighbour pooling should help to dampen its effects.

That is what I recommend as a statistician. If you just want to test cellpypes on this particular data, you can convert the SoupX output to integer counts, and the error message should go away. Option 1: the SoupX authors recommend setting `roundToInt=TRUE` when you run the SoupX correction with the `adjustCounts` function. Option 2, the quick-and-dirty approach: simply use `round(SeuratObject::GetAssayData(lympho_NKT, "counts"))` in slot `raw`. Again, option 2 is really only for testing cellpypes, not an analysis I would go for as a rigorous scientist.
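
To make option 2 concrete, here is a minimal sketch on a toy sparse matrix rather than on real data. The matrix values are made up, and `has_integer_counts` is a hypothetical helper written only for this illustration; it is not part of cellpypes or SoupX, and it is not the check cellpypes runs internally.

```r
library(Matrix)

# Toy stand-in for a SoupX-corrected count matrix (genes x cells).
# All values are invented for illustration.
soupx_counts <- Matrix::sparseMatrix(
  i = c(1, 2, 2, 3),
  j = c(1, 1, 2, 3),
  x = c(2.7, 0.3, 5.0, 1.2),
  dims = c(3, 3),
  dimnames = list(c("CD3E", "CD8A", "MS4A1"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical helper: TRUE if every stored value is a whole number.
has_integer_counts <- function(m) all(m@x == round(m@x))

# Decimal counts like these are what trigger the error above.
has_integer_counts(soupx_counts)

# Quick-and-dirty fix: round to whole numbers, then drop entries that
# rounded to zero so the matrix stays properly sparse.
rounded <- Matrix::drop0(round(soupx_counts))
has_integer_counts(rounded)
```

On a real object, the same `round()` call would wrap `SeuratObject::GetAssayData(lympho_NKT, "counts")`, as in option 2. Note that rounding discards small corrected values entirely (0.3 becomes 0 here), which is one reason this is only suitable for a quick test.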

Did this make the error go away?