constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data
253 stars 34 forks source link

Error correcting expression profile using adjustCounts #53

Closed fhermans27 closed 3 years ago

fhermans27 commented 3 years ago

Hi, thanks for the nice tool to correct ambient RNA expression!

I've been following the vignette, and everything has been going well up till the correction of the expression profile using adjustCounts, rho was estimated fine.

However upon running adjustCounts I get the following error message:

Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Warning message in tgt * ws <= bucketLims: "longer object length is not a multiple of shorter object length" Warning message in bucketLims/ws: "longer object length is not a multiple of shorter object length" Error: Matrices must have same dimensions in e1 - Matrix(e2) Traceback:

  1. adjustCounts(sc)
  2. adjustCounts(tmp, clusters = FALSE, method = method, roundToInt = FALSE, . verbose = verbose, tol = tol, pCut = pCut)
  3. out - do.call(cbind, lapply(seq(ncol(out)), function(e) alloc(expSoupCnts[e], . out[, e], soupFrac)))
  4. out - do.call(cbind, lapply(seq(ncol(out)), function(e) alloc(expSoupCnts[e], . out[, e], soupFrac)))
  5. callGeneric(e1, Matrix(e2))
  6. eval(call, parent.frame())
  7. eval(call, parent.frame())
  8. e1 - Matrix(e2)
  9. e1 - Matrix(e2)
  10. dimCheck(e1, e2)
  11. stop(gettextf("Matrices must have same dimensions in %s", deparse(sys.call(sys.parent()))), . call. = FALSE, domain = NA)

I don't know where the problem lies unfortunately... Any help troubleshooting would be appreciated.

I tried checking the structure of my CellSoup object (using str(cs)) and got the following output:

List of 6 $ toc : num [1:22390, 1:2951] 0 0 0 0 0 0 1 0 0 0 ... ..- attr(, "dimnames")=List of 2 .. ..$ : chr [1:22390] "Xkr4" "Gm1992" "Rp1" "Sox17" ... .. ..$ : chr [1:2951] "EXT145_AAACCTGCACGGCGTT" "EXT145_AAACCTGCAGACGCAA" "EXT145_AAACCTGCATCGGAAG" "EXT145_AAACCTGTCATGGTCA" ... $ metaData :'data.frame': 2951 obs. of 5 variables: ..$ nUMIs : num [1:2951] 4837 4906 13139 5049 5500 ... ..$ clusters: chr [1:2951] "Cortico" "Lacto" "EC2" "Lacto" ... ..$ UMAP_1 : num [1:2951] 8.93 1.68 10.17 3.26 0.7 ... ..$ UMAP_2 : num [1:2951] -8.53 1.15 3.89 -3.95 -2.23 ... ..$ rho : num [1:2951] 0.018 0.018 0.018 0.018 0.018 0.018 0.018 0.018 0.018 0.018 ... $ nDropUMIs : Named num [1:737280] 7 0 1 1 1 1 15 1 1 43 ... ..- attr(, "names")= chr [1:737280] "AAACCTGAGAAACCAT-1" "AAACCTGAGAAACCGC-1" "AAACCTGAGAAACCTA-1" "AAACCTGAGAAACGAG-1" ... $ soupProfile:'data.frame': 28692 obs. of 2 variables: ..$ est : num [1:28692] 1.47e-06 1.47e-06 0.00 0.00 0.00 ... ..$ counts: num [1:28692] 4 4 0 0 0 13 0 82 72 0 ... $ DR : chr [1:2] "UMAP_1" "UMAP_2" $ fit :List of 6 ..$ dd :'data.frame': 1300 obs. of 14 variables: .. ..$ gene : Factor w/ 100 levels "1810011O10Rik",..: 37 98 30 25 66 13 89 51 8 11 ... .. ..$ passNonExp : logi [1:1300] TRUE TRUE TRUE TRUE TRUE TRUE ... .. ..$ rhoEst : num [1:1300] 0 0 0.0435 0 0 ... .. ..$ rhoIdx : int [1:1300] 1 1 8 1 1 1 7 1 8 1 ... .. ..$ obsCnt : num [1:1300] 0 0 1 0 0 0 2 0 1 0 ... .. ..$ expCnt : num [1:1300] 32.4 37.6 23 34.1 33.4 ... .. ..$ isExpressedFDR: num [1:1300] 1.54e-11 2.57e-13 4.39e-07 3.97e-12 7.00e-12 ... .. ..$ geneIdx : int [1:1300] 6 7 8 12 13 14 16 20 25 29 ... .. ..$ tfidf : num [1:1300] 3.45 3.45 3.42 3.36 3.36 ... .. ..$ soupIdx : int [1:1300] 1414 1211 2027 1345 1371 1982 557 1115 2075 1478 ... .. ..$ soupExp : num [1:1300] 8.22e-05 9.54e-05 5.83e-05 8.66e-05 8.47e-05 ... .. ..$ useEst : logi [1:1300] TRUE TRUE TRUE TRUE TRUE TRUE ... .. ..$ rhoHigh : num [1:1300] 0.114 0.0982 0.2426 0.1082 0.1106 ... .. ..$ rhoLow : num [1:1300] 0 0 0.0011 0 0 ... ..$ priorRho : num 0.05 ..$ priorRhoStdDev: num 0.1 ..$ posterior : num [1:1001] 0 5.96 7.25 8.05 8.64 ... ..$ rhoEst : num 0.018 ..$ rhoFWHM : num [1:2] 0.01 0.043

  • attr(*, "class")= chr [1:2] "list" "SoupChannel"

At first glance everything looks fine to me? Googling the error message didn't help me further..

Thanks!

trueth1206 commented 3 years ago

Hi, thank you for sharing this. I've got the same error and hope someone can kindly help. Many thanks.

constantAmateur commented 3 years ago

I can't tell from the error trace what the problem is likely to be. Are you both using the latest version of SoupX on CRAN? If not could you update to the newest version by running install.packages('SoupX') and see if that fixes the problem.

If the error persists, would you be able to save your SoupChannel object using saveRDS(sc,sc_file.RDS) and share it with. I can then investigate further what the problem might be.

paulineriviere commented 3 years ago

Hi! Thank you for the very useful package. I have got the same error running adjustCounts with the latest version of SoupX (installed today). I have my SoupChannel object saved, how can I share it with you? Thank you in advance for your help!

constantAmateur commented 3 years ago

If this is not too far in the past to still be useful, you can email a link to constantamateur gmail com

constantAmateur commented 3 years ago

Many thanks to @paulineriviere for sending me an example file.

The root cause of this error seems to be when there are a different number of genes in the empty droplets (for cellranger raw_feature_bc_matrix) and the cells (filtered_feature_bc_matrix).

I'll implement an automatic check/fix for this in the next version, but for now if you run the below before adjustCounts this error should be fixed: sc$soupProfile = sc$soupProfile[rownames(sc$toc),]

I'll leave this issue open for now. If the above fix doesn't work for anyone please let me know.

constantAmateur commented 3 years ago

v1.5.1 will now check and raise an error if the table of counts and droplets doesn't use the same genes, which should prevent this issue in future.