constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

plate-based scRNAseq data #3

Closed davidsebfischer closed 6 years ago

davidsebfischer commented 6 years ago

Hi SoupX team!

  1. I would like to use SoupX on plate-based scRNAseq data in which we included empty wells (so that we can use them similarly to empty droplets). I guess there is no reason to assume that the model would not work on this type of data?

  2. Attempting this, I called the SoupChannelList constructor directly with my data (as the 10x input handling function is not appropriate here) and I got an error in inferNonExpressedGenes().

rm(list=ls())
library(Matrix)
library(Seurat)
library(SoupX)
dir_in <- "my path"   # placeholder: input directory
dir_out <- "my path"  # placeholder: output directory
cnts <- t(readMM(paste0(dir_in, "counts_adata_proc.mtx")))
obs <- read.csv(paste0(dir_in, "obs_adata_proc.csv"), as.is = TRUE)
chips <- unique(obs$chip_id) # The plates.
iscell <- Matrix::colSums(cnts) >= 5000 # Cut off for empty wells (this is higher than in 10x data).
scl <- SoupChannelList(
  channels = lapply(chips[1:2], function(chip) {
    SoupChannel(tod = as.matrix(cnts[, obs$chip_id == chip]),
                toc = as.matrix(cnts[, obs$chip_id == chip & iscell]),
                channelName = chip,
                soupRange = c(0, 5000), # Cut off for empty wells (this is higher than in 10x data).
                keepDroplets = TRUE)
  }))
scl <- inferNonExpressedGenes(scl)

Inferring non-expressed genes for channel chip1
Error in split.default(rat@x, rownames(rat)[rat@i + 1]) : group length is 0 but data length > 0

Do you have any intuition as to why this could happen? I cannot share the data unfortunately. Session info for this example:

sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin17.5.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] SoupX_0.2.3 Seurat_2.3.2 cowplot_0.9.2 ggplot2_2.2.1 Matrix_1.2-14

loaded via a namespace (and not attached):
[1] diffusionMap_1.1-0 Rtsne_0.13 VGAM_1.0-5 colorspace_1.3-2 ggridges_0.5.0
[6] class_7.3-14 modeltools_0.2-21 mclust_5.4 htmlTable_1.11.2 base64enc_0.1-3
[11] proxy_0.4-22 rstudioapi_0.7 DRR_0.0.3 bit64_0.9-7 flexmix_2.3-14
[16] prodlim_2018.04.18 mvtnorm_1.0-8 lubridate_1.7.4 ranger_0.10.1 codetools_0.2-15
[21] splines_3.5.0 R.methodsS3_1.7.1 mnormt_1.5-5 robustbase_0.93-0 knitr_1.20
[26] tclust_1.4-1 RcppRoll_0.3.0 jsonlite_1.5 Formula_1.2-3 caret_6.0-80
[31] ica_1.0-2 broom_0.4.4 ddalpha_1.3.3 cluster_2.0.7-1 kernlab_0.9-26
[36] png_0.1-7 R.oo_1.22.0 sfsmisc_1.1-2 compiler_3.5.0 backports_1.1.2
[41] assertthat_0.2.0 lazyeval_0.2.1 lars_1.2 acepack_1.4.1 htmltools_0.3.6
[46] tools_3.5.0 bindrcpp_0.2.2 igraph_1.2.1 gtable_0.2.0 glue_1.2.0
[51] RANN_2.5.1 reshape2_1.4.3 dplyr_0.7.4 Rcpp_0.12.16 trimcluster_0.1-2
[56] gdata_2.18.0 ape_5.1 nlme_3.1-137 iterators_1.0.9 fpc_2.1-11
[61] lmtest_0.9-36 psych_1.8.4 timeDate_3043.102 gower_0.1.2 stringr_1.3.1
[66] irlba_2.3.2 gtools_3.5.0 DEoptimR_1.0-8 zoo_1.8-1 MASS_7.3-50
[71] scales_0.5.0 ipred_0.9-6 doSNOW_1.0.16 parallel_3.5.0 RColorBrewer_1.1-2
[76] yaml_2.1.19 reticulate_1.8 pbapply_1.3-4 gridExtra_2.3 segmented_0.5-3.0
[81] rpart_4.1-13 latticeExtra_0.6-28 stringi_1.2.2 foreach_1.4.4 checkmate_1.8.5
[86] caTools_1.17.1 lava_1.6.1 geometry_0.3-6 dtw_1.20-1 SDMTools_1.1-221
[91] rlang_0.2.0 pkgconfig_2.0.1 prabclus_2.2-6 bitops_1.0-6 lattice_0.20-35
[96] ROCR_1.0-7 purrr_0.2.4 bindr_0.1.1 recipes_0.1.2 htmlwidgets_1.2
[101] bit_1.1-13 tidyselect_0.2.4 CVST_0.2-2 plyr_1.8.4 magrittr_1.5
[106] R6_2.2.2 snow_0.4-2 gplots_3.0.1 Hmisc_4.1-1 dimRed_0.1.0
[111] withr_2.1.2 pillar_1.2.2 foreign_0.8-70 mixtools_1.1.0 fitdistrplus_1.0-9
[116] survival_2.42-3 scatterplot3d_0.3-41 abind_1.4-5 nnet_7.3-12 tsne_0.1-3
[121] tibble_1.4.2 hdf5r_1.0.0 KernSmooth_2.23-15 grid_3.5.0 data.table_1.11.2
[126] FNN_1.1 ModelMetrics_1.1.0 metap_0.9 digest_0.6.15 diptest_0.75-7
[131] tidyr_0.8.0 R.utils_2.6.0 stats4_3.5.0 munsell_0.4.3 magic_1.5-8

Thanks for your help! David

constantAmateur commented 6 years ago

Dear David,

I'm not sure what is causing the error you are receiving. I suspect it has to do with the low number of droplets, as the code expects hundreds of thousands. However, I am not confident that the assumptions of the method hold for plate-based data, so I would not recommend using our method on plate data.

So I'm afraid that, for the moment, using SoupX on plate data is unsupported.
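
For readers who hit the same message: the call quoted in the error, split.default(rat@x, rownames(rat)[rat@i + 1]), groups the non-zero values of a sparse matrix by row name. In base R, split() reports "group length is 0 but data length > 0" when the grouping vector is empty, which is what indexing NULL row names produces. Matrices read with readMM() carry no dimnames, so missing gene names are one possible contributor. Whether that is what happened in this report is an assumption, not something the thread confirms. The sketch below reproduces only the base R behaviour on a toy matrix. The gene and well labels are hypothetical, and it does not call SoupX or address whether the method is appropriate for plate data.

```r
library(Matrix)

# Toy sparse count matrix without dimnames, similar to what readMM() returns.
m <- sparseMatrix(i = c(1, 2, 3), j = c(1, 1, 2), x = c(5, 2, 7), dims = c(3, 2))
stopifnot(is.null(rownames(m)))

# Grouping the non-zero values by row name fails when there are no row names:
# rownames(m)[m@i + 1] evaluates to NULL, a zero-length grouping vector.
res <- tryCatch(
  split(m@x, rownames(m)[m@i + 1]),
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical labels, for illustration only; real data would use gene IDs
# and well barcodes from the accompanying annotation files.
rownames(m) <- paste0("gene", seq_len(nrow(m)))
colnames(m) <- paste0("well", seq_len(ncol(m)))

# With row names in place, the same grouping succeeds.
by_gene <- split(m@x, rownames(m)[m@i + 1])
print(by_gene)
```

If this applies, checking is.null(rownames(cnts)) before building the channels would be a quick diagnostic.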