plger / scDblFinder

Methods for detecting doublets in single-cell sequencing data
https://plger.github.io/scDblFinder/
GNU General Public License v3.0

rbind error when running multiple samples #23

Closed: drhochbaum closed this 4 years ago

drhochbaum commented 4 years ago

Hello, I am able to run the development version of scDblFinder with one sample, but when I use an SCE with multiple samples (named in colData), I get the following error. It occurs whether I pass an SCE or a matrix with a vector of sample IDs:

masterSCE = scDblFinder(sce = sce, samples = "sample_ID", nfeatures = 1000, score = 'xgb', verbose = TRUE)
Error in .format_mismatch_message(x_colnames, object_colnames) :
  the DataFrame objects to rbind do not have the same column names ('ratio.k20' is unique)

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4 readr_1.4.0 tidyr_1.1.2 tibble_3.0.3 ggplot2_3.3.2 tidyverse_1.3.0 DropletUtils_1.9.13 SingleCellExperiment_1.11.8
[12] SummarizedExperiment_1.19.9 Biobase_2.49.1 GenomicRanges_1.41.6 GenomeInfoDb_1.25.11 IRanges_2.23.10 S4Vectors_0.27.13 BiocGenerics_0.35.4 MatrixGenerics_1.1.3 matrixStats_0.57.0 scDblFinder_1.3.9

loaded via a namespace (and not attached):
[1] ggbeeswarm_0.6.0 colorspace_1.4-1 ellipsis_0.3.1 rprojroot_1.3-2 scuttle_0.99.18 bluster_0.99.1 XVector_0.29.3 BiocNeighbors_1.7.0 fs_1.5.0 yaImpute_1.0-32 rstudioapi_0.11 remotes_2.2.0
[13] fansi_0.4.1 lubridate_1.7.9 xml2_1.3.2 R.methodsS3_1.8.1 scater_1.17.5 jsonlite_1.7.1 pROC_1.16.2 broom_0.7.1 dbplyr_1.4.4 R.oo_1.24.0 HDF5Array_1.17.13 BiocManager_1.30.10
[25] compiler_4.0.2 httr_1.4.2 dqrng_0.2.1 backports_1.1.10 assertthat_0.2.1 Matrix_1.2-18 limma_3.45.14 cli_2.0.2 BiocSingular_1.5.2 prettyunits_1.1.1 tools_4.0.2 rsvd_1.0.3
[37] igraph_1.2.5 gtable_0.3.0 glue_1.4.2 GenomeInfoDbData_1.2.4 Rcpp_1.0.5 cellranger_1.1.0 vctrs_0.3.4 rhdf5filters_1.1.3 DelayedMatrixStats_1.11.1 ps_1.3.4 rvest_0.3.6 beachmat_2.5.8
[49] lifecycle_0.2.0 irlba_2.3.3 statmod_1.4.34 edgeR_3.31.4 zlibbioc_1.35.0 scales_1.1.1 hms_0.5.3 rhdf5_2.33.10 yaml_2.2.1 curl_4.3 gridExtra_2.3 stringi_1.5.3
[61] scran_1.17.20 pkgbuild_1.1.0 BiocParallel_1.23.2 rlang_0.4.7 pkgconfig_2.0.3 bitops_1.0-6 lattice_0.20-41 Rhdf5lib_1.11.3 processx_3.4.4 tidyselect_1.1.0 plyr_1.8.6 magrittr_1.5
[73] R6_2.4.1 generics_0.0.2 DelayedArray_0.15.15 DBI_1.1.0 pillar_1.4.6 haven_2.3.1 withr_2.3.0 RCurl_1.98-1.2 modelr_0.1.8 crayon_1.3.4 intrinsicDimension_1.2.0 xgboost_1.2.0.1
[85] viridis_0.5.1 locfit_1.5-9.4 grid_4.0.2 readxl_1.3.1 data.table_1.13.0 blob_1.2.1 callr_3.4.4 reprex_0.3.0 R.utils_2.10.1 scds_1.5.0 munsell_0.5.0 beeswarm_0.2.3
[97] viridisLite_0.3.0 vipor_0.4.5

plger commented 4 years ago

Hi, thanks for reporting this. This is caused by variation in the number of cells per sample, and needless to say it shouldn't happen. I've pushed a fix to the devel branch; install it with
BiocManager::install("plger/scDblFinder", ref="devel")
I'll push it to master and Bioconductor once a few tests have run.
Pierre-Luc
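For readers who hit the same message, the failure mode can be reproduced in miniature. The sketch below is not the fix that was pushed to the package. It uses base R data.frames rather than the S4Vectors DataFrame objects named in the error, and it assumes (the thread does not confirm this) that a column such as 'ratio.k20' is simply absent from the results of a smaller sample. The other column names and the helper `rbind_fill` are hypothetical.

```r
# Hypothetical per-sample result tables. Only 'ratio.k20' comes from the
# error message in this thread; the other column names are made up.
res_a <- data.frame(score = c(0.1, 0.9),
                    ratio.k10 = c(0.2, 0.8),
                    ratio.k20 = c(0.3, 0.7))
res_b <- data.frame(score = 0.5,
                    ratio.k10 = 0.4)  # assumed smaller sample: no 'ratio.k20'

# A plain rbind fails because the two tables do not share the same columns.
plain <- tryCatch(rbind(res_a, res_b), error = function(e) e)

# Hypothetical helper: pad each table with NA for the columns it lacks,
# put the columns in a common order, then bind the rows.
rbind_fill <- function(tables) {
  all_cols <- unique(unlist(lapply(tables, colnames)))
  padded <- lapply(tables, function(tab) {
    for (col in setdiff(all_cols, colnames(tab))) {
      tab[[col]] <- rep(NA_real_, nrow(tab))
    }
    tab[, all_cols, drop = FALSE]
  })
  do.call(rbind, padded)
}

combined <- rbind_fill(list(res_a, res_b))
```

Padding with NA keeps every row and makes the gap visible downstream; whether missing values are acceptable there depends on how the combined table is used.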

plger commented 4 years ago

The fix is now in the master branch and should be in Bioc devel with the next build.