plger / scDblFinder

Methods for detecting doublets in single-cell sequencing data
https://plger.github.io/scDblFinder/
GNU General Public License v3.0

error in scATAC #95

Closed: LidiaML closed this issue 6 months ago

LidiaML commented 8 months ago

Dear developers,

I am trying to run scDblFinder to identify doublets in scATAC data processed with Seurat and Signac. I used the same parameters you provide in the scATAC vignette, but got an error during the "Training model" step. Any hints?

Congratulations on the work, and many thanks in advance!

MRE (minimal example to reproduce the bug):

`scDblFinder(sce, clusters=NULL, artificialDoublets=1, aggregateFeatures=TRUE, nfeatures=25, processing="normFeatures")`

Traceback

Aggregating features...
Creating ~4033 artificial doublets...
Dimensional reduction
Evaluating kNN...
Training model...
Error in if (length(expected) > 1 && x > min(expected) && x < max(expected)) return(0) : missing value where TRUE/FALSE needed

Session info

` R version 4.1.3 (2022-03-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Rocky Linux 8.8 (Green Obsidian)

Matrix products: default BLAS/LAPACK: /home/lmateo/anaconda3/envs/R/lib/libopenblasp-r0.3.21.so

locale: [1] LC_CTYPE=C LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets [8] methods base

other attached packages: [1] SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0 [3] MatrixGenerics_1.6.0 matrixStats_0.62.0 [5] scDblFinder_1.8.0 Rmagic_2.0.3 [7] future_1.26.1 RibosomalQC_0.1.0 [9] forcats_0.5.2 stringr_1.4.1 [11] purrr_0.3.5 readr_2.1.3 [13] tidyr_1.2.0 tibble_3.1.8 [15] tidyverse_1.3.1 gridExtra_2.3 [17] ggpubr_0.4.0 ggplot2_3.3.6 [19] dplyr_1.0.10 BSgenome.Mmusculus.UCSC.mm10_1.4.3 [21] BSgenome_1.62.0 rtracklayer_1.54.0 [23] Biostrings_2.62.0 XVector_0.34.0 [25] EnsDb.Mmusculus.v79_2.99.0 ensembldb_2.18.1 [27] AnnotationFilter_1.18.0 GenomicFeatures_1.46.1 [29] AnnotationDbi_1.56.2 Biobase_2.54.0 [31] GenomicRanges_1.46.1 GenomeInfoDb_1.30.1 [33] IRanges_2.28.0 S4Vectors_0.32.4 [35] BiocGenerics_0.40.0 Signac_1.8.0 [37] sp_1.5-1 SeuratObject_4.1.0 [39] Seurat_4.1.1 hwriter_1.3.2.1 [41] Matrix_1.5-3

loaded via a namespace (and not attached): [1] rappdirs_0.3.3 scattermore_0.8 [3] bit64_4.0.5 irlba_2.3.5.1 [5] DelayedArray_0.20.0 data.table_1.14.4 [7] rpart_4.1.19 KEGGREST_1.34.0 [9] RCurl_1.98-1.9 generics_0.1.3 [11] ScaledMatrix_1.2.0 cowplot_1.1.1 [13] RSQLite_2.2.18 RANN_2.6.1 [15] bit_4.0.4 tzdb_0.3.0 [17] spatstat.data_3.0-0 xml2_1.3.3 [19] lubridate_1.8.0 httpuv_1.6.6 [21] assertthat_0.2.1 viridis_0.6.2 [23] hms_1.1.2 promises_1.2.0.1 [25] fansi_1.0.3 restfulr_0.0.15 [27] progress_1.2.2 dbplyr_2.2.1 [29] readxl_1.4.1 igraph_1.3.5 [31] DBI_1.1.3 htmlwidgets_1.5.4 [33] spatstat.geom_3.0-3 ellipsis_0.3.2 [35] backports_1.4.1 biomaRt_2.50.0 [37] deldir_1.0-6 sparseMatrixStats_1.6.0 [39] vctrs_0.5.0 here_1.0.1 [41] ROCR_1.0-11 abind_1.4-5 [43] cachem_1.0.6 withr_2.5.0 [45] progressr_0.11.0 sctransform_0.3.5 [47] GenomicAlignments_1.30.0 scran_1.22.1 [49] prettyunits_1.1.1 goftest_1.2-3 [51] cluster_2.1.4 lazyeval_0.2.2 [53] crayon_1.5.2 edgeR_3.36.0 [55] pkgconfig_2.0.3 nlme_3.1-160 [57] vipor_0.4.5 ProtGenerics_1.26.0 [59] rlang_1.0.6 globals_0.16.1 [61] lifecycle_1.0.3 miniUI_0.1.1.1 [63] filelock_1.0.2 BiocFileCache_2.2.0 [65] modelr_0.1.10 rsvd_1.0.5 [67] cellranger_1.1.0 rprojroot_2.0.3 [69] polyclip_1.10-4 lmtest_0.9-40 [71] carData_3.0-5 zoo_1.8-11 [73] reprex_2.0.2 beeswarm_0.4.0 [75] ggridges_0.5.3 png_0.1-7 [77] viridisLite_0.4.0 rjson_0.2.21 [79] bitops_1.0-7 KernSmooth_2.23-20 [81] tidyseurat_0.5.3 blob_1.2.3 [83] DelayedMatrixStats_1.16.0 parallelly_1.32.1 [85] spatstat.random_3.0-1 rstatix_0.7.1 [87] ggsignif_0.6.4 beachmat_2.10.0 [89] scales_1.2.1 memoise_2.0.1 [91] magrittr_2.0.3 plyr_1.8.8 [93] ica_1.0-3 zlibbioc_1.40.0 [95] compiler_4.1.3 dqrng_0.3.0 [97] BiocIO_1.4.0 RColorBrewer_1.1-3 [99] fitdistrplus_1.1-8 Rsamtools_2.10.0 [101] cli_3.4.1 listenv_0.8.0 [103] patchwork_1.1.1 pbapply_1.5-0 [105] MASS_7.3-58.1 mgcv_1.8-41 [107] tidyselect_1.2.0 stringi_1.7.8 [109] yaml_2.3.6 locfit_1.5-9.6 [111] BiocSingular_1.10.0 ggrepel_0.9.1 [113] 
grid_4.1.3 fastmatch_1.1-3 [115] tools_4.1.3 future.apply_1.9.0 [117] rstudioapi_0.14 bluster_1.4.0 [119] metapod_1.2.0 Rtsne_0.16 [121] digest_0.6.30 rgeos_0.5-9 [123] shiny_1.7.3 Rcpp_1.0.9 [125] car_3.1-1 broom_1.0.1 [127] scuttle_1.4.0 later_1.2.0 [129] RcppAnnoy_0.0.20 httr_1.4.4 [131] colorspace_2.0-3 rvest_1.0.3 [133] XML_3.99-0.12 fs_1.5.2 [135] tensor_1.5 reticulate_1.26 [137] splines_4.1.3 statmod_1.4.37 [139] uwot_0.1.14 RcppRoll_0.3.0 [141] spatstat.utils_3.0-1 scater_1.22.0 [143] xgboost_1.7.4.1 plotly_4.10.1 [145] xtable_1.8-4 jsonlite_1.8.3 [147] R6_2.5.1 pillar_1.8.1 [149] htmltools_0.5.3 mime_0.12 [151] glue_1.6.2 fastmap_1.1.0 [153] BiocParallel_1.28.3 BiocNeighbors_1.12.0 [155] codetools_0.2-18 utf8_1.2.2 [157] lattice_0.20-45 spatstat.sparse_3.0-0 [159] curl_4.3.3 ggbeeswarm_0.7.2 [161] leiden_0.4.3 colorRamps_2.3.1 [163] limma_3.50.3 survival_3.4-0 [165] ttservice_0.1.2 munsell_0.5.0 [167] GenomeInfoDbData_1.2.7 haven_2.5.1 [169] reshape2_1.4.4 gtable_0.3.1 [171] spatstat.core_2.4-4 `
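For readers unfamiliar with this message: "missing value where TRUE/FALSE needed" is R's generic error when an `if` condition evaluates to `NA`. The sketch below reproduces it with the same condition that appears in the traceback. The function name `in_expected_range` and the input values are hypothetical stand-ins, not scDblFinder internals, and this does not show which value was actually `NA` in the reported run.

```r
# Hypothetical stand-in that mirrors the condition quoted in the error message.
in_expected_range <- function(x, expected) {
  # If x (or any element of expected) is NA, the && chain yields NA,
  # and `if (NA)` raises "missing value where TRUE/FALSE needed".
  if (length(expected) > 1 && x > min(expected) && x < max(expected)) return(0)
  1
}

# A value inside the range works as intended.
ok <- in_expected_range(15, c(10, 20))

# An NA input triggers the error; capture its message instead of stopping.
msg <- tryCatch(
  in_expected_range(NA_real_, c(10, 20)),
  error = function(e) conditionMessage(e)
)
print(msg)
```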

plger commented 8 months ago

Hi,

You're using an old version of scDblFinder (and, it seems, of Bioconductor generally): you're on version 1.8, while the current release is 1.16. As far as I remember, the ATAC variants were not even officially supported back then. Please try updating to a more recent version. I'd recommend updating the whole of Bioconductor, because the single-cell field has been moving fast, but if for some reason you can't, you should be able to simply install scDblFinder from GitHub using `BiocManager::install("plger/scDblFinder")`.
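A minimal sketch of how one might check whether the installed version is older than the release mentioned above before deciding to update. The helper `needs_update` and the variable `min_version` are illustrative names, not part of scDblFinder or BiocManager; the threshold of 1.16 is taken from this reply, and the script only prints a suggestion rather than installing anything.

```r
# Release version named in the reply above; adjust as newer releases appear.
min_version <- "1.16.0"

# Hypothetical helper: TRUE if `installed` is older than `required`.
# package_version() compares components numerically, so 1.8 < 1.16
# (a plain string comparison would get this wrong).
needs_update <- function(installed, required = min_version) {
  package_version(installed) < package_version(required)
}

if (requireNamespace("scDblFinder", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("scDblFinder"))
  if (needs_update(installed)) {
    message("scDblFinder ", installed, " is older than ", min_version,
            "; consider updating, e.g. BiocManager::install(\"plger/scDblFinder\")")
  } else {
    message("scDblFinder ", installed, " is recent enough")
  }
} else {
  message("scDblFinder is not installed")
}
```

Note that installing from GitHub into an old Bioconductor release may still pull in dependencies that expect newer versions, which is one reason updating Bioconductor as a whole is the preferred route.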