satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.29k stars 912 forks source link

Possible bug in subset() for multimodel objects #5148

Closed TerezaClarence closed 2 years ago

TerezaClarence commented 3 years ago

Hello,

I am experiencing issue with subsetting similar to previously reported here: https://github.com/satijalab/seurat/issues/4069, however reinstalling as suggested did not help in my case.

Thank you very much in advance for looking into this.

> snRNA_MDT_fi
An object of class Seurat 
140080 features across 12251 samples within 2 assays 
Active assay: RNA (36601 features, 0 variable features)
 1 other assay present: ATAC

> colnames(snRNA_MDT_fi@meta.data)
 [1] "orig.ident"       "nCount_RNA"       "nFeature_RNA"     "nCount_ATAC"      "nFeature_ATAC"    "sex"             
 [7] "age"              "percent.mt"       "mitoRatio"        "percent.ribo"     "riboRatio"        "percent.hb"      
[13] "log10GenesPerUMI"

> subset(x = snRNA_MDT_fi, subset = `orig.ident` != 'MDT')
Error in FetchData(object = object, vars = unique(x = expr.char[vars.use]),  : 

> subset(x = snRNA_MDT_fi, subset = orig.ident != 'MDT')
Error in FetchData(object = object, vars = unique(x = expr.char[vars.use]),  : 
  None of the requested variables were found: 
> packageVersion('Signac')
[1] '1.4.0.9000'
> packageVersion('Seurat')
[1] '4.0.4'
> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.46.3                    GSEABase_1.52.1                   graph_1.68.0                     
 [4] annotate_1.68.0                   scRNAseq_2.4.0                    scran_1.18.7                     
 [7] SingleCellExperiment_1.12.0       celldex_1.0.0                     SingleR_1.4.1                    
[10] SummarizedExperiment_1.20.0       MatrixGenerics_1.2.1              matrixStats_0.61.0               
[13] readxl_1.3.1                      remotes_2.4.0                     BSgenome.Hsapiens.UCSC.hg38_1.4.3
[16] Matrix_1.3-4                      cerebroViz_1.0                    XML_3.99-0.8                     
[19] EnsDb.Hsapiens.v86_2.99.0         ensembldb_2.14.1                  AnnotationFilter_1.14.0          
[22] GenomicFeatures_1.42.3            AnnotationDbi_1.52.0              cowplot_1.1.1                    
[25] hdf5r_1.3.4                       patchwork_1.1.1                   Signac_1.4.0.9000                
[28] SpectralTAD_1.6.0                 SeuratObject_4.0.2                Seurat_4.0.4                     
[31] genphen_1.18.0                    Rcpp_1.0.7                        gwascat_2.22.0                   
[34] GWASTools_1.36.0                  Biobase_2.50.0                    OmicCircos_1.28.0                
[37] Gviz_1.34.1                       ggmap_3.0.0                       reshape2_1.4.4                   
[40] rsvg_2.1.2                        RIdeogram_0.2.2                   ggVennDiagram_1.1.4              
[43] colorspace_2.0-2                  BSgenome_1.58.0                   rtracklayer_1.50.0               
[46] Biostrings_2.58.0                 XVector_0.30.0                    iheatmapr_0.5.1                  
[49] karyoploteR_1.16.0                regioneR_1.22.0                   GenomicRanges_1.42.0             
[52] GenomeInfoDb_1.26.7               IRanges_2.24.1                    S4Vectors_0.28.1                 
[55] BiocGenerics_0.36.1               data.table_1.14.2                 NbClust_3.0                      
[58] ggrepel_0.9.1                     ggdendro_0.1.22                   dendextend_1.15.1                
[61] cluster_2.1.2                     forcats_0.5.1                     stringr_1.4.0                    
[64] purrr_0.3.4                       readr_2.0.1                       tibble_3.1.4                     
[67] tidyverse_1.3.1                   bio3d_2.4-2                       lattice_0.20-44                  
[70] gridExtra_2.3                     ggridges_0.5.3                    ggplot2movies_0.0.1              
[73] tidyr_1.1.4                       ComplexHeatmap_2.6.2              RColorBrewer_1.1-2               
[76] corrplot_0.90                     pheatmap_1.0.12                   d3heatmap_0.6.1.3                
[79] factoextra_1.0.7                  gplots_3.1.1                      dplyr_1.0.7                      
[82] ggpubr_0.4.0                      ggplot2_3.3.5                    

loaded via a namespace (and not attached):
  [1] pbapply_1.5-0                 haven_2.4.3                   vctrs_0.3.8                  
  [4] V8_3.4.2                      usethis_2.0.1                 mgcv_1.8-36                  
  [7] blob_1.2.2                    survival_3.2-13               spatstat.data_2.1-0          
 [10] later_1.3.0                   DBI_1.1.1                     rappdirs_0.3.3               
 [13] uwot_0.1.10                   dqrng_0.3.0                   gdsfmt_1.26.1                
 [16] jpeg_0.1-9                    zlibbioc_1.36.0               MatrixModels_0.5-0           
 [19] htmlwidgets_1.5.4             GlobalOptions_0.1.2           future_1.22.1                
 [22] inline_0.3.19                 leiden_0.3.9                  irlba_2.3.3                  
 [25] KernSmooth_2.23-20            promises_1.2.0.1              limma_3.46.0                 
 [28] DelayedArray_0.16.3           pkgload_1.2.2                 RcppParallel_5.1.4           
 [31] Hmisc_4.5-0                   fs_1.5.0                      fastmatch_1.1-3              
 [34] ranger_0.13.1                 digest_0.6.28                 png_0.1-7                    
 [37] bluster_1.0.0                 qlcMatrix_0.9.7               sctransform_0.3.2            
 [40] pkgconfig_2.0.3               docopt_0.7.1                  DelayedMatrixStats_1.12.3    
 [43] iterators_1.0.13              reticulate_1.22               rstan_2.21.2                 
 [46] circlize_0.4.13               GetoptLong_1.0.5              xfun_0.26                    
 [49] zoo_1.8-9                     tidyselect_1.1.1              DNAcopy_1.64.0               
 [52] ica_1.0-2                     operator.tools_1.6.3          GWASExactHW_1.01             
 [55] viridisLite_0.4.0             pkgbuild_1.2.0                rlang_0.4.11                 
 [58] RVenn_1.1.0                   glue_1.4.2                    modelr_0.1.8                 
 [61] ggseqlogo_0.1                 ggsignif_0.6.3                SparseM_1.81                 
 [64] httpuv_1.6.3                  class_7.3-19                  BiocNeighbors_1.8.2          
 [67] quantsmooth_1.56.0            jsonlite_1.7.2                bit_4.0.4                    
 [70] mime_0.12                     Rsamtools_2.6.0               stringi_1.7.4                
 [73] processx_3.5.2                RcppRoll_0.3.0                spatstat.sparse_2.0-0        
 [76] scattermore_0.7               bitops_1.0-7                  cli_3.0.1                    
 [79] RSQLite_2.2.8                 rstudioapi_0.13               logistf_1.24                 
 [82] GenomicAlignments_1.26.0      nlme_3.1-153                  fastcluster_1.2.3            
 [85] locfit_1.5-9.4                VariantAnnotation_1.36.0      listenv_0.8.0                
 [88] SnowballC_0.7.0               miniUI_0.1.1.1                dbplyr_2.1.1                 
 [91] sessioninfo_1.1.1             lifecycle_1.0.1               ExperimentHub_1.16.1         
 [94] munsell_0.5.0                 cellranger_1.1.0              caTools_1.18.2               
 [97] codetools_0.2-18              lmtest_0.9-38                 htmlTable_2.2.1              
[100] xtable_1.8-4                  ROCR_1.0-11                   BiocManager_1.30.16          
[103] StanHeaders_2.21.0-7          abind_1.4-5                   farver_2.1.0                 
[106] parallelly_1.28.1             AnnotationHub_2.22.1          RANN_2.6.1                   
[109] askpass_1.1                   biovizBase_1.38.0             sparsesvd_0.2                
[112] RcppAnnoy_0.0.19              goftest_1.2-2                 dichromat_2.0-0              
[115] future.apply_1.8.1            ellipsis_0.3.2                prettyunits_1.1.1            
[118] rPref_1.3                     lubridate_1.7.10              reprex_2.0.1                 
[121] igraph_1.2.6                  slam_0.1-48                   testthat_3.0.4               
[124] spatstat.utils_2.2-0          htmltools_0.5.2               BiocFileCache_1.14.0         
[127] yaml_2.2.1                    loo_2.4.1                     interactiveDisplayBase_1.28.0
[130] utf8_1.2.2                    plotly_4.9.4.1                e1071_1.7-9                  
[133] scuttle_1.0.4                 foreign_0.8-81                withr_2.4.2                  
[136] fitdistrplus_1.1-6            BiocParallel_1.24.1           bit64_4.0.5                  
[139] foreach_1.5.1                 ProtGenerics_1.22.0           spatstat.core_2.3-0          
[142] rsvd_1.0.5                    devtools_2.4.2                bamsignals_1.22.0            
[145] memoise_2.0.0                 rio_0.5.27                    tzdb_0.1.2                   
[148] callr_3.7.0                   ps_1.6.0                      curl_4.3.2                   
[151] fansi_0.5.0                   edgeR_3.32.1                  tensor_1.5                   
[154] conquer_1.0.2                 checkmate_2.0.0               cachem_1.0.6                 
[157] desc_1.3.0                    deldir_0.2-10                 rjson_0.2.20                 
[160] openxlsx_4.2.4                rstatix_0.7.0                 clue_0.3-59                  
[163] rprojroot_2.0.2               tools_4.0.4                   sandwich_3.0-1               
[166] magrittr_2.0.1                RCurl_1.98-1.5                proxy_0.4-26                 
[169] mice_3.13.0                   car_3.0-11                    bezier_1.1.2                 
[172] xml2_1.3.2                    httr_1.4.2                    assertthat_0.2.1             
[175] globals_0.14.0                R6_2.5.1                      nnet_7.3-16                  
[178] progress_1.2.2                pbdZMQ_0.3-5                  statmod_1.4.36               
[181] gtools_3.9.2                  shape_1.4.6                   lsa_0.73.2                   
[184] grImport2_0.2-0               beachmat_2.6.4                BiocVersion_3.12.0           
[187] BiocSingular_1.6.0            splines_4.0.4                 carData_3.0-4                
[190] generics_0.1.0                base64enc_0.1-3               pillar_1.6.3                 
[193] tweenr_1.0.2                  sp_1.4-5                      GenomeInfoDbData_1.2.4       
[196] plyr_1.8.6                    gtable_0.3.0                  rvest_1.0.1                  
[199] zip_2.2.0                     RgoogleMaps_1.4.5.3           knitr_1.34                   
[202] latticeExtra_0.6-29           fastmap_1.1.0                 Cairo_1.5-12.2               
[205] doParallel_1.0.16             quantreg_5.86                 broom_0.7.9                  
[208] openssl_1.4.5                 scales_1.1.1                  backports_1.2.1              
[211] hms_1.1.0                     ggforce_0.3.3                 Rtsne_0.15                   
[214] formula.tools_1.7.1           shiny_1.7.0                   polyclip_1.10-0              
[217] snpStats_1.40.0               lazyeval_0.2.2                Formula_1.2-4                
[220] crayon_1.4.1                  MASS_7.3-54                   sparseMatrixStats_1.2.1      
[223] viridis_0.6.1                 rpart_4.1-15                  compiler_4.0.4               
[226] spatstat.geom_2.2-2 
samuel-marsh commented 3 years ago

Hi,

Not member of dev team but hopefully can be helpful. The error you are receiving seems to be different than the one reported in that issue. Can you post what the output of this code is for your object:

unique(snRNA_MDT_fi@meta.data$orig.ident)

Best, Sam

TerezaClarence commented 3 years ago
> unique(snRNA_MDT_fi@meta.data$orig.ident)
[1] "11-MDT" "100-MDT" "MDT" "CN"
TerezaClarence commented 3 years ago

Hello,

I am sorry, here is the correct issue https://github.com/satijalab/seurat/issues/1647 Thank you for any help!

Best, Tereza

samuel-marsh commented 3 years ago

Hi Tereza,

So I'm not sure what's causing the error in this case as the error you are receiving is like what occurs if you try and subset on a meta.data variable that doesn't exist but orig.ident always exists as does "MDT". I'm unfortunately unable to reproduce issue so not really sure what is going on but might be specific to your Seurat Object (or I'm missing something hopefully the devs can pick up on).

Just curious though what happens if you run:

subset_obj <- subset(x = snRNA_MDT_fi, subset = orig.ident %in% "MDT", invert = TRUE)

best, Sam

TerezaClarence commented 3 years ago

Thank you very much. Unfortunately, this still does not work for me. I did try to reinstall Seurat and Signac, and reboot the RStudio, however the issue persists for any subset() from metadata (not only ori.ident). Also, I have noticed, that I am unable to reproduce this tutorial: https://satijalab.org/signac/articles/pbmc_multiomic.html#linking-peaks-to-genes-1

Quality control section

DefaultAssay(pbmc) <- "ATAC"

pbmc <- NucleosomeSignal(pbmc)
pbmc <- TSSEnrichment(pbmc)

pbmc <- subset(
  x = pbmc,
  subset = nCount_ATAC < 100000 &
    nCount_RNA < 25000 &
    nCount_ATAC > 1000 &
    nCount_RNA > 1000 &
    nucleosome_signal < 2 &
    TSS.enrichment > 1
)
pbmc

I end up with only 1 assay present instead of both RNA and ATAC (I do experience similar problems with my own multimodal datasets). If I omit this section:

DefaultAssay(pbmc) <- "ATAC"

pbmc <- NucleosomeSignal(pbmc)
pbmc <- TSSEnrichment(pbmc)

and simply subset by:

pbmc <- subset(
  x = pbmc,
  subset = nCount_ATAC < 100000 &
    nCount_RNA < 25000 &
    nCount_ATAC > 1000 &
    nCount_RNA > 1000)
pbmc

I am still able to have 2 assays (RNA, ATAC). I suspect there might be something specific with latest version of Signac? Thank you so much for looking into this, much appreciated!

Best, Tereza

timoast commented 3 years ago

@TerezaClarence I could not reproduce this issue

What is the output of head(pbmc)?

no-response[bot] commented 2 years ago

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.