satijalab / seurat-data

Dataset distribution for Seurat
GNU General Public License v3.0

Request: Examples of Seurat datasets v2 & v4 #37

Closed evanbiederstedt closed 3 years ago

evanbiederstedt commented 3 years ago

Hi @mojaveazure @timoast @andrewwbutler

Hope all is well.

For our own internal development purposes, I was hoping to access examples of Seurat v2, v3, & v4 from this package.

            Dataset seurat
 1:          bmcite  3.2.2
 2:            cbmc  3.1.4
 3: celegans.embryo   <NA>
 4:        hcabm40k   <NA>
 5:            ifnb   <NA>
 6:           panc8   <NA>
 7:          pbmc3k  3.1.4
 8:    pbmcMultiome  4.0.0
 9:         pbmcsca   <NA>
10:         ssHippo   <NA>
11:        stxBrain   <NA>
12:       stxKidney   <NA>
13:     thp1.eccite   <NA>

It looks like there are several for v3, which is great.

Could we get a few more examples for version 2 and version 4 datasets?

A related question: is there documentation detailing the differences between these Seurat objects? It would help me to understand the difference between the Seurat version used as software and the version of the Seurat objects that users have saved (if the two differ).

My understanding is that there's more or less a correspondence between v2 and v3 objects, based on the section "Seurat v2.X vs v3.X" here: https://satijalab.org/seurat/articles/essential_commands.html

(Apologies if much of this is documented, and I missed it...)

Part of the reason I'm asking relates to these: https://github.com/kharchenkolab/conos/issues/101 https://github.com/satijalab/seurat-wrappers/blob/master/docs/conos.md

Based on v3, it appears this works:

library(Seurat)
library(SeuratData)
library(SeuratWrappers)
library(conos)
library(dplyr)

InstallData("ifnb")
data("ifnb")
ifnb.panel <- SplitObject(ifnb, split.by = "stim")
for (i in seq_along(ifnb.panel)) {
    ifnb.panel[[i]] <- NormalizeData(ifnb.panel[[i]]) %>% FindVariableFeatures() %>% ScaleData() %>% 
        RunPCA(verbose = FALSE)
}
ifnb.con <- Conos$new(ifnb.panel)
ifnb.con$buildGraph(k = 15, k.self = 5, space = "PCA", ncomps = 30, n.odgenes = 2000, matching.method = "mNN", 
    metric = "angular", score.component.variance = TRUE, verbose = TRUE)
ifnb.con$findCommunities()
ifnb.con$embedGraph()
ifnb <- as.Seurat(ifnb.con)
DimPlot(ifnb, reduction = "largeVis", group.by = c("stim", "ident", "seurat_annotations"), ncol = 3)

With the same dataset accessed with v4, it fails.

Thank you for any help with this. Best, Evan

andrewwbutler commented 3 years ago

Hi Evan,

We don't actively support v2 anymore and would recommend just updating the Seurat objects to the latest version using UpdateSeuratObject. For v4 objects, there is very little in terms of the object structure that changed but similarly, you should be able to run UpdateSeuratObject on any of the v3 objects there. You can find some documentation on the object structure here.

As far as the seurat-wrappers example, can you clarify the cases where it works and doesn't work? It looks like that vignette was last built July 2019 (probably corresponding to conos v.1.2.0) so it's possible that updates to either conos or Seurat could have affected it since. However, here's a reprex that seems to work for me using the latest Seurat v4.

library(Seurat)
#> Attaching SeuratObject
suppressWarnings(library(SeuratData))
#> Registered S3 method overwritten by 'cli':
#>   method     from         
#>   print.boxx spatstat.geom
#> ── Installed datasets ───────────────────────────────────── SeuratData v0.2.1 ──
#> ✓ ifnb     3.1.0                        ✓ pbmc3k   3.1.4
#> ✓ panc8    3.0.2                        ✓ stxBrain 0.1.1
#> ────────────────────────────────────── Key ─────────────────────────────────────
#> ✓ Dataset loaded successfully
#> > Dataset built with a newer version of Seurat than installed
#> ❓ Unknown version of Seurat installed
library(SeuratWrappers)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(conos)
#> Loading required package: Matrix
#> Loading required package: igraph
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:dplyr':
#> 
#>     as_data_frame, groups, union
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union
data("ifnb")
ifnb.panel <- SplitObject(ifnb, split.by = "stim")
for (i in 1:length(ifnb.panel)) {
  ifnb.panel[[i]] <- NormalizeData(ifnb.panel[[i]]) %>% FindVariableFeatures() %>% ScaleData() %>% 
    RunPCA(verbose = FALSE)
}
#> Centering and scaling data matrix
#> Centering and scaling data matrix
ifnb.con <- Conos$new(ifnb.panel)
ifnb.con$buildGraph(k = 15, k.self = 5, space = "PCA", ncomps = 30, n.odgenes = 2000, matching.method = "mNN", 
                    metric = "angular", score.component.variance = TRUE, verbose = TRUE)
#> found 0 out of 1 cached PCA space pairs ...
#> running 1 additional PCA space pairs
#> Warning in scaledMatricesSeuratV3(so.objs = samples, data.type = data.type, :
#> Seurat doesn't support variance scaling
#> .
#>  done
#> inter-sample links using mNN
#> Warning in scaledMatricesSeuratV3(so.objs = samples, data.type = data.type, :
#> Seurat doesn't support variance scaling
#> .
#>  done
#> local pairs
#>  done
#> building graph .
#> .
#> done
ifnb.con$findCommunities()
ifnb.con$embedGraph()
#> Estimating embeddings.
ifnb <- as.Seurat(ifnb.con)
#> Merging 2 samples
#> Adding pairwise alignments to 'conos.pairs' in miscellaneous data
#> Adding graph as 'RNA_mnn'
#> Warning: Adding a Graph without an assay associated with it
#> Adding graph embedding as largeVis
#> Adding clustering information
DimPlot(ifnb, reduction = "largeVis", group.by = c("stim", "ident", "seurat_annotations"), ncol = 3)

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] conos_1.4.1               igraph_1.2.6             
#>  [3] Matrix_1.3-4              dplyr_1.0.6              
#>  [5] SeuratWrappers_0.3.0      stxBrain.SeuratData_0.1.1
#>  [7] pbmc3k.SeuratData_3.1.4   panc8.SeuratData_3.0.2   
#>  [9] ifnb.SeuratData_3.1.0     SeuratData_0.2.1         
#> [11] SeuratObject_4.0.2        Seurat_4.0.3             
#> 
#> loaded via a namespace (and not attached):
#>   [1] N2R_0.1.1             circlize_0.4.12       backports_1.2.1      
#>   [4] plyr_1.8.6            lazyeval_0.2.2        splines_4.1.0        
#>   [7] listenv_0.8.0         scattermore_0.7       ggplot2_3.3.4        
#>  [10] digest_0.6.27         foreach_1.5.1         htmltools_0.5.1.1    
#>  [13] fansi_0.5.0           magrittr_2.0.1        tensor_1.5           
#>  [16] cluster_2.1.2         doParallel_1.0.16     ROCR_1.0-11          
#>  [19] remotes_2.3.0         ComplexHeatmap_2.8.0  globals_0.14.0       
#>  [22] matrixStats_0.59.0    spatstat.sparse_2.0-0 sccore_0.1.3         
#>  [25] colorspace_2.0-1      rappdirs_0.3.3        ggrepel_0.9.1        
#>  [28] xfun_0.24             crayon_1.4.1          jsonlite_1.7.2       
#>  [31] spatstat.data_2.1-0   survival_3.2-11       zoo_1.8-9            
#>  [34] iterators_1.0.13      glue_1.4.2            polyclip_1.10-0      
#>  [37] gtable_0.3.0          leiden_0.3.8          GetoptLong_1.0.5     
#>  [40] leidenAlg_0.1.1       shape_1.4.6           future.apply_1.7.0   
#>  [43] BiocGenerics_0.38.0   abind_1.4-5           scales_1.1.1         
#>  [46] DBI_1.1.1             miniUI_0.1.1.1        Rcpp_1.0.6           
#>  [49] viridisLite_0.4.0     xtable_1.8-4          clue_0.3-59          
#>  [52] reticulate_1.20       spatstat.core_2.1-2   rsvd_1.0.5           
#>  [55] stats4_4.1.0          htmlwidgets_1.5.3     httr_1.4.2           
#>  [58] RColorBrewer_1.1-2    ellipsis_0.3.2        ica_1.0-2            
#>  [61] farver_2.1.0          pkgconfig_2.0.3       uwot_0.1.10          
#>  [64] deldir_0.2-10         utf8_1.2.1            labeling_0.4.2       
#>  [67] tidyselect_1.1.1      rlang_0.4.11          reshape2_1.4.4       
#>  [70] later_1.2.0           pbmcapply_1.5.0       munsell_0.5.0        
#>  [73] tools_4.1.0           cli_2.5.0             generics_0.1.0       
#>  [76] ggridges_0.5.3        evaluate_0.14         stringr_1.4.0        
#>  [79] fastmap_1.1.0         yaml_2.2.1            goftest_1.2-2        
#>  [82] knitr_1.33            fs_1.5.0              fitdistrplus_1.1-5   
#>  [85] purrr_0.3.4           RANN_2.6.1            pbapply_1.4-3        
#>  [88] future_1.21.0         nlme_3.1-152          mime_0.10            
#>  [91] grr_0.9.5             compiler_4.1.0        rstudioapi_0.13      
#>  [94] plotly_4.9.3          png_0.1-7             spatstat.utils_2.1-0 
#>  [97] reprex_2.0.0          tibble_3.1.2          stringi_1.6.2        
#> [100] highr_0.9             lattice_0.20-44       styler_1.4.1         
#> [103] vctrs_0.3.8           pillar_1.6.1          lifecycle_1.0.0      
#> [106] BiocManager_1.30.16   GlobalOptions_0.1.2   spatstat.geom_2.1-0  
#> [109] lmtest_0.9-38         RcppAnnoy_0.0.18      data.table_1.14.0    
#> [112] cowplot_1.1.1         irlba_2.3.3           Matrix.utils_0.9.8   
#> [115] httpuv_1.6.1          patchwork_1.1.1       R6_2.5.0             
#> [118] promises_1.2.0.1      KernSmooth_2.23-20    gridExtra_2.3        
#> [121] IRanges_2.26.0        parallelly_1.25.0     codetools_0.2-18     
#> [124] MASS_7.3-54           assertthat_0.2.1      rjson_0.2.20         
#> [127] withr_2.4.2           sctransform_0.3.2     S4Vectors_0.30.0     
#> [130] mgcv_1.8-36           parallel_4.1.0        grid_4.1.0           
#> [133] rpart_4.1-15          tidyr_1.1.3           rmarkdown_2.8        
#> [136] Cairo_1.5-12.2        Rtsne_0.15            shiny_1.6.0

Created on 2021-06-17 by the reprex package (v2.0.0)

evanbiederstedt commented 3 years ago

Hi @andrewwbutler

Thanks for the help!

> We don't actively support v2 anymore and would recommend just updating the Seurat objects to the latest version using UpdateSeuratObject. For v4 objects, there is very little in terms of the object structure that changed but similarly, you should be able to run UpdateSeuratObject on any of the v3 objects there. You can find some documentation on the object structure here.

Thanks! After digging around, I think I understand the Seurat classes a bit better now---there's a version field we can access.

Here is version 2: https://github.com/satijalab/seurat/blob/65b77a9480281ef9ab1aa8816f7c781752092c18/R/seurat.R#L8-L71

seurat <- setClass(
  "seurat",
  slots = c(
    raw.data = "ANY",
    data = "ANY",
    scale.data = "ANY",
    var.genes = "vector",
    is.expr = "numeric",
    ident = "factor",
    meta.data = "data.frame",
    project.name = "character",
    dr = "list",
    assay = "list",
    hvg.info = "data.frame",
    imputed = "data.frame",
    cell.names = "vector",
    cluster.tree = "list",
    snn = "dgCMatrix",
    calc.params = "list",
    kmeans = "ANY",
    spatial = "ANY",
    misc = "ANY",
    version = "ANY"
  )
)

The changelog details the changes introduced with v3: https://satijalab.org/seurat/news/index.html#seurat-3-0-0-2019-04-16-2019-04-15 (We are still looking for details of the v3 -> v4 changes.)

Here's the v3 class: https://github.com/satijalab/seurat/blob/25b830b0dd6f12538516f0faf9a3b3ddfd0ce6d8/R/objects.R#L237

Seurat <- setClass(
  Class = 'Seurat',
  slots = c(
    assays = 'list',
    meta.data = 'data.frame',
    active.assay = 'character',
    active.ident = 'factor',
    graphs = 'list',
    neighbors = 'list',
    reductions = 'list',
    project.name = 'character',
    misc = 'list',
    version = 'package_version',
    commands = 'list',
    tools = 'list'
  )
)
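
Based on the two class definitions above, one way to tell the object generations apart is to look at the class name ("seurat" vs "Seurat") and, for the newer class, the version slot. The following is a minimal sketch under stated assumptions, not a Seurat function: `seurat_generation` is a hypothetical helper, and the two `setClass` calls define toy stand-ins that carry only a `version` slot so the example runs without Seurat installed. It assumes the `version` slot holds the Seurat version the object was built or last updated with.

```r
library(methods)

# Hypothetical helper (not part of Seurat): guess the object generation from
# the class name and, for "Seurat" objects, the major version in the
# 'version' slot.
seurat_generation <- function(obj) {
  cls <- as.character(class(obj))[1]
  if (identical(cls, "seurat")) {
    # The lowercase class name is the v2 definition quoted above
    return("v2")
  }
  if (identical(cls, "Seurat") && .hasSlot(obj, "version")) {
    # 'version' is a package_version; take its first dotted component
    major <- strsplit(as.character(slot(obj, "version")), ".", fixed = TRUE)[[1]][1]
    return(paste0("v", major))
  }
  NA_character_
}

# Toy stand-ins with only a 'version' slot (assumption: for illustration only).
# Do not run this part in a session where Seurat is attached.
setOldClass(c("package_version", "numeric_version"))
setClass("seurat", slots = c(version = "ANY"))
setClass("Seurat", slots = c(version = "package_version"))

old_obj <- new("seurat", version = "2.3.4")
new_obj <- new("Seurat", version = package_version("4.0.0"))

seurat_generation(old_obj)  # "v2"
seurat_generation(new_obj)  # "v4"
seurat_generation(list())   # NA: not a Seurat object
```

With a real object, the same helper would be called directly on it, for example on `ifnb` after `data("ifnb")`.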

> You can find some documentation on the object structure here.

Ah, so this page refers to v3 and v4: https://github.com/satijalab/seurat/wiki/Seurat#slots

That clarifies things, thank you

RE: UpdateSeuratObject()

I guess one could argue that this function should be run on all *.rds "Seurat" objects given...
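
As a rough illustration of that idea, a loader could apply the update step whenever an .rds file turns out to hold a Seurat object. This is only a sketch: `read_updated` and the file name in the usage comment are hypothetical, and the default updater assumes Seurat is installed so that `Seurat::UpdateSeuratObject()` is available.

```r
# Hypothetical helper (not part of Seurat or SeuratData): read an .rds file
# and, if it contains a "seurat" (v2) or "Seurat" (v3+) object, pass it
# through an updater. The default updater calls Seurat::UpdateSeuratObject(),
# which is only evaluated when a Seurat object is actually found.
read_updated <- function(path,
                         updater = function(x) Seurat::UpdateSeuratObject(object = x)) {
  obj <- readRDS(path)
  cls <- as.character(class(obj))[1]
  if (cls %in% c("seurat", "Seurat")) {
    obj <- updater(obj)
  }
  obj
}

# Usage (the file name is a placeholder):
# obj <- read_updated("my_object.rds")
```

Passing the updater as an argument keeps the class check separate from the update itself, so the helper can be exercised with a stub function when Seurat is not available.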

> As far as the seurat-wrappers example, can you clarify the cases where it works and doesn't work? It looks like that vignette was last built July 2019 (probably corresponding to conos v.1.2.0) so it's possible that updates to either conos or Seurat could have affected it since. However, here's a reprex that seems to work for me using the latest Seurat v4.

This also clarifies some internal confusion on our end. You're using the latest version of conos and Seurat v4. I'll track down precisely what is going on and update kharchenkolab/conos#101

I appreciate the help here! Best, Evan

andrewwbutler commented 3 years ago

No problem. Happy to help out if there's anything else we can clarify or needs to be updated on the seurat-wrappers end to better integrate with the latest conos.