dynverse / dynwrap

A common data format and inference environment for single-cell trajectories 📦
https://dynverse.org

wrap_expression() reverses feature and cell IDs from Seurat objects #162

Open celinech opened 3 years ago

celinech commented 3 years ago

Hi devs!!

I am running into a problem where the feature_ids and cell_ids are reversed when the data come from a Seurat object. I ran the first lines to prepare the data as explained in the Quick start of dynverse.

SO <- readRDS("my_file.rds")

dataset <- wrap_expression(
    counts = SO@assays[["RNA"]]@counts,
    expression = SO@assays[["RNA"]]@data
)

guidelines <- guidelines_shiny(dataset)

I tried to use the function provided in issue dynverse/dynwrap #150. Unfortunately, the counts and data slots are stored as sparse dgCMatrix objects rather than base matrices, so the transpose function did not work for me.

> str(SO@assays[["RNA"]]@counts)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:113749606] 2 3 4 5 9 13 17 22 24 27 ...
  ..@ p       : int [1:21886] 0 6676 11901 17770 24860 31899 37467 43675 49359 52809 ...
  ..@ Dim     : int [1:2] 23913 21885
  ..@ Dimnames:List of 2
  .. ..$ : chr [1:23913] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
  .. ..$ : chr [1:21885] "GAS_Day4_batch4_AAACCCATCGACTCCT" "GAS_Day4_batch4_AAACGAAGTCATAACC" "GAS_Day4_batch4_AAACGCTCAAGCGAGT" "GAS_Day4_batch4_AAACGCTCACCCTAAA" ...
  ..@ x       : num [1:113749606] 5 9 1 1 2 2 1 2 7 7 ...
  ..@ factors : list()

In summary, my dataset has 23,913 features (genes) and 21,885 cells.

For comparison, after calling wrap_expression() I get a dataset with 23,913 cells and 21,885 features,

as shown in dynguidelines when I run

> guidelines <- guidelines_shiny(dataset)

Screenshot from 2021-10-22 14-22-12

 > str(dataset)
List of 8
 $ id               : chr "20211022_112041__data_wrapper__rrmsZuUNII"
 $ cell_ids         : chr [1:23913] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
 $ cell_info        : tbl_df [23,913 × 1] (S3: tbl_df/tbl/data.frame)
  ..$ cell_id: chr [1:23913] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
 $ feature_ids      : chr [1:21885] "GAS_Day4_batch4_AAACCCATCGACTCCT" "GAS_Day4_batch4_AAACGAAGTCATAACC" "GAS_Day4_batch4_AAACGCTCAAGCGAGT" "GAS_Day4_batch4_AAACGCTCACCCTAAA" ...
 $ feature_info     : tbl_df [21,885 × 1] (S3: tbl_df/tbl/data.frame)
  ..$ feature_id: chr [1:21885] "GAS_Day4_batch4_AAACCCATCGACTCCT" "GAS_Day4_batch4_AAACGAAGTCATAACC" "GAS_Day4_batch4_AAACGCTCAAGCGAGT" "GAS_Day4_batch4_AAACGCTCACCCTAAA" ...
 $ counts           :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. ..@ i       : int [1:113749606] 2 3 4 5 9 13 17 22 24 27 ...
  .. ..@ p       : int [1:21886] 0 6676 11901 17770 24860 31899 37467 43675 49359 52809 ...
  .. ..@ Dim     : int [1:2] 23913 21885
  .. ..@ Dimnames:List of 2
  .. .. ..$ : chr [1:23913] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
  .. .. ..$ : chr [1:21885] "GAS_Day4_batch4_AAACCCATCGACTCCT" "GAS_Day4_batch4_AAACGAAGTCATAACC" "GAS_Day4_batch4_AAACGCTCAAGCGAGT" "GAS_Day4_batch4_AAACGCTCACCCTAAA" ...
  .. ..@ x       : num [1:113749606] 5 9 1 1 2 2 1 2 7 7 ...
  .. ..@ factors : list()
 $ expression       :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. ..@ i       : int [1:113749606] 2 3 4 5 9 13 17 22 24 27 ...
  .. ..@ p       : int [1:21886] 0 6676 11901 17770 24860 31899 37467 43675 49359 52809 ...
  .. ..@ Dim     : int [1:2] 23913 21885
  .. ..@ Dimnames:List of 2
  .. .. ..$ : chr [1:23913] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
  .. .. ..$ : chr [1:21885] "GAS_Day4_batch4_AAACCCATCGACTCCT" "GAS_Day4_batch4_AAACGAAGTCATAACC" "GAS_Day4_batch4_AAACGCTCAAGCGAGT" "GAS_Day4_batch4_AAACGCTCACCCTAAA" ...
  .. ..@ x       : num [1:113749606] 0.663 0.99 0.172 0.172 0.319 ...
  .. ..@ factors : list()
 $ expression_future: NULL
 - attr(*, "class")= chr [1:3] "dynwrap::with_expression" "dynwrap::data_wrapper" "list"

I wonder whether the trajectory inference methods will read this information correctly in the next step, the infer_trajectory() function. Does anyone know how to extract the data properly from a Seurat object?
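One way to check the orientation before going further is to compare the row names of the matrix with the known cell IDs. The str() output above shows gene names in the rows, while wrap_expression() treated the rows as cells. The sketch below uses a toy matrix built with the Matrix package; `rows_are_cells` is a hypothetical helper written for illustration, not part of dynwrap, and the cell and gene names are made up.

```r
library(Matrix)

# Toy stand-in for SO@assays[["RNA"]]@counts: 3 genes (rows) x 2 cells (columns),
# the same orientation as in the str() output above.
counts <- Matrix(
  c(5, 0, 1,
    0, 9, 0),
  nrow = 3, ncol = 2, sparse = TRUE,
  dimnames = list(c("Xkr4", "Rp1", "Sox17"), c("cell_A", "cell_B"))
)

# Hypothetical helper (not part of dynwrap): TRUE when every row name
# is one of the known cell IDs, i.e. the rows are cells.
rows_are_cells <- function(mat, known_cell_ids) {
  all(rownames(mat) %in% known_cell_ids)
}

# Assumption: with a real Seurat object these IDs would come from the object
# itself, for example its column names.
known_cell_ids <- c("cell_A", "cell_B")

rows_are_cells(counts, known_cell_ids)     # rows are genes in this orientation
rows_are_cells(t(counts), known_cell_ids)  # rows are cells after transposing
```

If the first check fails and the second passes, the matrix most likely needs to be transposed before it is wrapped.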

Thanks, Céline

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /opt/anaconda3/envs/rossi/lib/libblas.so.3.8.0
LAPACK: /opt/anaconda3/envs/rossi/lib/liblapack.so.3.8.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] shiny_1.6.0         forcats_0.5.1       stringr_1.4.0       dplyr_1.0.7         purrr_0.3.3        
 [6] readr_1.4.0         tidyr_1.1.3         tibble_3.0.3        ggplot2_3.2.1       tidyverse_1.2.1    
[11] dyno_0.1.2          dynwrap_1.2.2       dynplot_1.1.0       dynmethods_1.0.5    dynguidelines_1.0.1
[16] dynfeature_1.0.0   

loaded via a namespace (and not attached):
  [1] readxl_1.3.1       backports_1.2.1    dyndimred_1.0.4    babelwhale_1.0.3   plyr_1.8.6        
  [6] igraph_1.2.6       lazyeval_0.2.2     sp_1.4-5           proxyC_0.2.1       splines_3.6.1     
 [11] listenv_0.8.0      digest_0.6.27      foreach_1.5.1      htmltools_0.5.1.1  viridis_0.5.1     
 [16] fansi_0.4.2        magrittr_2.0.1     carrier_0.1.0      cluster_2.1.0      ROCR_1.0-11       
 [21] remotes_2.3.0      globals_0.14.0     graphlayouts_0.7.1 modelr_0.1.8       RcppParallel_5.1.4
 [26] R.utils_2.10.1     dynutils_1.0.9     colorspace_2.0-2   rvest_1.0.0        ggrepel_0.8.1     
 [31] rbibutils_2.1.1    haven_2.4.1        crayon_1.4.1       jsonlite_1.7.2     hexbin_1.28.1     
 [36] zoo_1.8-9          survival_3.2-11    iterators_1.0.13   ape_5.5            glue_1.4.2        
 [41] polyclip_1.10-0    gtable_0.3.0       leiden_0.3.8       future.apply_1.7.0 dynparam_1.0.2    
 [46] scales_1.1.1       DBI_1.1.1          Rcpp_1.0.7         metap_1.1          viridisLite_0.3.0 
 [51] xtable_1.8-4       reticulate_1.20    rsvd_1.0.3         akima_0.6-2.2      SDMTools_1.1-221.2
 [56] tsne_0.1-3         htmlwidgets_1.5.3  httr_1.4.2         RColorBrewer_1.1-2 ellipsis_0.3.2    
 [61] Seurat_3.1.1       ica_1.0-2          pkgconfig_2.0.3    R.methodsS3_1.8.1  farver_2.1.0      
 [66] sass_0.4.0         uwot_0.1.5         utf8_1.2.1         tidyselect_1.1.1   labeling_0.4.2    
 [71] rlang_0.4.11       reshape2_1.4.4     later_1.2.0        cachem_1.0.5       munsell_0.5.0     
 [76] cellranger_1.1.0   tools_3.6.1        cli_2.5.0          generics_0.1.0     ranger_0.13.1     
 [81] broom_0.7.6        ggridges_0.5.3     fastmap_1.1.0      yaml_2.2.1         processx_3.5.2    
 [86] fitdistrplus_1.1-3 tidygraph_1.2.0    lmds_0.1.0         RANN_2.6.1         ggraph_2.0.5      
 [91] pbapply_1.4-3      future_1.21.0      nlme_3.1-150       mime_0.10          GA_3.2.2          
 [96] R.oo_1.24.0        xml2_1.3.2         compiler_3.6.1     rstudioapi_0.13    plotly_4.9.1      
[101] png_0.1-7          testthat_3.0.2     tweenr_1.0.2       bslib_0.2.5.1      stringi_1.4.6     
[106] ps_1.6.0           desc_1.3.0         lattice_0.20-44    Matrix_1.2-18      shinyjs_2.0.0     
[111] vctrs_0.3.8        pillar_1.6.1       lifecycle_1.0.0    jquerylib_0.1.4    Rdpack_2.1.1      
[116] lmtest_0.9-38      RcppAnnoy_0.0.18   data.table_1.14.0  cowplot_1.1.1      irlba_2.3.3       
[121] httpuv_1.6.1       patchwork_1.1.1    R6_2.5.0           promises_1.2.0.1   KernSmooth_2.23-18
[126] gridExtra_2.3      vipor_0.4.5        parallelly_1.25.0  codetools_0.2-18   MASS_7.3-54       
[131] assertthat_0.2.1   rprojroot_2.0.2    shinyWidgets_0.6.2 withr_2.4.2        sctransform_0.2.0 
[136] parallel_3.6.1     hms_1.1.0          grid_3.6.1         waldo_0.2.5        Rtsne_0.15        
[141] ggforce_0.3.3      lubridate_1.7.10 

celinech commented 3 years ago

Hi!

I got dynwrap to work by converting the sparse matrices to dense matrices and then transposing them to get the right orientation for my data.

SO <- readRDS("my_file.rds")

tSO_counts <- t(as.matrix(SO@assays$RNA@counts))
tSO_expres <- t(as.matrix(SO@assays$RNA@data))

dataset <- wrap_expression(
    counts = tSO_counts,
    expression = tSO_expres
)

guidelines <- guidelines_shiny(dataset)

Screenshot from 2021-10-26 11-18-58

I am looking forward to a better solution. Thank you for the easy-to-use package!
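A possible alternative, offered as a sketch rather than a maintainer-confirmed fix: the Matrix package provides a t() method for sparse matrices, so the dense as.matrix() step can probably be skipped, which should save a lot of memory on a 23,913 × 21,885 matrix. The toy matrix below stands in for the Seurat assay slots, and its names and values are made up. The commented wrap_expression() call at the end is an untested assumption about how this would apply to the real object.

```r
library(Matrix)

# Toy stand-in for a Seurat assay slot: features (rows) x cells (columns).
counts <- sparseMatrix(
  i = c(1, 3, 2), j = c(1, 1, 2), x = c(5, 1, 9),
  dims = c(3, 2),
  dimnames = list(c("Xkr4", "Rp1", "Sox17"), c("cell_A", "cell_B"))
)

# Matrix::t() works directly on the sparse matrix; no as.matrix() is needed.
t_counts <- Matrix::t(counts)

class(t_counts)     # still a sparse Matrix class
dim(t_counts)       # now cells x features
rownames(t_counts)  # cell IDs

# With the real object the call might then look like this (not run here):
# dataset <- wrap_expression(
#     counts = Matrix::t(SO@assays$RNA@counts),
#     expression = Matrix::t(SO@assays$RNA@data)
# )
```

Calling Matrix::t() with the namespace prefix avoids relying on the Matrix package being attached, which may be why a plain t() call failed earlier.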