alexyermanos / Platypus

R package for the analysis of single-cell immune repertoires
GNU General Public License v3.0

Data loading via VDJ_GEX_matrix() #10

Closed by xiachenrui 2 years ago

xiachenrui commented 2 years ago

Hi, I ran into a problem when loading my data with VDJ_GEX_matrix(). Could you please help me? Thanks a lot!

##############################################
-----------------code-----------------------
##############################################
library(Platypus)
library(Seurat)
library(tidyverse)
library(utils)
library(scales)
library(pals)
library(ggrepel)
library(igraph)

VDJ.out.directory.list_B <- list()
VDJ.out.directory.list_B[[1]] <- c("/data/vdj_bcr_1/outs/") # sample1
VDJ.out.directory.list_B[[2]] <- c("/data/vdj_bcr_2/outs/") # sample2

GEX.out.directory.list <- list()
GEX.out.directory.list[[1]] <- c("/data/count_1/outs/") # sample1
GEX.out.directory.list[[2]] <- c("/data/count_2/outs") # sample2

vgm_b <- VDJ_GEX_matrix(VDJ.out.directory.list = VDJ.out.directory.list_B,
                        GEX.out.directory.list = GEX.out.directory.list,
                        exclude.on.cell.state.markers = c("CD3G+", "CD3E+", "CD4+", "CD8A+","CD8B1+","C1QA+", "C1QB+","RORA+","NKG7+"),
                        # Strict filtering to achieve best possible clustering of B cells
                        group.id = c("sample1","sample2"))
##############################################
-------------error----------------------
##############################################
Loading in data
2022-02-18 07:49:04
Loaded VDJ data
2022-02-18 07:49:07
Setting GEX directory to provided path/filtered_feature_bc_matrix
Loaded GEX data
[1] "2022-02-18 07:49:53 UTC"
Getting VDJ GEX stats
Starting with 1 of 2...
Getting lookup tables...
Starting with 2 of 2...
Getting lookup tables...
Getting 10x stats
Done with VDJ_GEX_stats
Got VDJ GEX stats
[1] "2022-02-18 07:49:56 UTC"
For sample 1: 14247 cell assigned barcodes in GEX, 2160 cell assigned high confidence barcodes in VDJ. Overlap: 2061
For sample 2: 11131 cell assigned barcodes in GEX, 1710 cell assigned high confidence barcodes in VDJ. Overlap: 1678
Removed a total of 217 cells with non unique barcodes in GEX
Removed a total of 3 cells with non unique barcodes in VDJ
In GEX sample 1 excluded 5039 cells based on CD3G>0
.......
In GEX sample 2 failed to exclude cells based on: NKG7>0 Please check gene spelling
Starting VDJ barcode iteration 1 of 2...
[1] "2022-02-18 07:50:11 UTC"
Done with 1 of 2
2022-02-18 07:50:31
Starting VDJ barcode iteration 2 of 2...
[1] "2022-02-18 07:50:31 UTC"
Done with 2 of 2
2022-02-18 07:50:44
Starting GEX pipeline
Integrating GEX matrices using the default scale.data function. Other options are 'sct', 'anchors' (recommended in case of batch effects) and 'harmony' (recommended for large datasets)
Error in names(new.idents) <- rownames(x = combined.meta.data) :
  'names' attribute [4] must be the same length as the vector [2]

##############################################
-------------traceback()----------------------
##############################################
4: merge.Seurat(GEX.merged, y = GEX.list[[i]], add.cell.ids = c("", ""))
3: merge(GEX.merged, y = GEX.list[[i]], add.cell.ids = c("", ""))
2: GEX_automate_single(GEX.list = gex.list, GEX.integrate = GEX.integrate,
       integration.method = integration.method, VDJ.gene.filter = VDJ.gene.filter,
       mito.filter = mito.filter, norm.scale.factor = norm.scale.factor,
       n.feature.rna = n.feature.rna, n.count.rna.min = n.count.rna.min,
       n.count.rna.max = n.count.rna.max, n.variable.features = n.variable.features,
       cluster.resolution = cluster.resolution, neighbor.dim = neighbor.dim,
       mds.dim = mds.dim, group.id = group.id, verbose = verbose)
1: VDJ_GEX_matrix(VDJ.out.directory.list = VDJ.out.directory.list_B,
       GEX.out.directory.list = GEX.out.directory.list,
       exclude.on.cell.state.markers = c("CD3G+", "CD3E+", "CD4+", "CD8A+", "CD8B1+", "C1QA+", "C1QB+", "RORA+", "NKG7+"),
       group.id = c("sample1","sample2"))
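
For context, the wording of the error above is base R's generic message for assigning more names than a vector has elements. The sketch below reproduces that wording outside Seurat, using only base R. It illustrates the kind of length mismatch involved and is not a diagnosis of what happens inside merge.Seurat; the vector `x` and the names used are made up for illustration.

```r
# Base R only: try to assign four names to a vector of length two.
# `x` and the names are illustrative values, not taken from the real data.
x <- c(10, 20)

msg <- tryCatch(
  {
    names(x) <- c("a", "b", "c", "d")
    "no error"
  },
  # Capture the error text instead of stopping the script.
  error = function(e) conditionMessage(e)
)

print(msg)
```

In the traceback, the failing assignment is `names(new.idents) <- rownames(x = combined.meta.data)`, so the analogous mismatch would be between the length of `new.idents` and the number of rows of the combined metadata.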

vickreiner commented 2 years ago

Hi @xiachenrui, thanks for letting us know! I was not able to reproduce the error. Given that the datasets you are using are quite large, have you checked whether this works with smaller datasets? If it does, this may be a memory issue. Otherwise, which version of Seurat are you running? Updating to the newest available version might fix this. Please let me know if you still run into this issue. Thanks!
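
A minimal sketch of the version check suggested above, using only base R. `is_older_than` is a hypothetical helper, not part of Platypus or Seurat, and both version strings below are placeholder values for illustration; the threshold is not a documented requirement of Platypus.

```r
# Hypothetical helper (not part of Platypus or Seurat): TRUE if the installed
# version string is strictly older than the chosen minimum.
is_older_than <- function(installed, minimum) {
  package_version(installed) < package_version(minimum)
}

# In a live session the installed version would come from:
#   installed <- as.character(utils::packageVersion("Seurat"))
# A literal is used here so the sketch runs without Seurat installed.
installed <- "4.0.0"

# Placeholder threshold for illustration only; not a documented requirement.
placeholder_minimum <- "4.1.0"

needs_update <- is_older_than(installed, placeholder_minimum)
print(needs_update)
```

If the installed version turns out to be older than the one you want, `install.packages("Seurat")` fetches the current CRAN release.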

xiachenrui commented 2 years ago

Thanks for your kind reply! I don't think this is caused by a memory shortage, because I have at least 200 GB of free memory. I am using Seurat 4.0.0. Is that compatible with Platypus 3.3.2?


R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods
[10] base

other attached packages:
[1] Platypus_3.3.2 scales_1.1.1 forcats_0.5.0
[4] stringr_1.4.0 purrr_0.3.4 tidyr_1.1.2
[7] tibble_3.0.4 tidyverse_1.3.0 DropletUtils_1.10.3
[10] SeuratObject_4.0.0 Seurat_4.0.0 rhdf5_2.34.0
[13] MAST_1.16.0 scater_1.18.3 SingleCellExperiment_1.12.0
[16] SummarizedExperiment_1.20.0 Biobase_2.50.0 GenomicRanges_1.42.0
[19] GenomeInfoDb_1.26.1 IRanges_2.24.0 S4Vectors_0.28.0
[22] BiocGenerics_0.36.0 MatrixGenerics_1.2.0 matrixStats_0.61.0
[25] SC3_1.18.0 hdf5r_1.3.3 umap_0.2.7.0
[28] shinyFiles_0.9.1 webshot_0.5.2 readr_1.4.0
[31] pheatmap_1.0.12 shinycssloaders_1.0.0 RCircos_1.2.2
[34] VennDiagram_1.6.20 futile.logger_1.4.3 DT_0.16
[37] shinyjs_2.1.0 plyr_1.8.6 plotly_4.9.2.1
[40] immunarch_0.6.6 patchwork_1.1.0 data.table_1.14.2
[43] dtplyr_1.0.1 dplyr_1.0.2 ggplot2_3.3.5
[46] shiny_1.5.0

loaded via a namespace (and not attached):
[1] scattermore_0.7 prabclus_2.3-2 R.methodsS3_1.8.1
[4] bit64_4.0.5 R.utils_2.10.1 irlba_2.3.5
[7] DelayedArray_0.16.0 rpart_4.1-15 RCurl_1.98-1.2
[10] doParallel_1.0.16 generics_0.1.0 cowplot_1.1.0
[13] lambda.r_1.2.4 RANN_2.6.1 future_1.20.1
[16] bit_4.0.4 xml2_1.3.2 lubridate_1.7.9
[19] spatstat.data_2.1-0 httpuv_1.5.4 assertthat_0.2.1
[22] viridis_0.6.1 xfun_0.19 hms_0.5.3
[25] promises_1.1.1 DEoptimR_1.0-8 fansi_0.4.1
[28] dbplyr_2.0.0 readxl_1.3.1 DBI_1.1.0
[31] igraph_1.2.10 htmlwidgets_1.5.2 ellipsis_0.3.1
[34] RSpectra_0.16-0 ggpubr_0.4.0 backports_1.2.0
[37] deldir_0.2-3 sparseMatrixStats_1.2.0 vctrs_0.3.5
[40] ggalluvial_0.12.3 ROCR_1.0-11 abind_1.4-5
[43] withr_2.3.0 robustbase_0.93-7 sctransform_0.3.2
[46] mclust_5.4.7 goftest_1.2-2 cluster_2.1.0
[49] lazyeval_0.2.2 crayon_1.3.4 edgeR_3.32.0
[52] pkgconfig_2.0.3 nlme_3.1-149 vipor_0.4.5
[55] nnet_7.3-14 rlang_0.4.12 globals_0.14.0
[58] diptest_0.75-7 lifecycle_0.2.0 miniUI_0.1.1.1
[61] modelr_0.1.8 rsvd_1.0.3 cellranger_1.1.0
[64] polyclip_1.10-0 lmtest_0.9-38 rngtools_1.5
[67] Matrix_1.3-2 ggseqlogo_0.1 carData_3.0-4
[70] Rhdf5lib_1.12.0 zoo_1.8-8 reprex_0.3.0
[73] beeswarm_0.2.3 ggridges_0.5.2 GlobalOptions_0.1.2
[76] png_0.1-7 viridisLite_0.4.0 bitops_1.0-6
[79] R.oo_1.24.0 KernSmooth_2.23-17 rhdf5filters_1.2.0
[82] DelayedMatrixStats_1.12.1 doRNG_1.8.2 shape_1.4.5
[85] parallelly_1.21.0 rstatix_0.6.0 ggsignif_0.6.0
[88] beachmat_2.6.2 magrittr_2.0.1 ica_1.0-2
[91] zlibbioc_1.36.0 compiler_4.0.3 dqrng_0.2.1
[94] tinytex_0.27 factoextra_1.0.7 RColorBrewer_1.1-2
[97] rrcov_1.6-2 fitdistrplus_1.1-3 cli_2.2.0
[100] XVector_0.30.0 listenv_0.8.0 pbapply_1.4-3
[103] formatR_1.7 mgcv_1.8-33 MASS_7.3-54
[106] tidyselect_1.1.0 stringi_1.5.3 locfit_1.5-9.4
[109] BiocSingular_1.6.0 askpass_1.1 ggrepel_0.8.2
[112] tools_4.0.3 future.apply_1.6.0 rio_0.5.16
[115] circlize_0.4.12 rstudioapi_0.13 foreach_1.5.1
[118] foreign_0.8-80 gridExtra_2.3 Rtsne_0.15
[121] digest_0.6.27 fpc_2.2-9 Rcpp_1.0.7
[124] car_3.0-10 broom_0.7.2 scuttle_1.0.3
[127] later_1.1.0.1 RcppAnnoy_0.0.18 WriteXLS_6.3.0
[130] httr_1.4.2 kernlab_0.9-29 colorspace_2.0-0
[133] rvest_0.3.6 tensor_1.5 fs_1.5.0
[136] reticulate_1.18 splines_4.0.3 uwot_0.1.9
[139] spatstat.utils_2.1-0 shinythemes_1.2.0 flexmix_2.3-17
[142] xtable_1.8-4 jsonlite_1.7.1 futile.options_1.0.1
[145] spatstat_1.64-1 UpSetR_1.4.0 modeltools_0.2-23
[148] R6_2.5.0 pillar_1.4.7 htmltools_0.5.1.1
[151] mime_0.9 glue_1.4.2 fastmap_1.0.1
[154] BiocParallel_1.24.1 BiocNeighbors_1.8.1 class_7.3-17
[157] codetools_0.2-16 pcaPP_1.9-74 mvtnorm_1.1-1
[160] lattice_0.20-41 curl_4.3 ggbeeswarm_0.6.0
[163] leiden_0.3.5 zip_2.1.1 openxlsx_4.2.3
[166] openssl_1.4.3 limma_3.46.0 survival_3.2-7
[169] munsell_0.5.0 e1071_1.7-4 GenomeInfoDbData_1.2.4
[172] iterators_1.0.13 HDF5Array_1.18.0 haven_2.3.1
[175] reshape2_1.4.4 gtable_0.3.0

xiachenrui commented 2 years ago

I plan to test on another server and will share my progress.

vickreiner commented 2 years ago

Thanks for the sessionInfo. I can't see any faults or outdated packages related to this function there, so this should not be the issue. Thanks for testing it on another server.