tanaylab / metacell

Metacell - Single-cell mRNA Analysis
https://tanaylab.github.io/metacell

mcell_mat_rpt_cor_anchors error #30

Closed NordinZandhuis closed 4 years ago

NordinZandhuis commented 4 years ago

Hi,

Thanks for developing this package and for providing the vignettes. They are really helpful.

Unfortunately, I ran into an error while running the 'supervised filtering of feature genes' vignette. When I try to generate the gene-gene correlation matrix, I get the following error:

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : invalid character indexing

I ran the other vignette prior to running this one, as described in the tutorial. I also noticed that when I replace the gene_anchors with other genes (like LAG3 or CD8B), the call works.

Could you help me figure out what causes this error?

Much appreciated,

Nordin

SeuratNewbie commented 4 years ago

Hi, I'm running into the same error......

amostanay commented 4 years ago

Unclear what exactly you were trying to run - perhaps you are using the gene names in the vignette (those cell cycle genes), but these are not in your umi matrix?

mat = scdb_mat("my_mat")
# anchor genes from the vignette that are missing from your matrix
setdiff(c('MKI67', 'PCNA', 'TOP2A', 'TXN', 'HSP90AB1', 'FOS'), rownames(mat@mat))

If yes - just replace the gene list with genes that you have in the data. We'll patch the function to protect against non-existing gene names.

Amos
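
The guard described above could look roughly like the sketch below. It is not the patch that went into metacell: `check_anchor_genes` is a hypothetical helper, and a small sparse matrix built with the Matrix package stands in for `mat@mat`.

```r
library(Matrix)

# Hypothetical helper (not part of metacell): keep only the anchors that
# exist as row names, warn about the rest, and stop if none are left.
check_anchor_genes <- function(umis, gene_anchors) {
  present <- intersect(gene_anchors, rownames(umis))
  missing <- setdiff(gene_anchors, rownames(umis))
  if (length(missing) > 0) {
    warning("anchor genes not in matrix: ", paste(missing, collapse = ", "))
  }
  if (length(present) == 0) {
    stop("none of the anchor genes are present in the matrix")
  }
  present
}

# Toy stand-in for mat@mat: 4 genes x 3 cells (values are made up)
umis <- Matrix(c(1, 0, 2,
                 0, 3, 0,
                 4, 0, 0,
                 0, 0, 5),
               nrow = 4, ncol = 3, byrow = TRUE, sparse = TRUE,
               dimnames = list(c("PCNA", "TXN", "FOS", "LAG3"),
                               c("cell1", "cell2", "cell3")))

genes_anchors <- c("MKI67", "PCNA", "TOP2A", "TXN", "HSP90AB1", "FOS")
usable <- suppressWarnings(check_anchor_genes(umis, genes_anchors))
print(usable)
```

The returned vector could then be passed as `gene_anchors` to `mcell_mat_rpt_cor_anchors` in place of the full list.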

SeuratNewbie commented 4 years ago

Thank you for your reply.

My system: MacBook Pro running Catalina 10.15.5 with 32 GB 2400 MHz

Running the vignette from: https://tanaylab.github.io/metacell/articles/a-basic_pbmc8k.html using data embedded in the supplied script (NOT my data as yet) as below:

library("metacell")
if(!dir.exists("testdb")) dir.create("testdb/")
scdb_init("testdb/", force_reinit=T)
mcell_import_scmat_10x("test", base_dir="http://www.wisdom.weizmann.ac.il/~atanay/metac_data/pbmc_8k/")
mat = scdb_mat("test")
print(dim(mat@mat))
if(!dir.exists("figs")) dir.create("figs/")
scfigs_init("figs/")
mcell_plot_umis_per_cell("test")
mat = scdb_mat("test")
nms = c(rownames(mat@mat), rownames(mat@ignore_gmat))
ig_genes = c(grep("^IGJ", nms, v=T), grep("^IGH",nms,v=T), grep("^IGK", nms, v=T), grep("^IGL", nms, v=T))
bad_genes = unique(c(grep("^MT-", nms, v=T), grep("^MTMR", nms, v=T), grep("^MTND", nms, v=T),"NEAT1","TMSB4X", "TMSB10", ig_genes))
bad_genes
mcell_mat_ignore_genes(new_mat_id="test", mat_id="test", bad_genes, reverse=F)
mcell_add_gene_stat(gstat_id="test", mat_id="test", force=T)
mcell_gset_filter_varmean(gset_id="test_feats", gstat_id="test", T_vm=0.08, force_new=T)
mcell_gset_filter_cov(gset_id = "test_feats", gstat_id="test", T_tot=100, T_top3=2)
mcell_plot_gstats(gstat_id="test", gset_id="test_feats")
mcell_add_cgraph_from_mat_bknn(mat_id="test", gset_id = "test_feats", graph_id="test_graph", K=100, dsamp=T)
mcell_coclust_from_graph_resamp( coc_id="test_coc500", graph_id="test_graph", min_mc_size=20, p_resamp=0.75, n_resamp=500)
mcell_mc_from_coclust_balanced( coc_id="test_coc500", mat_id= "test", mc_id= "test_mc", K=30, min_mc_size=30, alpha=2)
mcell_plot_outlier_heatmap(mc_id="test_mc", mat_id = "test", T_lfc=3)
mcell_mc_split_filt(new_mc_id="test_mc_f", mc_id="test_mc", mat_id="test", T_lfc=3, plot_mats=F)
mcell_gset_from_mc_markers(gset_id="test_markers", mc_id="test_mc_f")
marks_colors = read.table(system.file("extdata", "pbmc_mc_colorize.txt", package="metacell"), sep="\t", h=T, stringsAsFactors=F)
mc_colorize("test_mc_f", marker_colors=marks_colors)
mc = scdb_mc("test_mc_f")
table(mc@colors)
mcell_mc_plot_marks(mc_id="test_mc_f", gset_id="test_markers", mat_id="test")
lfp = log2(mc@mc_fp)
tail(sort(lfp["CD8A",]))
mcell_mc2d_force_knn(mc2d_id="test_2dproj",mc_id="test_mc_f", graph_id="test_graph")
tgconfig::set_param("mcell_mc2d_height",1000, "metacell")
tgconfig::set_param("mcell_mc2d_width",1000, "metacell")
mcell_mc2d_plot(mc2d_id="test_2dproj")
mc_hc = mcell_mc_hclust_confu(mc_id="test_mc_f", graph_id="test_graph")
mc_sup = mcell_mc_hierarchy(mc_id="test_mc_f", mc_hc=mc_hc, T_gap=0.04)
mcell_mc_plot_hierarchy(mc_id="test_mc_f", graph_id="test_graph", mc_order=mc_hc$order, sup_mc = mc_sup, width=2800, heigh=2000, min_nmc=2)

library("metacell")
scdb_init("testdb/", force_reinit=T)
figs_dir = "figs"
scfigs_init(figs_dir)
genes_anchors = c('MKI67', 'PCNA', 'TOP2A', 'TXN', 'HSP90AB1', 'FOS')
tab_fn = paste(figs_dir, "lateral_gmods.txt", sep="/")
gset_nm = "lateral"
mcell_mat_rpt_cor_anchors(mat_id="test", gene_anchors = genes_anchors, cor_thresh = 0.1, gene_anti = c(), tab_fn = tab_fn, sz_cor_thresh = 0.2)

At this point, I get the following error:

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : invalid character indexing

I originally ran this code using my own data, and when I got this error, I reran it using the exact (copy & paste) script from the vignette. I am not an experienced R user, so I thought that I was missing something. The same error shows up with the vignette script, so I thought I'd ask for assistance. Thank you in advance!

aviezerl commented 4 years ago

We tried to run the code above on two systems with similar specs to yours and did not get any error message. Could you please send us your session information?

sessionInfo()

SeuratNewbie commented 4 years ago

Certainly. Perhaps I am missing a package or....? Thank you in advance (again!)!

R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] metacell_0.3.41 monocle3_0.2.1.9 SingleCellExperiment_1.8.0
[4] SummarizedExperiment_1.16.1 DelayedArray_0.12.3 BiocParallel_1.20.1
[7] matrixStats_0.56.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.1
[10] IRanges_2.20.2 S4Vectors_0.24.4 Biobase_2.46.0
[13] BiocGenerics_0.32.0

loaded via a namespace (and not attached):
[1] nlme_3.1-148 tsne_0.1-3 bitops_1.0-6 doMC_1.3.6
[5] RcppAnnoy_0.0.16 RColorBrewer_1.1-2 httr_1.4.1 Rgraphviz_2.30.0
[9] tgutil_0.1.2 sctransform_0.2.1 tools_3.6.3 R6_2.4.1
[13] irlba_2.3.3 tgconfig_0.1.2 KernSmooth_2.23-17 uwot_0.1.8
[17] lazyeval_0.2.2 colorspace_1.4-1 tidyselect_1.1.0 gridExtra_2.3
[21] compiler_3.6.3 cli_2.0.2 graph_1.64.0 plotly_4.9.2.1
[25] entropy_1.2.1 Seurat_3.1.5 scales_1.1.1 lmtest_0.9-37
[29] ggridges_0.5.2 pbapply_1.4-2 stringr_1.4.0 digest_0.6.25
[33] dbscan_1.1-5 XVector_0.26.0 pkgconfig_2.0.3 htmltools_0.4.0
[37] tgstat_2.3.5 pdist_1.2 htmlwidgets_1.5.1 rlang_0.4.6
[41] rstudioapi_0.11 generics_0.0.2 zoo_1.8-8 jsonlite_1.6.1
[45] ica_1.0-2 dplyr_1.0.0 RCurl_1.98-1.2 magrittr_1.5
[49] GenomeInfoDbData_1.2.2 patchwork_1.0.0 Matrix_1.2-18 fansi_0.4.1
[53] Rcpp_1.0.4.11 munsell_0.5.0 viridis_0.5.1 ape_5.3
[57] reticulate_1.16 lifecycle_0.2.0 yaml_2.2.1 stringi_1.4.6
[61] MASS_7.3-51.6 zlibbioc_1.32.0 Rtsne_0.15 plyr_1.8.6
[65] grid_3.6.3 listenv_0.8.0 ggrepel_0.8.2 crayon_1.3.4
[69] lattice_0.20-41 cowplot_1.0.0 splines_3.6.3 pillar_1.4.4
[73] igraph_1.2.5 future.apply_1.5.0 reshape2_1.4.4 codetools_0.2-16
[77] leiden_0.3.3 glue_1.4.1 data.table_1.12.8 foreach_1.5.0
[81] png_0.1-7 vctrs_0.3.0 gtable_0.3.0 RANN_2.6.1
[85] purrr_0.3.4 tidyr_1.1.0 assertthat_0.2.1 future_1.17.0
[89] ggplot2_3.3.1 rsvd_1.0.3 survival_3.1-12 viridisLite_0.3.0
[93] tibble_3.0.1 iterators_1.0.12 cluster_2.1.0 globals_0.12.5
[97] fitdistrplus_1.1-1 ellipsis_0.3.1 ROCR_1.0-11

isabelsilverman commented 4 years ago

Hi, I am running into the same error when trying to run the vignette. Was this issue ever resolved?

tzeitim commented 4 years ago

I used to get the error below when trying to use genes that are not present in the matrix (e.g. MKI67)

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : invalid character indexing

Make sure that the genes you are looking for are indeed in the matrix. For example:

mat = scdb_mat("test")
c('MKI67', 'PCNA', 'TOP2A', 'TXN', 'HSP90AB1', 'FOS') %in% rownames(mat@mat)
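
For readers without the PBMC data at hand, the failure mode can be reproduced on a toy sparse matrix. This is only a sketch, assuming that `mat@mat` is a sparse matrix from the Matrix package (`intI` in the error message is part of Matrix's indexing code). The matrix values and the two-gene anchor list are made up for illustration, and the exact error text may differ between Matrix versions.

```r
library(Matrix)

# Toy sparse matrix standing in for mat@mat (illustrative values only)
umis <- Matrix(c(0, 2, 1, 0, 0, 3), nrow = 3, ncol = 2, sparse = TRUE,
               dimnames = list(c("PCNA", "LAG3", "CD8B"), c("c1", "c2")))

anchors <- c("MKI67", "PCNA")

# Which anchors exist as row names?
found <- anchors %in% rownames(umis)
print(setNames(found, anchors))

# Indexing by a row name that is absent raises an error;
# capture its message instead of aborting the script.
res_missing <- tryCatch(umis[anchors, ], error = function(e) conditionMessage(e))
print(res_missing)

# Indexing only by names that are present works.
res_ok <- umis[anchors[found], , drop = FALSE]
print(dim(res_ok))
```

This would also be consistent with the original report that swapping in genes such as LAG3 or CD8B made the call succeed.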

1156054203 commented 4 years ago

I am running into the same error when trying to run the vignette. I also ran the check c('MKI67', 'PCNA', 'TOP2A', 'TXN', 'HSP90AB1', 'FOS') %in% rownames(mat@mat), and the result is TRUE. My sessionInfo() output follows:

> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.6 (Maipo)

Matrix products: default
BLAS: /sibcb/program/install/r-4.0/lib64/R/lib/libRblas.so
LAPACK: /sibcb/program/install/r-4.0/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] metacell_0.3.41

loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 RColorBrewer_1.1-2 plyr_1.8.6
[4] pillar_1.4.4 compiler_4.0.0 iterators_1.0.12
[7] bitops_1.0-6 lifecycle_0.2.0 tibble_3.0.1
[10] gtable_0.3.0 lattice_0.20-41 pkgconfig_2.0.3
[13] rlang_0.4.6 foreach_1.5.0 Matrix_1.2-18
[16] igraph_1.2.5 graph_1.67.1 Rgraphviz_2.33.0
[19] yaml_2.2.1 parallel_4.0.0 cluster_2.1.0
[22] dplyr_1.0.0 doMC_1.3.6 generics_0.0.2
[25] vctrs_0.3.0 pdist_1.2 stats4_4.0.0
[28] grid_4.0.0 tidyselect_1.1.0 glue_1.4.1
[31] R6_2.4.1 tgconfig_0.1.2 ggplot2_3.3.2
[34] purrr_0.3.4 magrittr_1.5 codetools_0.2-16
[37] scales_1.1.1 ellipsis_0.3.1 BiocGenerics_0.35.4
[40] tgstat_2.3.5 colorspace_1.4-1 entropy_1.2.1
[43] RCurl_1.98-1.2 tgutil_0.1.2 munsell_0.5.0
[46] crayon_1.3.4 dbscan_1.1-5 zoo_1.8-8

aviezerl commented 4 years ago

Fixed on current PR.