quadbio / Pando

Multiome GRN inference.
https://quadbio.github.io/Pando/
MIT License
106 stars 21 forks source link

"not-yet-implemented 'Matrix[<-' method" #17

Closed DanielYuhangLi closed 1 year ago

DanielYuhangLi commented 1 year ago

Hi, Thanks for this tool. I have been playing around with the tutorial with a set of paired scRNA/ATAC data but still having some trouble inferring GRN so was hoping to get some input.

This may be similar to issue #1, I have two separate seurat/signac datasets that are paired mouse samples which I integrated. I then converted the peaks assay to a chromatin assay and introduced mouse motifs from JASPAR and made a new motif_tf file. All is smooth at this point.

I then ran into the 'Rik' gene unexpected symbol issue so that I removed Rik from the initial seurat object prior to integration with the signac object. With my current gene set, I run through about 1400 of the 1900 genes and then the function fails with a "not-yet-implemented 'Matrix[<-' method" error. The error is pasted below. Looking through the code, I don't have a good sense of the order which the genes in the grn model is selected so it's hard for me to look back and see if there's any other data structure issue. I understand the original study uses multi-ome data so perhaps there is some limitation that I am missing but any input would be appreciated. Thanks! I can post the data structure too in an attachment pando_seuratplus_structure.txt .

Error code below - prior to this, I do seem to be able to fit multiple models for over 1000 genes before the function fails:

Appreciate your time!

============================

Error in .local(x, i, j, ..., value): not-yet-implemented 'Matrix[<-' method Traceback:

  1. infer_grn(test_object_grn, peak_to_gene_method = "Signac", method = "glm", . verbose = 2)
  2. infer_grn.SeuratPlus(test_object_grn, peak_to_gene_method = "Signac", . method = "glm", verbose = 2)
  3. fit_grn_models(object = object, genes = genes, network_name = network_name, . peak_to_gene_method = peak_to_gene_method, upstream = upstream, . downstream = downstream, extend = extend, only_tss = only_tss, . parallel = parallel, tf_cor = tf_cor, peak_cor = peak_cor, . aggregate_rna_col = aggregate_rna_col, aggregate_peaks_col = aggregate_peaks_col, . method = method, alpha = alpha, family = family, interaction_term = interaction_term, . adjust_method = adjust_method, scale = scale, verbose = verbose, . ...)
  4. fit_grn_models.SeuratPlus(object = object, genes = genes, network_name = network_name, . peak_to_gene_method = peak_to_gene_method, upstream = upstream, . downstream = downstream, extend = extend, only_tss = only_tss, . parallel = parallel, tf_cor = tf_cor, peak_cor = peak_cor, . aggregate_rna_col = aggregate_rna_col, aggregate_peaks_col = aggregate_peaks_col, . method = method, alpha = alpha, family = family, interaction_term = interaction_term, . adjust_method = adjust_method, scale = scale, verbose = verbose, . ...)
  5. map_par(features, function(g) { . if (!g %in% rownames(peaks2gene)) { . log_message("Warning: ", g, " not found in EnsDb", verbose = verbose == . 2) . return() . } . gene_peaks <- as.logical(peaks2gene[g, ]) . if (sum(gene_peaks) == 0) { . log_message("Warning: No peaks found near ", g, verbose = verbose == . 2) . return() . } . g_x <- gene_data[gene_groups, g, drop = FALSE] . peak_x <- peak_data[peak_groups, gene_peaks, drop = FALSE] . peak_g_cor <- as(sparse_cor(peak_x, g_x), "generalMatrix") . peak_g_cor[is.na(peak_g_cor)] <- 0 . peaks_use <- rownames(peak_g_cor)[abs(peak_g_cor[, 1]) > . peak_cor] . if (length(peaks_use) == 0) { . log_message("Warning: No correlating peaks found for ", . g, verbose = verbose == 2) . return() . } . peak_x <- peak_x[, peaks_use, drop = FALSE] . peak_motifs <- peaks2motif[gene_peaks, , drop = FALSE][peaks_use, . , drop = FALSE] . gene_peak_tfs <- map(rownames(peak_motifs), function(p) { . x <- as.logical(peak_motifs[p, ]) . peak_tfs <- colMaxs(motif2tf[x, , drop = FALSE]) . peak_tfs <- colnames(motif2tf)[as.logical(peak_tfs)] . peak_tfs <- setdiff(peak_tfs, g) . return(peak_tfs) . }) . names(gene_peak_tfs) <- rownames(peak_motifs) . gene_tfs <- purrr::reduce(gene_peak_tfs, union) . tf_x <- gene_data[gene_groups, gene_tfs, drop = FALSE] . tf_g_cor <- sparse_cor(tf_x, g_x) . tf_g_cor[is.na(tf_g_cor)] <- 0 . tfs_use <- rownames(tf_g_cor)[abs(tf_g_cor[, 1]) > tf_cor] . if (length(tfs_use) == 0) { . log_message("Warning: No correlating TFs found for ", . g, verbose = verbose == 2) . return() . } . tf_g_corr_df <- as_tibble(tf_g_cor[unique(tfs_use), , drop = F], . rownames = "tf", .name_repair = "check_unique") %>% rename(tf = 1, . corr = 2) . frml_string <- map(names(gene_peak_tfs), function(p) { . peak_tfs <- gene_peak_tfs[[p]] . peak_tfs <- peak_tfs[peak_tfs %in% tfs_use] . if (length(peak_tfs) == 0) { . return() . } . peak_name <- str_replaceall(p, "-", "") . tf_name <- str_replace_all(peaktfs, "-", "") . formula_str <- paste(paste(peak_name, interaction_term, . tf_name, sep = " "), collapse = " + ") . return(list(tfs = peak_tfs, frml = formula_str)) . }) . frml_string <- frml_string[!map_lgl(frml_string, is.null)] . if (length(frml_string) == 0) { . log_message("Warning: No valid peak:TF pairs found for ", . g, verbose = verbose == 2) . return() . } . target <- str_replaceall(g, "-", "") . model_frml <- as.formula(paste0(target, " ~ ", paste0(map(frml_string, . function(x) x$frml), collapse = " + "))) . nfeats <- sum(map_dbl(frml_string, function(x) length(x$tfs))) . gene_tfs <- purrr::reduce(map(frml_string, function(x) x$tfs), . union) . gene_x <- gene_data[gene_groups, union(g, gene_tfs), drop = FALSE] . model_mat <- as.data.frame(cbind(gene_x, peak_x)) . if (scale) . model_mat <- as.data.frame(scale(as.matrix(model_mat))) . colnames(model_mat) <- str_replace_all(colnames(modelmat), . "-", "") . log_message("Fitting model with ", nfeats, " variables for ", . g, verbose = verbose == 2) . result <- try(fit_model(model_frml, data = model_mat, method = method, . ...), silent = TRUE) . if (any(class(result) == "try-error")) { . log_message("Warning: Fitting model failed for ", g, . verbose = verbose) . log_message(result, verbose = verbose == 2) . return() . } . else { . result$gof$nvariables <- nfeats . result$corr <- tf_g_corr_df . return(result) . } . }, verbose = verbose, parallel = parallel)
  6. base::lapply(X = x, FUN = fun)
  7. FUN(X[[i]], ...)
  8. [<-(*tmp*, is.na(tf_g_cor), value = 0)
  9. [<-(*tmp*, is.na(tf_g_cor), value = 0)
  10. .local(x, i, j, ..., value)
  11. stop("not-yet-implemented 'Matrix[<-' method")

Session: R version 4.1.3 (2022-03-10) Platform: x86_64-conda-linux-gnu (64-bit) Running under: Ubuntu 20.04.4 LTS

Matrix products: default BLAS/LAPACK: /home/lidaniel/mambaforge/envs/r_env/lib/libopenblasp-r0.3.21.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base

other attached packages: [1] doParallel_1.0.17 iterators_1.0.14
[3] foreach_1.5.2 grr_0.9.5
[5] TFBSTools_1.32.0 JASPAR2020_0.99.10
[7] EnsDb.Mmusculus.v79_2.99.0 ensembldb_2.18.1
[9] AnnotationFilter_1.18.0 GenomicFeatures_1.46.1
[11] AnnotationDbi_1.56.2 Biobase_2.54.0
[13] cowplot_1.1.1 ggplot2_3.4.0
[15] BSgenome.Mmusculus.UCSC.mm10_1.4.3 BSgenome.Hsapiens.UCSC.hg38_1.4.4 [17] BSgenome_1.62.0 rtracklayer_1.54.0
[19] Biostrings_2.62.0 XVector_0.34.0
[21] GenomicRanges_1.46.1 GenomeInfoDb_1.30.1
[23] IRanges_2.28.0 S4Vectors_0.32.4
[25] BiocGenerics_0.40.0 Signac_1.8.0
[27] sp_1.5-0 SeuratObject_4.1.2
[29] Seurat_4.2.0 Pando_1.0.1

loaded via a namespace (and not attached): [1] rappdirs_0.3.3 pbdZMQ_0.3-8
[3] scattermore_0.8 R.methodsS3_1.8.2
[5] tidyr_1.2.1 knitr_1.40
[7] bit64_4.0.5 R.utils_2.12.1
[9] irlba_2.3.5.1 DelayedArray_0.20.0
[11] data.table_1.14.4 rpart_4.1.19
[13] KEGGREST_1.34.0 RCurl_1.98-1.9
[15] generics_0.1.3 callr_3.7.3
[17] usethis_2.1.6 RSQLite_2.2.18
[19] RANN_2.6.1 future_1.28.0
[21] ggpointdensity_0.1.0 tzdb_0.3.0
[23] bit_4.0.4 spatstat.data_3.0-0
[25] xml2_1.3.3 httpuv_1.6.6
[27] SummarizedExperiment_1.24.0 assertthat_0.2.1
[29] DirichletMultinomial_1.36.0 viridis_0.6.2
[31] xfun_0.34 hms_1.1.2
[33] evaluate_0.17 promises_1.2.0.1
[35] fansi_1.0.3 restfulr_0.0.15
[37] progress_1.2.2 caTools_1.18.2
[39] dbplyr_2.2.1 igraph_1.3.5
[41] DBI_1.1.3 htmlwidgets_1.5.4
[43] spatstat.geom_3.0-3 purrr_0.3.5
[45] ellipsis_0.3.2 backports_1.4.1
[47] dplyr_1.0.10 annotate_1.72.0
[49] biomaRt_2.50.0 deldir_1.0-6
[51] sparseMatrixStats_1.6.0 MatrixGenerics_1.6.0
[53] vctrs_0.5.0 remotes_2.4.2
[55] ROCR_1.0-11 abind_1.4-5
[57] cachem_1.0.6 withr_2.5.0
[59] ggforce_0.4.1 progressr_0.11.0
[61] checkmate_2.1.0 sctransform_0.3.5
[63] GenomicAlignments_1.30.0 prettyunits_1.1.1
[65] goftest_1.2-3 cluster_2.1.4
[67] IRdisplay_1.1 lazyeval_0.2.2
[69] seqLogo_1.60.0 crayon_1.5.2
[71] pkgconfig_2.0.3 tweenr_2.0.2
[73] nlme_3.1-160 pkgload_1.3.0
[75] ProtGenerics_1.26.0 nnet_7.3-18
[77] pals_1.7 devtools_2.4.5
[79] rlang_1.0.6 globals_0.16.1
[81] lifecycle_1.0.3 miniUI_0.1.1.1
[83] filelock_1.0.2 BiocFileCache_2.2.0
[85] dichromat_2.0-0.1 rprojroot_2.0.3
[87] polyclip_1.10-4 matrixStats_0.62.0
[89] lmtest_0.9-40 Matrix_1.5-1
[91] IRkernel_1.3 zoo_1.8-11
[93] base64enc_0.1-3 ggridges_0.5.4
[95] processx_3.8.0 png_0.1-7
[97] viridisLite_0.4.1 rjson_0.2.21
[99] bitops_1.0-7 R.oo_1.25.0
[101] KernSmooth_2.23-20 blob_1.2.3
[103] stringr_1.4.1 parallelly_1.32.1
[105] spatstat.random_3.0-1 jpeg_0.1-9
[107] readr_2.1.3 CNEr_1.30.0
[109] scales_1.2.1 memoise_2.0.1
[111] magrittr_2.0.3 plyr_1.8.7
[113] ica_1.0-3 zlibbioc_1.40.0
[115] compiler_4.1.3 BiocIO_1.4.0
[117] RColorBrewer_1.1-3 fitdistrplus_1.1-8
[119] Rsamtools_2.10.0 cli_3.4.1
[121] urlchecker_1.0.1 listenv_0.8.0
[123] patchwork_1.1.2 pbapply_1.5-0
[125] ps_1.7.2 htmlTable_2.4.1
[127] Formula_1.2-4 MASS_7.3-58.1
[129] mgcv_1.8-41 tidyselect_1.2.0
[131] stringi_1.7.8 yaml_2.3.6
[133] latticeExtra_0.6-30 ggrepel_0.9.1
[135] grid_4.1.3 VariantAnnotation_1.40.0
[137] fastmatch_1.1-3 tools_4.1.3
[139] future.apply_1.9.1 rstudioapi_0.14
[141] uuid_1.1-0 TFMPvalue_0.0.9
[143] foreign_0.8-83 gridExtra_2.3
[145] farver_2.1.1 Rtsne_0.16
[147] ggraph_2.1.0 digest_0.6.30
[149] rgeos_0.5-9 pracma_2.4.2
[151] shiny_1.7.3 motifmatchr_1.16.0
[153] Rcpp_1.0.9 later_1.3.0
[155] RcppAnnoy_0.0.20 httr_1.4.4
[157] biovizBase_1.42.0 colorspace_2.0-3
[159] XML_3.99-0.12 fs_1.5.2
[161] tensor_1.5 reticulate_1.26
[163] splines_4.1.3 uwot_0.1.14
[165] RcppRoll_0.3.0 spatstat.utils_3.0-1
[167] graphlayouts_0.8.3 mapproj_1.2.9
[169] plotly_4.10.0 sessioninfo_1.2.2
[171] xtable_1.8-4 poweRlaw_0.70.6
[173] jsonlite_1.8.3 tidygraph_1.2.2
[175] R6_2.5.1 Hmisc_4.7-1
[177] profvis_0.3.7 pillar_1.8.1
[179] htmltools_0.5.3 mime_0.12
[181] glue_1.6.2 fastmap_1.1.0
[183] BiocParallel_1.28.3 codetools_0.2-18
[185] maps_3.4.1 pkgbuild_1.3.1
[187] utf8_1.2.2 lattice_0.20-45
[189] spatstat.sparse_3.0-0 tibble_3.1.8
[191] curl_4.3.3 leiden_0.4.3
[193] gtools_3.9.3 interp_1.1-3
[195] GO.db_3.14.0 survival_3.4-0
[197] repr_1.1.4 munsell_0.5.0
[199] GenomeInfoDbData_1.2.7 reshape2_1.4.4
[201] gtable_0.3.1 spatstat.core_2.4-4

DanielYuhangLi commented 1 year ago

I made some small changes in the attribute aspect when correlation matrixes are being read in and that solved the issue. Can look at the grn code in my github. Thanks for the great tool, helpful adjunct for other analysis.

joschif commented 1 year ago

Hi @DanielYuhangLi , it looks like this is related to the new Version of MatrixGenerics brought up in #10, I'll check out your fork and see if I can incorporate the changes

joschif commented 1 year ago

I hope this issue should be fixed in the newest version(v1.0.2) please let me know how it works for you :)