kharchenkolab / cacoa

Single-cell Case Control Analysis
46 stars 7 forks source link

unable to use Cacoa on a seurat object #39

Closed GhobrialMoheb closed 1 year ago

GhobrialMoheb commented 1 year ago

Hi, may you please help provide a detailed way to use Cacoa on a seurat object?

I am getting errors I am not sure the reason of, e.g when running "cao$estimateCellLoadings()" I get: "Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 't': arguments imply differing number of rows: 47652, 0"

Thank you for your help

TeresaSteyn commented 1 year ago

I have also been facing the same error with a Seurat object. I was wondering if you've had any luck solving this error?

rrydbirk commented 1 year ago

@GhobrialMoheb @TeresaSteyn Thanks for your interest in Cacoa. Without more information, we're not able to track the problem. Could you provide a minimal reproducible example, traceback() of your errors, and your sessionInfo()?

Thanks.

/Rasmus

TeresaSteyn commented 1 year ago

Thank you very much for your response @rrydbirk! I am somewhat new to programming and so I will do my best to provide a minimal reproducible example, but you can let me know if there is additional information you require.

library(Seurat) library(tidyverse) library(cacoa) library(Matrix) library(stringr) library(dplyr) library(magrittr) library(coda.base) library(psych) library(reshape2) library(ggpubr)

seu <- readRDS("/scratch/styter001/snRNAseq/Mouse/Merged/mt_genes_removed/Annotation/Multi/results/seu_Allen_user_agreement.rds")

so <- seu

group_id <- as.data.frame(so@meta.data$group_id)

sample_id <- as.data.frame(so@meta.data$sample_id)

cluster_id <- as.data.frame(so@meta.data$cluster_id)

cao <- Cacoa$new(so, sample.groups=as.vector(so@meta.data$group_id), cell.groups=as.vector(so@meta.data$cluster_id), sample.per.cell=as.vector(so@meta.data$sample_id), ref.level="Ctrl", target.level="LPS", graph.name=Embeddings(so))

cao

Public: cache: list cell.groups: Oligos Dentate gyrus cells OPCs Astrocytes Astrocytes OP ... cell.groups.palette: #E61717 #E65817 #E69917 #E6DB17 #AFE617 #6EE617 #2DE617 ... clone: function (deep = FALSE) data.object: Seurat embedding: 0.728023233842537 1.90026259703605 -4.26855754570992 -8. ... estimateCellDensity: function (bins = 400, method = "kde", name = "cell.density", estimateCellLoadings: function (n.boot = 1000, ref.cell.type = NULL, name = "coda", estimateClusterFreeDE: function (n.top.genes = Inf, genes = NULL, max.z = 20, min.expr.frac = 0.01, estimateClusterFreeExpressionShifts: function (n.top.genes = 3000, gene.selection = "z", name = "cluster.free.expr.shifts", estimateDEPerCellType: function (cell.groups = self$cell.groups, sample.groups = self$sample.groups, estimateDEStabilityPerCellType: function (de.name = "de", name = "de.jaccards", top.n.genes = NULL, estimateDEStabilityPerGene: function (de.name, top.n.genes = 500, p.adj = NULL, visualize = FALSE) estimateDiffCellDensity: function (type = "permutation", adjust.pvalues = NULL, name = "cell.density", estimateExpressionShiftMagnitudes: function (cell.groups = self$cell.groups, sample.per.cell = self$sample.per.cell, estimateGenePrograms: function (method = c("pam", "leiden", "fabia"), n.top.genes = Inf, estimateMetadataSeparation: function (sample.meta, space = "expression.shifts", dist = NULL, estimateOntology: function (type = c("GO", "DO", "GSEA"), name = NULL, de.name = "de", estimateOntologyFamilies: function (name = "GO", p.adj = 0.05) estimatePerCellTypeDE: function (...) getConditionPerCell: function () getFamiliesPerGO: function (name = "GO", go.term = NULL, go.id = NULL, common = FALSE) getGOEnvironment: function (org.db, verbose = FALSE, ignore.cache = NULL) getJointCountMatrix: function (force = FALSE, raw = TRUE) getMostChangedGenes: function (n, method = c("z", "z.adj", "lfc"), min.z = 0.5, min.lfc = 1, getSampleDistanceMatrix: function (space = c("expression.shifts", "coda", "pseudo.bulk"), initialize: function (data.object, sample.groups = NULL, cell.groups = NULL, n.cores: 1 plot.params: NULL plot.theme: theme, gg plotCellDensity: function (show.grid = TRUE, add.points = TRUE, size = 0.1, show.legend = FALSE, plotCellDensityVariation: function (type = "mad", plot.type = "embedding", name = "cell.density", plotCellGroupAbundanceVariation: function (cell.groups = self$cell.groups, type = "mad", rotate.xticks = TRUE, plotCellGroupSizes: function (cell.groups = self$cell.groups, show.significance = FALSE, plotCellLoadings: function (alpha = 0.01, palette = self$cell.groups.palette, font.size = NULL, plotClusterFreeExpressionShifts: function (cell.groups = self$cell.groups, smooth = TRUE, plot.na = FALSE, plotCodaSpace: function (space = "CDA", cell.groups = self$cell.groups, font.size = 3, plotCompositionSimilarity: function (cell.groups = self$cell.groups, cells.to.remain = NULL, plotContrastTree: function (cell.groups = self$cell.groups, palette = self$sample.groups.palette, plotDEStabilityPerCellType: function (name = "de.jaccards", notch = FALSE, show.jitter = TRUE, plotDEStabilityPerGene: function (name = "de", cell.type = NULL, stability.score = "stab.median.rank") plotDiffCellDensity: function (type = NULL, name = "cell.density", size = 0.2, palette = NULL, plotEmbedding: function (embedding = self$embedding, color.by = NULL, plot.theme = self$plot.theme, plotExpressionDistance: function (name = "expression.shifts", joint = FALSE, palette = self$sample.groups.palette, plotExpressionShiftMagnitudes: function (name = "expression.shifts", type = "box", notch = TRUE, plotGeneExpressionComparison: function (genes = NULL, scores = NULL, max.expr = "97.5%", plots = c("z.adj", plotGeneProgramGenes: function (program.id, name = "gene.programs", ordering = c("similarity", plotGeneProgramScores: function (name = "gene.programs", prog.ids = NULL, build.panel = TRUE, plotMostChangedGenes: function (n.top.genes, method = "z", min.z = 0.5, min.lfc = 1, plotNumberOfDEGenes: function (name = "de", p.adjust = TRUE, pvalue.cutoff = 0.05, plotNumOntologyTermsPerType: function (name = "GO", genes = "up", p.adj = 0.05, q.value = 0.2, plotOntology: function (cell.type, name = "GO", plot = "dot", genes = c("up", plotOntologyFamily: function (name = "GO", cell.type, family = NULL, genes = "up", plotOntologyHeatmap: function (name = "GO", genes = "up", subtype = "BP", p.adj = 0.05, plotOntologyHeatmapCollapsed: function (name = "GO", genes = "up", subtype = "BP", p.adj = 0.05, plotOntologySimilarities: function (name = "GO", genes = "up", p.adj = 0.05, q.value = 0.2, plotSampleDistances: function (space = "expression.shifts", method = "MDS", dist = NULL, plotVolcano: function (name = "de", cell.types = NULL, palette = NULL, build.panel = TRUE, ref.level: Ctrl sample.groups: Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl C ... sample.groups.palette: #d73027 #4575b4 sample.per.cell: Ctrl1 Ctrl1 Ctrl1 Ctrl1 Ctrl1 Ctrl1 Ctrl1 Ctrl1 Ctrl1 Ct ... saveDEasJSON: function (saveprefix = NULL, dir.name = "JSON", de.raw = NULL, saveFamiliesAsTable: function (file, name = "GO", subtype = NULL, genes = NULL, p.adj = 0.05, saveOntologyAsTable: function (file, name = "GO", subtype = NULL, genes = NULL, p.adj = 0.05, smoothClusterFreeZScores: function (n.top.genes = 1000, smoothing = 20, filter = NULL, target.level: LPS test.results: list verbose: TRUE Private: checkCellEmbedding: function (embedding = self$embedding) extractCodaData: function (ret.groups = TRUE, cell.groups = self$cell.groups, getClusterFreeDEInput: function (genes, min.edge.weight = 0) getDensityContours: function (groups, color = "black", linetype = 2, conf = "10%", getOntologyHeatmapInfo: function (name = "GO", genes = "up", subtype = "BP", p.adj = 0.05, getOntologyPvalueResults: function (name, genes = c("up", "down", "all"), readjust.p = FALSE, getResults: function (name, suggested.function = NULL) getTopGenes: function (n, gene.selection = c("z", "z.adj", "lfc", "expression", cao$plot.params <- list(size=0.1, alpha=0.1, font.size=c(2, 3)) cao$plot.theme <- cao$plot.theme + theme(legend.background=element_blank()) cao$estimateCellLoadings() Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 't': arguments imply differing number of rows: 64557, 0 traceback() 10: h(simpleError(msg, call)) 9: .handleSimpleError(function (cond) .Internal(C_tryCatchHelper(addr, 1L, cond)), "arguments imply differing number of rows: 64557, 0", base::quote(data.frame(anno = ., group = self$sample.per.cell[names(.)]))) 8: stop(gettextf("arguments imply differing number of rows: %s", paste(unique(nrows), collapse = ", ")), domain = NA) 7: data.frame(anno = ., group = self$sample.per.cell[names(.)]) 6: table(.) 5: rbind(.) 4: t(.) 3: cell.groups %>% data.frame(anno = ., group = self$sample.per.cell[names(.)]) %>% table() %>% rbind() %>% t() 2: private$extractCodaData(cells.to.remove = cells.to.remove, cells.to.remain = cells.to.remain, samples.to.remove = samples.to.remove) 1: cao$estimateCellLoadings() sessionInfo() R version 4.2.0 (2022-04-22) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)

Matrix products: default BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale: [1] LC_CTYPE=en_ZA.utf-8 LC_NUMERIC=C [3] LC_TIME=en_ZA.utf-8 LC_COLLATE=en_ZA.utf-8 [5] LC_MONETARY=en_ZA.utf-8 LC_MESSAGES=en_ZA.utf-8 [7] LC_PAPER=en_ZA.utf-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_ZA.utf-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] ggpubr_0.4.0 reshape2_1.4.4 psych_2.2.9 coda.base_0.5.2 [5] magrittr_2.0.3 cacoa_0.4.0 Matrix_1.5-3 forcats_0.5.2 [9] stringr_1.4.1 dplyr_1.0.10 purrr_0.3.5 readr_2.1.3 [13] tidyr_1.2.1 tibble_3.1.8 ggplot2_3.4.0 tidyverse_1.3.2 [17] SeuratObject_4.1.3 Seurat_4.3.0

loaded via a namespace (and not attached): [1] googledrive_2.0.0 Rtsne_0.16 colorspace_2.0-3 [4] ggsignif_0.6.4 deldir_1.0-6 ellipsis_0.3.2 [7] ggridges_0.5.4 fs_1.5.2 spatstat.data_3.0-0 [10] leiden_0.4.3 listenv_0.8.0 ggrepel_0.9.2 [13] lubridate_1.9.0 fansi_1.0.3 xml2_1.3.3 [16] codetools_0.2-18 splines_4.2.0 mnormt_2.1.1 [19] polyclip_1.10-4 jsonlite_1.8.3 broom_1.0.1 [22] ica_1.0-3 cluster_2.1.3 dbplyr_2.2.1 [25] png_0.1-7 uwot_0.1.14 shiny_1.7.3 [28] sctransform_0.3.5 spatstat.sparse_3.0-0 compiler_4.2.0 [31] httr_1.4.4 backports_1.4.1 assertthat_0.2.1 [34] fastmap_1.1.0 lazyeval_0.2.2 gargle_1.2.1 [37] cli_3.4.1 later_1.3.0 htmltools_0.5.3 [40] tools_4.2.0 igraph_1.3.5 gtable_0.3.1 [43] glue_1.6.2 RANN_2.6.1 Rcpp_1.0.9 [46] carData_3.0-5 scattermore_0.8 cellranger_1.1.0 [49] vctrs_0.5.1 ape_5.6-2 spatstat.explore_3.0-5 [52] nlme_3.1-162 progressr_0.11.0 lmtest_0.9-40 [55] sccore_1.0.1 spatstat.random_3.0-1 globals_0.16.2 [58] rvest_1.0.3 timechange_0.1.1 mime_0.12 [61] miniUI_0.1.1.1 lifecycle_1.0.3 irlba_2.3.5.1 [64] rstatix_0.7.0 googlesheets4_1.0.1 goftest_1.2-3 [67] future_1.29.0 MASS_7.3-56 zoo_1.8-11 [70] scales_1.2.1 hms_1.1.2 promises_1.2.0.1 [73] spatstat.utils_3.0-1 parallel_4.2.0 RColorBrewer_1.1-3 [76] reticulate_1.26 pbapply_1.6-0 gridExtra_2.3 [79] stringi_1.7.8 rlang_1.0.6 pkgconfig_2.0.3 [82] matrixStats_0.63.0 lattice_0.20-45 ROCR_1.0-11 [85] tensor_1.5 patchwork_1.1.2 htmlwidgets_1.5.4 [88] cowplot_1.1.1 tidyselect_1.2.0 parallelly_1.32.1 [91] RcppAnnoy_0.0.20 plyr_1.8.8 R6_2.5.1 [94] generics_0.1.3 DBI_1.1.3 withr_2.5.0 [97] pillar_1.8.1 haven_2.5.1 fitdistrplus_1.1-8 [100] survival_3.3-1 abind_1.4-5 sp_1.5-1 [103] future.apply_1.10.0 car_3.1-0 crayon_1.5.2 [106] modelr_0.1.10 KernSmooth_2.23-20 utf8_1.2.2 [109] spatstat.geom_3.0-3 plotly_4.10.1 tzdb_0.3.0 [112] readxl_1.4.1 grid_4.2.0 data.table_1.14.6 [115] reprex_2.0.2 digest_0.6.30 xtable_1.8-4 [118] httpuv_1.6.6 munsell_0.5.0 viridisLite_0.4.1

rrydbirk commented 1 year ago

@TeresaSteyn There seems to be a naming mismatch somewhere. Unfortunately, I can't reproduce your example without your data file (/scratch/styter001/snRNAseq/Mouse/Merged/mt_genes_removed/Annotation/Multi/results/seu_Allen_user_agreement.rds).

From your traceback(), I can see that in step 3 some subsetting is going on whereafter the error occurs.

cell.groups %>% data.frame(anno = ., group = self$sample.per.cell[names(.)]) %>%
table() %>% rbind() %>% t()

Could you please check that the names match. cell.groups and sample.per.cell need cell IDs in names(). You could also show us head() of cell.groups, sample.per.cell, and sample.groups here if you're still in doubt.

/Rasmus

TeresaSteyn commented 1 year ago

Thank you so much for this response! I will take look at the cell.groups and sample.per.cell in the object and get back to you as soon as possible.

rrydbirk commented 1 year ago

@TeresaSteyn did you figure it out, can we close this issue? Best, Rasmus

TeresaSteyn commented 1 year ago

Hi Rasmus,

Thank you for this message! It has been such a busy time that I have not yet had a chance to return to this. But I am planning to do so in May. I think you can close the issue, and if I am really struggling to get it to work after trying out your suggestions then I will get in touch.

Thank you very much once again for your responses and help.

Kind regards, Teresa

On Tue, 11 Apr 2023, 16:45 Rasmus Rydbirk, @.***> wrote:

@TeresaSteyn https://github.com/TeresaSteyn did you figure it out, can we close this issue? Best, Rasmus

— Reply to this email directly, view it on GitHub https://github.com/kharchenkolab/cacoa/issues/39#issuecomment-1502925104, or unsubscribe https://github.com/notifications/unsubscribe-auth/A5QCJM3AQ2FUTSFL3HDQAN3XAUK2JANCNFSM6AAAAAASBLFDTA . You are receiving this because you were mentioned.Message ID: @.***>