Open anastasiiaNG opened 2 years ago
Hi @anastasiiaNG
Thanks for the issue. Cacoa is still in development of course. These issues provide useful user feedback.
I currently tried to apply it to our data but faced several issues: cao object can be downloaded here.
Could you share how this data was created? Please share the code used. If we are able to reproduce this, we'll be able to provide better feedback.
Thanks
Hi, @evanbiederstedt,
This is how the cao
object was created from the seurat object whole.integrated:
library(cacoa)
library(Seurat)
library(ggplot2)
library(tidyverse)
whole.integrated <- readRDS("whole.integrated.rds")
whole.integrated <- FindNeighbors(whole.integrated,
k.param = 100, # instead of default k = 20
prune.SNN = 0, # instead of default prune.SNN = 1/15
)
whole.integrated <- FindClusters(whole.integrated, resolution = 3)
sample.groups <- factor(c("condition2", "condition2", "condition1", "condition1", "condition1", "condition2"))
names(sample.groups) <- c("R_1", "R_2", "R_3", "R_4", "R_5", "R_8")
cell.groups <- as.character(whole.integrated@meta.data$integrated_snn_res.3)
names(cell.groups) <- rownames(whole.integrated@meta.data)
sample.per.cell <- whole.integrated@meta.data$sample
names(sample.per.cell) <- rownames(whole.integrated@meta.data)
ref.level <- "condition1"
target.level <- "condition2"
cao <- Cacoa$new(whole.integrated,
sample.groups = sample.groups,
cell.groups = cell.groups,
sample.per.cell=sample.per.cell,
ref.level=ref.level,
target.level=target.level,
graph.name="integrated_nn",
embedding = whole.integrated@reductions$umap@cell.embeddings)
bs <- if (length(levels(cao$cell.groups)) > 20) 10 else 16
cao$plot.theme <- theme_bw(base_size = bs) +
theme(plot.title=element_text(hjust = 0.5),
legend.background=element_rect(fill=alpha("white", 0.2)),
legend.text=element_text(size=12),
legend.margin=margin(6, 6, 4, 1, 'pt'),
plot.margin=margin())
cao$cell.groups.palette <- levels(cao$cell.groups) %>%
{setNames(sample(brewerPalette("Paired")(length(.))), .)}
cao$sample.groups.palette <- c("#d73027", "#4575b4") %>%
setNames(c(cao$target.level, cao$ref.level))
Hi,
Thank you very much for providing such great insight into case-control analysis in scRNA-seq and for creating an incredibly useful tool.
@evanbiederstedt I want to mention that I'm running into the same errors described by @anastasiiaNG. Also using a seurat object. Any help you can provide is greatly appreciated. I'm happy to provide further details as needed. I generated the coa object similarly to anastasiiaNG.
Thanks!
Hi @anastasiiaNG and @ianphelps ,
I recently released v0.3.0 together with an updated walkthrough. It has a lot of changes and bug fixes, could you please try it? I tested it on the object you provided, and the function calls work fine.
There was just something wrong with the ggplot theme in this object, so I added cao$plot.theme <- theme_bw()
.
Hi @anastasiiaNG and @ianphelps , I recently released v0.3.0 together with an updated walkthrough. It has a lot of changes and bug fixes, could you please try it? I tested it on the object you provided, and the function calls work fine. There was just something wrong with the ggplot theme in this object, so I added
cao$plot.theme <- theme_bw()
.
Hi, @VPetukhov
Thank you for the updates.
Does the updated walkthrough work correct in your hands on my data?
I've tested it on the cao
object that I obtain just like in the https://github.com/kharchenkolab/cacoa/issues/21#issuecomment-1075157459 (with cao$plot.theme <- theme_bw()
as you recommend).
The updated walkthrough works correct until this step:
sample_meta <- whole.integrated@meta.data
pvals <- cao$estimateMetadataSeparation(sample.meta=sample_meta)$padjust
# Error in x[...] <- m : replacement has length zero
cao$plotSampleDistances(
space="expression.shifts",
sample.colors=sample_meta$condition, legend.position=c(0, 1), font.size=2
)
# Error: Must request at least one colour from a hue palette.
Functions cao$plotVolcano
, cao$plotOntologyHeatmapCollapsed
, cao$plotOntologyHeatmap
also fail to plot the results even though cao$estimateDEPerCellType
, cao$estimateDEPerCellType
, cao$estimateOntology
work without errors:
> cao$plotVolcano(xlim=c(-3, 3), ylim=c(0, 3.5), lf.cutoff=1)
Error in grid.Call(C_convert, x, as.integer(whatfrom), as.integer(whatto), :
Viewport has zero dimension(s)
In addition: There were 15 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: Removed 3 rows containing missing values (geom_point).
2: Removed 3 rows containing missing values (geom_point).
3: Removed 15 rows containing missing values (geom_point).
4: Removed 21 rows containing missing values (geom_point).
5: Removed 40 rows containing missing values (geom_point).
6: Removed 26 rows containing missing values (geom_point).
7: Removed 135 rows containing missing values (geom_point).
8: Removed 4 rows containing missing values (geom_point).
9: Removed 11 rows containing missing values (geom_point).
10: Removed 5 rows containing missing values (geom_point).
11: Removed 3 rows containing missing values (geom_point).
12: Removed 8 rows containing missing values (geom_point).
13: Removed 1 rows containing missing values (geom_point).
14: Removed 12 rows containing missing values (geom_point).
15: Removed 13 rows containing missing values (geom_point).
> cao$plotOntologyHeatmapCollapsed(
+ name="GSEA", genes="up", n=50, clust.method="ward.D", size.range=c(1, 4)
+ )
Error: Can't rename columns that don't exist.
x Column `core_enrichment` doesn't exist.
> cao$plotOntologyHeatmap(
+ name="GSEA", genes="up", description.regex="extracellular|matrix"
+ )
Error: Can't rename columns that don't exist.
x Column `core_enrichment` doesn't exist.
And starting from "Cluster-free. Compositional changes" none of the functions works on my data:
> cao$estimateCellDensity(method='graph')
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'as.matrix': Please supply a symmetric matrix if you want to create a weighted graph with mode=UNDIRECTED.
> cao$estimateClusterFreeExpressionShifts(
+ gene.selection="expression", min.n.between=3, min.n.within=3
+ )
Error in graph.adjacency.sparse(adjmatrix, mode = mode, weighted = weighted, :
Please supply a symmetric matrix if you want to create a weighted graph with mode=UNDIRECTED.
> cao$estimateClusterFreeDE(
+ n.top.genes=1000, min.expr.frac=0.01, adjust.pvalues=TRUE, smooth=TRUE,
+ verbose=TRUE
+ )
Estimating cluster-free Z-scores for 1000 most expressed genes
Error in graph.adjacency.sparse(adjmatrix, mode = mode, weighted = weighted, :
Please supply a symmetric matrix if you want to create a weighted graph with mode=UNDIRECTED.
Hi @anastasiiaNG
I'm following your instructions to recreate the data above, using the data downloaded from your link.
Could you please uninstall cacoa
and re-install it? i.e. remove.packages('cacoa')
and then devtools::install_github("kharchenkolab/cacoa")
RE: 1.
cao$estimateExpressionShiftMagnitudes() fails with the following message:
This works for me now:
> cao$estimateExpressionShiftMagnitudes()
Filtering data...
done!
Calculating pairwise distances using dist='cor'...
|=======================================================| 100%, Elapsed 00:10
Done!
RE: 2
This works for me too. Please check again.
> cao$estimateCellDensity()
> cao$estimateDiffCellDensity(type='permutation', verbose=T)
|=======================================================| 100%, Elapsed 00:00
|=======================================================| 100%, Elapsed 00:02
>
I tried using your data and running through http://pklab.med.harvard.edu/viktor/cacoa/walkthrough_short.html
It's true there are some errors, which are surely associated with some extra functionality we need to incorporate for Seurat objects. I'll try to help out with this when I get some time.
In the meantime, please try using a Conos object instead of a Seurat object if you could: https://github.com/kharchenkolab/conos
Thanks, Evan
Thanks, @anastasiiaNG ! Right, I also checked other functions and indeed there are errors in them. It's a bit of a pain to debug, as the part with core_enrichment
comes from the DOSE
, which returns a data frame without this column on your data. I have no idea why, so will need to dig into their specific. I'll fix it in the next couple of weeks.
@anastasiiaNG , I released version 0.4.0, where all this should be fixed. Could you please check it?
@anastasiiaNG , I released version 0.4.0, where all this should be fixed. Could you please check it?
@VPetukhov , great, thanks a lot, almost all bugs are fixed now.
The only thing that still produces an error is here (probably due to the absence of factors that significantly affect sample differences):
sample_meta <- whole.integrated@meta.data[, c(4,7)]
head(as.data.frame(sample_meta))
pvals <- cao$estimateMetadataSeparation(sample.meta=sample_meta)$padjust
# Error in x[...] <- m : replacement has length zero
I also noticed that in estimating gene programs you put by default z.adj=TRUE
and smooth=FALSE
, however warning says that Gene programs without smoothing often produce intractable results, especially with z.adj=FALSE
. Are z.adj=TRUE
and smooth=TRUE
the best option here? Or how to decide when these parameters should be set to TRUE
?
Hi, I ran into an error when I called cao$plotOntologyHeatmapCollapsed
function. I got an error
Error in `chr_as_locations()`:
! Can't rename columns that don't exist.
✖ Column `qvalues` doesn't exist.
Before I come to this step, everything ran well and functions cao$estimateDEPerCellType
and cao$estimateOntology
(I used'GSEA'
)also worked. Here's some output info I got from rlang::last_error()
:
> pdf(paste0(rep.name, '/', rep.name,"_plotOntologyHeatmapCollapsed_up_in_MUT.pdf"), height=15, width=10)
> print(cao$plotOntologyHeatmapCollapsed(
+ name="GSEA", genes="up", n=50, clust.method="ward.D", size.range=c(1, 4)) + ggtitle("Up in MUT"))
Error in (function (cond) :
error in evaluating the argument 'x' in selecting a method for function 'print': Can't rename columns that don't exist.
✖ Column `qvalues` doesn't exist.
> rlang::last_error()
<error/vctrs_error_subscript_oob>
Error in `chr_as_locations()`:
! Can't rename columns that don't exist.
✖ Column `qvalues` doesn't exist.
---
Backtrace:
1. cao$plotOntologyHeatmapCollapsed(...)
6. dplyr:::rename.data.frame(., geneID = core_enrichment, qvalue = qvalues)
7. tidyselect::eval_rename(expr(c(...)), .data)
8. tidyselect:::rename_impl(...)
9. tidyselect:::eval_select_impl(...)
18. tidyselect:::vars_select_eval(...)
19. tidyselect:::walk_data_tree(expr, data_mask, context_mask, error_call)
20. tidyselect:::eval_c(expr, data_mask, context_mask)
21. tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
22. tidyselect:::walk_data_tree(new, data_mask, context_mask)
23. tidyselect:::as_indices_sel_impl(...)
24. tidyselect:::as_indices_impl(x, vars, call = call, strict = strict)
25. tidyselect:::chr_as_locations(x, vars, call = call)
Run `rlang::last_trace()` to see the full context.
I wonder is there anything I can do to fix this problem to generate GSEA results? Thank you so much for the help in advance!
Dear Cacoa team,
Thank you for creating such a perspective tool for analysis single cell RNAseq data.
I currently tried to apply it to our data but faced several issues:
cao$estimateExpressionShiftMagnitudes()
fails with the following message:Could you please specify what exactly is wrong with the input data?
cao$estimateDiffCellDensity(type='permutation', verbose=T)
function also fails with the following message:Error in smoothSignalOnGraph(., filter = g.filt, graph = graph) : The provided graph is not connected. It has to be connected to estimate l.max.
I tried to fix it by changing the
k.param
andprune.SNN
arguments to make the graph of the original Seurat object more connected in the following way......but it didn't help.
Respectively, function
plotDiffCellDensity()
works only withpermutation
parameter, otherwise it returns the same error.Could you please clarify how these issues may be fixed? Thank you!
cao
object can be downloaded here.