corceslab / CHOIR

CHOIR : Clustering Hierarchy Optimization by Iterative Random forests (www.CHOIRclustering.com)
MIT License
20 stars 5 forks source link

Running Harmony Batch Correction & Providing Clusters to pruneTree #8

Closed ejscience closed 6 months ago

ejscience commented 7 months ago

Hello I am trying to run CHOIR on a multi-sample seurat object where I have 2 issues. (1) I got an error during harmony based tree creation. and (2) I do not understand how to extract the necessary information for other

(1) Fixing an error with Harmony Batch Correction in CHOIR

seurat_object <- CHOIR(seurat_object, 
                       batch_correction_method = "Harmony",
                       batch_labels = "sample", 
                       n_cores = 38)

I got the following error with the traceback after:

## Error -
2024-02-02 04:50:23 AM : 94% (Subtree 10/14, 289 cells), 292 total clusters.              
Generating subtrees.. [=============================================>---]  95% in 00:38:13Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

## Traceback
 Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

17. stop("contrasts can be applied only to factors with 2 or more levels")

16.`contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])

15.Matrix::sparse.model.matrix(~0 + as.factor(meta_data[[var_use]]))

14.FUN(X[[i]], ...)

13.lapply(vars_use, function(var_use) {
    res <- Matrix::sparse.model.matrix(~0 + as.factor(meta_data[[var_use]]))
    Matrix::t(res)
})

12.Reduce(rbind, lapply(vars_use, function(var_use) {
    res <- Matrix::sparse.model.matrix(~0 + as.factor(meta_data[[var_use]]))
    Matrix::t(res)
}))

11.tryCatchList(expr, classes, parentenv, handlers)

10.tryCatch({
    check_legacy_args(...)
    if (set.cores) {
        prev.ncores.blas <- RhpcBLASctl::blas_get_num_procs() ...

9.RunHarmony.default(...)

8.RunHarmony(...)

7.(function (...) 
{
    .Deprecated("RunHarmony", msg = "HarmonyMatrix is deprecated and will be removed in the future from the API in the future")
    RunHarmony(...) ...

6.do.call(harmony::HarmonyMatrix, c(list(data_mat = reduction_coords, 
    meta_data = metadata, vars_use = batch_labels), batch_correction_params))

5.withCallingHandlers(expr, warning = function(w) if (inherits(w, 
    classes)) tryInvokeRestart("muffleWarning"))

4.suppressWarnings(do.call(harmony::HarmonyMatrix, c(list(data_mat = reduction_coords, 
    meta_data = metadata, vars_use = batch_labels), batch_correction_params)))

3. .runDimReduction(object = object, normalization_method = normalization_method, 
    reduction_method = reduction_method, reduction_params = reduction_params, 
    n_var_features = n_var_features, batch_correction_method = batch_correction_method, 
    batch_correction_params = batch_correction_params, batch_labels = batch_labels,  ...

2.buildTree(object = object, key = key, alpha = alpha, p_adjust = p_adjust, 
    feature_set = feature_set, exclude_features = exclude_features, 
    n_iterations = n_iterations, n_trees = n_trees, use_variance = use_variance, 
    min_accuracy = min_accuracy, min_connections = min_connections,  ...

1.CHOIR(seurat_object, batch_correction_method = "Harmony", batch_labels = "sample", 
    n_cores = 38)

(2) Help with Providing pre-generated clusters

I am interested in pulling out clusters and comparing between CCA and Harmony integration reductions. I would like to see if anyone had some code for how to extract the necessary matrices to pruneTree()? Specifically getting cluster_tree, nn_matrix and reduction matrix from a seurat object.

Currently I run the following code. Where [ReductionToTest] is replaced with the integration method

seurat_object <- FindNeighbors(seurat_object, dims = 1:30, reduction = "[ReductionToTest]")
seurat_object <- FindClusters(seurat_object, resolution = seq(0.1, 4, by = 0.2))

# visualization of clusters
p = clustree(sample, layout = "sugiyama", prefix = "RNA_snn_res.") + 
    labs(title = paste0(sample_name, " Clustree")) +
    theme(plot.title = element_text(hjust = 0.5))
catpetersen commented 7 months ago

Hi! To help with the first part, a couple clarifying questions:

  1. How many samples are in your dataset? You're using each sample as an individual batch, correct?
  2. How extreme are your batch effects? I haven't seen this error before, but my hunch is that one of the "subtrees" CHOIR has extracted may be composed of 289 cells from just 1 batch. Does that sound plausible?

For your second question, please see this thread #6, which will hopefully provide the code you're looking for. You'll need to install the dev branch of CHOIR. Note that providing pre-generated clusters and dimensionality reductions is not the default application of CHOIR, and has not been thoroughly benchmarked.

ejscience commented 7 months ago

Hi @catpetersen,

  1. there are a total of 4 samples. I expect that the batch effects will be large but I am looking at 4 teratomas so I also suspect to have similar cell-types across the samples.

  2. I think that you are totally right that it could be that the subtree has all samples from 1 batch. One of them is significantly smaller in terms of cells than the others!

catpetersen commented 6 months ago

So first I'll just say that CHOIR's batch correction was designed to work best with batches that include more than 1 sample. I think your data may make it difficult to disentangle cell types that are truly shared among individual teratomas, vs. those that are unique to one sample. But it really depends on the dataset & extent of batch effects.

That said, I've made some changes in the 'dev' branch so that CHOIR will omit the Harmony step for any subtrees containing only a single batch, and just spit out a warning instead.

Would you try reinstalling CHOIR with:

devtools::install_github("corceslab/CHOIR", ref="dev", repos = BiocManager::repositories(), upgrade = "never")
# To unload a package and reload
detach("package:CHOIR", unload=TRUE)
library(CHOIR)

And let me know if that resolves the error during subtree generation?

catpetersen commented 6 months ago

Hi! I'm closing this issue for now, but please feel free to reply here if you have any additional issues and I'll reopen it.