saeyslab / multinichenetr

MultiNicheNet: a flexible framework for differential cell-cell communication analysis from multi-sample multi-condition single-cell transcriptomics data
GNU General Public License v3.0

Error in muscat::pbDS #54

Closed emmmasu closed 4 months ago

emmmasu commented 7 months ago

Hello!

I'm running into the following error when trying to run get_DE_info:

Error in muscat::pbDS(pb, method = de_method_oi, design = design, contrast = contrast, : Specified filtering options result in no genes in any clusters being tested. To force testing, consider modifying arguments 'min_cells' and/or 'filter'. See '?pbDS' for details.

Traceback:

  1. muscat::pbDS(pb, method = de_method_oi, design = design, contrast = contrast, min_cells = min_cells, verbose = FALSE, filter = "none")
  2. stop("Specified filtering options result in no genes in any clusters ", "being tested. To force testing, consider modifying arguments ", "'min_cells' and/or 'filter'. See '?pbDS' for details.")

I've tried setting min_cells to 1, and the filtering options to the minimum (filterByExpr.min.count = 0, filterByExpr.min.total.count = 0, filterByExpr.large.n = 0, filterByExpr.min.prop = 0.01). I've also checked my input data and it has nonzero expression values, cell counts, and is in the correct format.

I also tried running muscat::pbDS directly instead of using the wrapper function and still run into the same issue. I'm currently using multinichenetr_1.0.3.

Any ideas on what the issue is, and how to find and solve it? Please let me know if I can provide any other information. Thanks so much!

browaeysrobin commented 7 months ago

Hi @emmmasu

Unfortunately, I have no clear idea what might be going on. However, I can share the following information:

In the process of developing the next version of MultiNicheNet (now on the dev-branch, with this vignette https://github.com/saeyslab/multinichenetr/blob/dev-branch/vignettes/basic_analysis_steps_MISC.Rmd pretty up-to-date), we decided to step away from filterByExpr because it is not easy to interpret what happens in terms of filtering, or what the parameter cutoffs practically mean for pseudobulk data. Instead, we went for the filtering approach commonly applied to single-cell data: a gene counts as expressed in a sample if a sufficient fraction of the cells of a cluster in that sample express it, and a gene is kept in the pseudobulk DE analysis only if a certain proportion of samples express it by this criterion.
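To make that criterion concrete, here is a minimal base-R sketch on toy data. It is not the multinichenetr implementation: the object names, the toy counts, and the use of an absolute sample count (`min_samples`) instead of a sample proportion are all assumptions made for illustration.

```r
# Toy illustration only: this is NOT the multinichenetr implementation.
# All object names and counts below are hypothetical.

# Toy count matrix for ONE cell type: 3 genes x 40 cells, 4 samples of 10 cells each
sample_of_cell <- rep(c("s1", "s2", "s3", "s4"), each = 10)
counts <- rbind(
  geneA = rep(1, 40),                 # non-zero in every cell of every sample
  geneB = c(rep(1, 10), rep(0, 30)),  # non-zero only in the cells of sample s1
  geneC = rep(0, 40)                  # never detected
)

fraction_cutoff <- 0.05  # gene must be non-zero in >= 5% of the cells of a sample
min_samples <- 2         # ...and this must hold in at least this many samples

# Fraction of cells with non-zero counts, per gene (rows) and per sample (columns)
cells_by_sample <- split(seq_along(sample_of_cell), sample_of_cell)
frac_expr <- sapply(cells_by_sample, function(idx) {
  rowMeans(counts[, idx, drop = FALSE] > 0)
})

# A gene is "expressed" in a sample if it passes the fraction cutoff there
expressed_in_sample <- frac_expr >= fraction_cutoff

# Keep genes that are expressed in enough samples
n_samples_expressing <- rowSums(expressed_in_sample)
genes_kept <- names(n_samples_expressing)[n_samples_expressing >= min_samples]
print(genes_kept)
```

In this toy example only geneA is kept: geneB passes the fraction cutoff in a single sample, and geneC in none.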

If you install the development version and check out that vignette, you will see that we explicitly run a command to calculate this expression information:

frq_list = get_frac_exprs(sce = sce, sample_id = sample_id, celltype_id = celltype_id, group_id = group_id, batches = batches, min_cells = min_cells, fraction_cutoff = fraction_cutoff, min_sample_prop = min_sample_prop)

This object can give you more insight into what may have gone wrong with your data.

Please let me know the result once you have checked it and found the issue!

emmmasu commented 7 months ago

Hi! Thanks for getting back so quickly!

A couple questions:

Thank you so much for your help!

browaeysrobin commented 7 months ago

Hi @emmmasu

devtools::install_github("saeyslab/multinichenetr", "dev-branch") will install the development version (including the new functions described in that vignette).

The pipeline is quite modular, so you can always perform the DE analysis with an alternative method, as long as the final output is in the required format. However, we do not actively support or recommend this.

emmmasu commented 7 months ago

Hello!

I ran the get_frac_exprs function and got the following:

Joining with by = join_by(sample, group)
[1] "Samples are considered if they have more than 10 cells of the cell type of interest"
Joining with by = join_by(sample, celltype)
[1] "Genes with non-zero counts in at least 5% of cells of a cell type of interest in a particular sample will be considered as expressed in that sample."
[1] "Genes expressed in at least 2 samples will considered as expressed in the cell type: stromal"
[1] "Genes expressed in at least 2 samples will considered as expressed in the cell type: tumor"
Joining with by = join_by(sample)
Joining with by = join_by(celltype)
Joining with by = join_by(sample, celltype, group)
[1] "9489 genes are considered as expressed in the cell type: stromal"
[1] "8319 genes are considered as expressed in the cell type: tumor"

It appears I do have genes, but after that I ran into the same issue again with get_DE_info as above. Any suggestions on what to do?

Thank you so much for your help!

browaeysrobin commented 6 months ago

Hi @emmmasu

Do you get this error on the data in the vignette/tutorial or only on your data?

If you only get this on your data, can you provide the cell type abundance plots and all relevant information regarding your setup (contrasts_oi, contrast_tbl, batch/covariate information, ...)?

Can you also confirm that you only load the packages that are loaded in the vignettes (to avoid issues with dependency conflicts)?

Can you also confirm that you run the example vignette of the dev-branch exactly as written out there, and not only the gene-filtering line of code?
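For the abundance information, a quick table of cells per sample and cell type is often enough to spot combinations that fall below min_cells. The following is a minimal base-R sketch on toy metadata; the data frame, its column names, and the `>=` comparison against the threshold are assumptions made for illustration, and in practice the metadata would come from your own object.

```r
# Toy per-cell metadata; all values and column names here are hypothetical.
cell_metadata <- data.frame(
  sample_id   = c(rep("s1", 12), rep("s2", 3), rep("s3", 20)),
  celltype_id = c(rep("stromal", 12), rep("tumor", 3),
                  rep("stromal", 8), rep("tumor", 12)),
  group_id    = c(rep("A", 15), rep("B", 20))
)

min_cells <- 10  # assumed threshold; use the value from your own analysis

# Number of cells per sample (rows) and cell type (columns)
abundance <- table(cell_metadata$sample_id, cell_metadata$celltype_id)
print(abundance)

# Which sample x cell type combinations have enough cells
passes <- abundance >= min_cells
print(passes)

# Number of samples per cell type that pass the threshold
samples_passing <- colSums(passes)
print(samples_passing)
```

A cell type for which too few samples per group pass the threshold cannot be tested, so a table like this, together with the contrast definitions, helps narrow down where the filtering removes everything.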