smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/

Bug of metacell construction when sample size is small #90

Closed jaydu1 closed 1 year ago

jaydu1 commented 1 year ago

When the number of cells in a group is small, chosen can be a single index rather than a vector of several indices. In that case, cell_sample, returned by the following line, is a vector instead of a matrix.

https://github.com/smorabit/hdWGCNA/blob/ac70d3423844d9f41eafacd4ea5562bd4d2f1fb8/R/metacells.R#L90

This causes an error in

https://github.com/smorabit/hdWGCNA/blob/ac70d3423844d9f41eafacd4ea5562bd4d2f1fb8/R/metacells.R#L96-L98

To fix the bug, L90 needs to be changed to:

cell_sample <- nn_map[chosen, , drop = FALSE]

smorabit commented 1 year ago

Can you provide an example with code where you ran into this issue?

jaydu1 commented 1 year ago

The following example, constructed from the tutorial, reproduces the error.

# single-cell analysis package
library(Seurat)

# plotting and data science packages
library(tidyverse)
library(cowplot)
library(patchwork)

# co-expression network analysis packages:
library(WGCNA)
library(hdWGCNA)

# using the cowplot theme for ggplot
theme_set(theme_cowplot())

# set random seed for reproducibility
set.seed(12345)

# optionally enable multithreading
enableWGCNAThreads(nThreads = 8)

# load the Zhou et al snRNA-seq dataset
seurat_obj <- readRDS('Zhou_2020.rds')

seurat_obj <- seurat_obj[,1:51]
seurat_obj <- NormalizeData(seurat_obj)
seurat_obj <- FindVariableFeatures(seurat_obj, selection.method = "vst", nfeatures = 2000)
seurat_obj <- ScaleData(seurat_obj)
seurat_obj <- RunPCA(seurat_obj)
seurat_obj <- RunUMAP(seurat_obj, dims = 1:50L)

seurat_obj <- SetupForWGCNA(
    seurat_obj,
    gene_select = "fraction", # the gene selection approach
    fraction = 0.05, # fraction of cells that a gene needs to be expressed in order to be included
    wgcna_name = "tutorial" # the name of the hdWGCNA experiment
)

# construct metacells in each group
seurat_obj <- MetacellsByGroups(
    seurat_obj = seurat_obj,
    group.by = c("cell_type"), # specify the columns in seurat_obj@meta.data to group by
    k = 50, # nearest-neighbors parameter
    max_shared = 10, # maximum number of shared cells between two metacells
    ident.group = 'cell_type', # set the Idents of the metacell seurat object
    min_cells = 2
)

Error message:

Error in apply(cell_sample, 1, function(x) { : 
  dim(X) must have a positive length

This happens when some groups have fewer cells than k. The reason is as I described in my previous comment.
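
The dimension-dropping behavior can be seen in isolation with base R alone. The sketch below is only an illustration: the small matrix is a made-up stand-in for nn_map, not the object hdWGCNA actually builds, and sum is used in place of the real function passed to apply.

```r
# Toy stand-in for nn_map: 4 cells x 3 neighbors (hypothetical data).
nn_map <- matrix(1:12, nrow = 4, dimnames = list(paste0("cell", 1:4), NULL))

# Several rows selected: the result is still a matrix.
chosen <- c(1, 3)
is.matrix(nn_map[chosen, ])   # TRUE

# One row selected: R drops the row dimension and returns a plain vector.
chosen <- 2
dropped <- nn_map[chosen, ]
is.matrix(dropped)            # FALSE
dim(dropped)                  # NULL

# apply() requires dim(), so it fails with the same message as reported.
err <- tryCatch(apply(dropped, 1, sum), error = function(e) conditionMessage(e))
err                           # "dim(X) must have a positive length"

# With drop = FALSE the single row stays a 1 x 3 matrix and apply() works.
kept <- nn_map[chosen, , drop = FALSE]
dim(kept)                     # 1 3
apply(kept, 1, sum)           # cell2 = 2 + 6 + 10 = 18
```

Because drop = FALSE changes nothing when several rows are selected, it only affects the single-row case.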

smorabit commented 1 year ago

We suggest simply excluding groups with very few cells, rather than using the modification that you proposed.

BGI-TaoWang commented 2 weeks ago

@smorabit How do we exclude groups with very few cells? Should this be done before all other steps? Can you give an example? Thanks.
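
The thread does not include an answer from the maintainers, so the following is only one possible approach, not the documented hdWGCNA procedure. It counts cells per group in the metadata and keeps the cells that belong to sufficiently large groups. The helper cells_in_large_groups, the toy metadata, and both threshold values are assumptions made for illustration. The thread does not say what counts as "very few", nor at which step the filtering should happen, though doing it before MetacellsByGroups seems plausible.

```r
# Hypothetical helper (not part of hdWGCNA or Seurat): return the names of
# cells whose group contains at least `min_size` cells.
cells_in_large_groups <- function(meta, group_col, min_size) {
  group_sizes <- table(meta[[group_col]])
  keep_groups <- names(group_sizes)[group_sizes >= min_size]
  rownames(meta)[meta[[group_col]] %in% keep_groups]
}

# Toy metadata standing in for seurat_obj@meta.data (made-up values).
meta <- data.frame(
  cell_type = c(rep("EX", 5), rep("INH", 4), "MG"),
  row.names = paste0("cell", 1:10)
)

keep_cells <- cells_in_large_groups(meta, "cell_type", min_size = 4)
length(keep_cells)   # 9: the single "MG" cell is excluded

# With a real object, the same vector could be passed to Seurat's subset().
# The threshold of 100 is arbitrary; choose one that suits k and the data.
# keep_cells <- cells_in_large_groups(seurat_obj@meta.data, "cell_type", min_size = 100)
# seurat_obj <- subset(seurat_obj, cells = keep_cells)
```

Computing the cell names first and passing them through the cells argument keeps the filtering step separate from the Seurat object, so the helper can be checked on plain metadata.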