Closed dariober closed 2 years ago
Nope, nothing's wrong here, but I do understand and apologise for the bad example and the resulting confusion! The mock data demonstrates how to compute pseudobulks using aggregateData, but it wasn't intended to make sense for DS (differential state) analysis. Specifically, the group assignments are sampled at random, so that cells from a given sample can in fact have different group assignments. To mock up reasonable data, something like the following should work:
library(muscat)
ng <- 400 # number of genes
nc <- 200 # number of cells
ns <- 4 # number of samples
nk <- 3 # number of clusters
# generate some cell metadata
# (the following ensures there are 2 samples in each group,
# and that each sample is assigned to exactly one group)
sids <- rep(paste0("sample", seq_len(ns)), each = nc/ns)
gids <- rep(c("groupA", "groupB"), each = nc/2)
kids <- sample(paste0("cluster", seq_len(nk)), nc, TRUE)
batch <- sample(seq_len(3), nc, TRUE)
cd <- data.frame(group = gids, id = sids, cluster = kids, batch)
# construct SCE
library(scuttle)
sce <- mockSCE(ncells = nc, ngenes = ng)
colData(sce) <- cbind(colData(sce), cd)
# prep. for workflow
sce <- prepSCE(sce, kid = "cluster", sid = "id", gid = "group")
head(colData(sce))
metadata(sce)$experiment_info
# perform DS analysis
pb <- aggregateData(sce)
res <- pbDS(pb)
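If you want to double-check the point about group assignments before running the workflow, a quick sanity check along these lines might help. This is only a sketch in base R: the helper groups_per_sample and the vectors gids_ok / gids_bad are made up for illustration and are not part of muscat. It counts how many distinct groups each sample is associated with; for a sensible design that should be exactly 1 per sample.

```r
# Hypothetical cell metadata mirroring the mock-up above (base R only).
set.seed(1)
nc <- 200 # number of cells
ns <- 4   # number of samples
sids <- rep(paste0("sample", seq_len(ns)), each = nc / ns)

# one group per sample (as in the corrected mock-up)
gids_ok <- rep(c("groupA", "groupB"), each = nc / 2)
# group sampled independently for every cell (the problematic case)
gids_bad <- sample(c("groupA", "groupB"), nc, TRUE)

# illustrative helper (not a muscat function):
# number of distinct groups observed within each sample
groups_per_sample <- function(sid, gid) {
    tapply(gid, sid, function(g) length(unique(g)))
}

n_ok <- groups_per_sample(sids, gids_ok)
n_bad <- groups_per_sample(sids, gids_bad)

all(n_ok == 1)  # every sample belongs to exactly one group
any(n_bad > 1)  # samples are split across groups
```

On a real object, the same helper could presumably be applied to the sample and group columns of colData(sce) (after prepSCE, these should be sample_id and group_id, assuming the default naming).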
Hi, I'm creating an example dataset using the code from the documentation of prepSCE. When I pass the object sce to aggregateData, the output is an object with colData having 0 columns. Am I doing something wrong?