Closed venkan closed 8 months ago
Hi, I have never come across this error before. Are you able to reproduce it on the tutorial dataset? I am also wondering if you have any NA
values in your modules table:
any(is.na(GetModules(seurat_obj)))
Hi, thank you for your reply. Yes, there are 4450 genes containing missing values (NA)
in the kME column of umap_df
, but still they are assigned to some Modules. Do you think I should remove those genes with NA
in the kME column
?
I understand that you have NA
values in umap_df
, but above I was asking about a different table that you can access using GetModules
.
okay. yes, it shows TRUE
.
I checked the output of GetModules(seurat_obj)
, and there are 157 genes showing NA
in all columns of kME
for each module. Showing below a small piece of that.
NA
in the kME of the modules, does it mean they don't show any module membership?seurat_obj
to run ModuleUMAPPlot
? If so, how to do that?I am not sure why there would be any NA
values in these tables, I have never come across this before. Could you show your code for the previous analysis steps? A temporary fix would be to remove the genes with NA
values but it would be nice to know where these originate from.
To remove the NA
genes:
# get the modules dataframe
modules <- GetModules(seurat_obj)
# remove NAs
modules <- na.omit(modules)
# update the modules dataframe
seurat_obj <- SetModules(seurat_obj, modules)
You should be able to re-run RunModuleUMAP
after this.
I ran RunModuleUMAP
already, and was able to create a network with ggplot
, even though there are NA
You can find the attachment here, I mentioned all the steps in it. I used custom genes as input (19996 genes).
Can you just paste the code directly into this conversation so it is easy to refer back to? Thanks.
Okay here it is:
seurat_obj <- SetupForWGCNA(scRNA,
gene_select = "custom", # the gene selection approach
gene_list = features, ## custom genes were used total 19996 genes used
wgcna_name = "hdWGCNA") # the name of the hdWGCNA experiment
seurat_obj <- MetacellsByGroups(seurat_obj = seurat_obj,
group.by = c("seurat_clusters"), # specify the columns in seurat_obj@meta.data to group by
assay = 'MAGIC_RNA',
slot = 'data',
reduction = 'pca', # select the dimensionality reduction to perform KNN on
k = 25, # nearest-neighbors parameter
max_shared = 10, # maximum number of shared cells between two metacells
ident.group = 'seurat_clusters') # set the Idents of the metacell seurat object
# normalize metacell expression matrix:
seurat_obj <- NormalizeMetacells(seurat_obj, verbose=FALSE)
seurat_obj <- SetDatExpr(seurat_obj,
group_name = c("0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16"), # the name of the group of interest in the group.by column
group.by='seurat_clusters', # the metadata column containing the cell type info. This same column should have also been used in MetacellsByGroups
assay = 'MAGIC_RNA', # using RNA assay
slot = 'data') # using normalized data
# Test different soft powers:
seurat_obj <- TestSoftPowers(seurat_obj,
networkType = 'unsigned') # you can also use "unsigned" or "signed hybrid"
# plot the results:
plot_list <- PlotSoftPowers(seurat_obj)
# assemble with patchwork
wrap_plots(plot_list, ncol=2)
## Construct co-expression network
seurat_obj <- ConstructNetwork(seurat_obj, soft_power=4,
setDatExpr=FALSE,
networkType = 'unsigned',
TOMType = "unsigned",
tom_name = 'TOM') # name of the topoligical overlap matrix written to disk
# need to run ScaleData first or else harmony throws an error:
seurat_obj <- ScaleData(seurat_obj, features=VariableFeatures(seurat_obj))
# compute all MEs in the full single-cell dataset
seurat_obj <- ModuleEigengenes(seurat_obj,
group.by.vars="seurat_clusters")
# compute eigengene-based connectivity (kME):
seurat_obj <- ModuleConnectivity(seurat_obj,
group.by = 'seurat_clusters', group_name = '2',sparse=FALSE)
# get the module assignment table:
modules <- GetModules(seurat_obj)
# show the first 6 columns:
head(modules[,1:6])
dim(modules)
#write.csv(modules, "Modules.csv", row.names=FALSE)
seurat_obj <- RunModuleUMAP(
seurat_obj,
n_hubs = 10, # number of hub genes to include for the UMAP embedding
n_neighbors=15, # neighbors parameter for UMAP
min_dist=0.1) # min distance between points in UMAP space
I am not really sure what could be causing these NA
values to show up just by looking at your code. I would be able to help better if I could reproduce this behavior on my own machine but I am not able to do so.
However, I can see a couple of other potential issues with your analysis based on this code.
First, I can see that in MetacellsByGroups
you are using an assay called "MAGIC_RNA". I am just wondering if you ran MAGIC to impute gene expression values before hdWGCNA? I personally haven't tried this and I do not think that it is necessary before running hdWGCNA so you should consider switching to the "RNA" assay and the "counts" slot.
Second and more importantly, based on how you are running ModuleEigengenes
, you are telling it to remove "batch effects" on the basis of the cell cluster. You should set this to a more appropriate variable (sequencing batch, replicate, dataset of origin, etc) or you can just remove this argument entirely.
Yes, I used MAGIC
before hdWGCNA
. There are a few genes I'm interested in which have low expression. All my other analysis was done with MAGIC-imputed expression values. So, I worked with the same for hdWGCNA
.
For ModuleEigengenes
, okay I understand what you mean. I removed the argument now.
Also for ModuleUMAPPlot
, as you said I removedNA
genes and re-run the ModuleUMAPPlot
, but still have an error:
Here is the ERROR:
Error in if (col1 == col2) {: the condition has length > 1
Traceback:
1. ModuleUMAPPlot(seurat_obj, edge.alpha = 0.25, sample_edges = TRUE,
. edge_prop = 0.1, label_hubs = 2, keep_grey_edges = FALSE)
2. future.apply::future_sapply(1:nrow(edge_df), function(i) {
. gene1 = as.character(edge_df[i, "Var1"])
. gene2 = as.character(edge_df[i, "Var2"])
. col1 <- selected_modules[selected_modules$gene_name == gene1,
. "color"]
. col2 <- selected_modules[selected_modules$gene_name == gene2,
. "color"]
. if (col1 == col2) {
. col = col1
. }
. else {
. col = "grey90"
. }
. col
. })
3. future_lapply(X = X, FUN = FUN, ..., future.envir = future.envir,
. future.label = future.label)
4. future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...),
. get_chunk = `chunkWith[[`, expr = expr, envir = envir, future.envir = future.envir,
. future.globals = future.globals, future.packages = future.packages,
. future.scheduling = future.scheduling, future.chunk.size = future.chunk.size,
. future.stdout = future.stdout, future.conditions = future.conditions,
. future.seed = future.seed, future.label = future.label, fcn_name = fcn_name,
. args_name = args_name, debug = debug)
5. withCallingHandlers({
. values <- local({
. oopts <- options(future.rng.onMisuse.keepFuture = FALSE)
. on.exit(options(oopts))
. value(fs)
. })
. }, RngFutureCondition = function(cond) {
. idx <- NULL
. uuid <- attr(cond, "uuid")
. if (!is.null(uuid)) {
. for (kk in seq_along(fs)) {
. if (identical(fs[[kk]]$uuid, uuid))
. idx <- kk
. }
. }
. else {
. f <- attr(cond, "future")
. if (is.null(f))
. return()
. if (!isFALSE(f$seed))
. return()
. for (kk in seq_along(fs)) {
. if (identical(fs[[kk]], f))
. idx <- kk
. }
. }
. if (is.null(idx))
. return()
. f <- fs[[idx]]
. label <- f$label
. if (is.null(label))
. label <- "<none>"
. message <- sprintf("UNRELIABLE VALUE: One of the %s iterations (%s) unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set option 'future.rng.onMisuse' to \"ignore\".",
. sQuote(.packageName), sQuote(label))
. cond$message <- message
. if (inherits(cond, "warning")) {
. warning(cond)
. invokeRestart("muffleWarning")
. }
. else if (inherits(cond, "error")) {
. stop(cond)
. }
. })
6. (function() {
. oopts <- options(future.rng.onMisuse.keepFuture = FALSE)
. on.exit(options(oopts))
. value(fs)
. })()
7. value(fs)
8. value.list(fs)
9. resolve(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE)
10. resolve.list(y, result = TRUE, stdout = stdout, signal = signal,
. force = TRUE)
11. signalConditionsASAP(obj, resignal = FALSE, pos = ii)
12. signalConditions(obj, exclude = getOption("future.relay.immediate",
. "immediateCondition"), resignal = resignal, ...)
Are you able to reproduce either of these errors with the tutorial dataset? It is hard for me to resolve the problem if I cannot reproduce it.
Closing this issue due to inactivity.
I'm using the
seurat_obj
There are a totally of 41 modules, identified with this. I ran
RunModuleUMAP
function and then usedModuleUMAPPlot
function to show 2 hub gene names from each module.It shows the following message:
And when I checked:
This showed 4450 genes containing missing values (NA) in the
kME column of umap_df
, but still they are assigned to some Modules.