smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/

SetupForWGCNA: invalid number of intervals #6

Closed KoichiHashikawa closed 2 years ago

KoichiHashikawa commented 2 years ago

Hello Sam,

Thanks for developing this fantastic analysis package!

I wanted to try scWGCNA on our own data. Although all the code in scWGCNA basics ran successfully with the Zhou_2020 data, I got the following errors when I ran it on our own data.

Let us know your thoughts at your convenience. Thanks so much in advance!

1) SetupForWGCNA: when I set the gene selection method to "fraction", I received "Error in cut.default(1:nrow(expr_mat), n_chunks): invalid number of intervals". Interestingly, it ran when I set it to "variable".

2) SetDatExpr: even when I ran SetupForWGCNA using the "variable" setting, SetDatExpr gave an error:

[1] "n_genes:"
[1] 10000
[1] "Avp" "Slc5a7" "Ngfr" "Slc10a4" "Penk" "Vip"
[1] 534 5
[1] 0 5
[1] 0 5
[1] "cells:"
character(0)
[1] 0
Error in WGCNA::goodGenes(datExpr, ...): datExpr must contain numeric data.
Traceback:

I compared str(Zhou data Seurat object) with str(our Seurat object), but nothing has caught my eye so far.

KoichiHashikawa commented 2 years ago

Just in case, this is the str output of our Seurat object:

Formal class 'Seurat' [package "SeuratObject"] with 13 slots
  ..@ assays :List of 1
  .. ..$ RNA:Formal class 'Assay' [package "SeuratObject"] with 8 slots
  .. .. .. ..@ counts :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. ..@ i : int [1:4901113] 8 35 37 46 67 81 91 92 93 151 ...
  .. .. .. .. .. ..@ p : int [1:1770] 0 695 1262 2139 2951 3774 5215 5840 6908 7675 ...
  .. .. .. .. .. ..@ Dim : int [1:2] 18462 1769
  .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:18462] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
  .. .. .. .. .. .. ..$ : chr [1:1769] "AAATGCCTCTCTGAGAAM" "AACCGCGTCGCCATAAAM" "AACGTTGAGTGAAGTTAM" "AACTCCCCAGGGAGAGAM" ...
  .. .. .. .. .. ..@ x : num [1:4901113] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. .. .. ..@ factors : list()
  .. .. .. ..@ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. ..@ i : int [1:4901113] 8 35 37 46 67 81 91 92 93 151 ...
  .. .. .. .. .. ..@ p : int [1:1770] 0 695 1262 2139 2951 3774 5215 5840 6908 7675 ...
  .. .. .. .. .. ..@ Dim : int [1:2] 18462 1769
  .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:18462] "Xkr4" "Rp1" "Sox17" "Mrpl15" ...
  .. .. .. .. .. .. ..$ : chr [1:1769] "AAATGCCTCTCTGAGAAM" "AACCGCGTCGCCATAAAM" "AACGTTGAGTGAAGTTAM" "AACTCCCCAGGGAGAGAM" ...
  .. .. .. .. .. ..@ x : num [1:4901113] 2.33 2.33 2.33 2.33 2.33 ...
  .. .. .. .. .. ..@ factors : list()
  .. .. .. ..@ scale.data : num [1:2000, 1:1769] 3.4204 -0.1194 -0.2196 -0.032 -0.0238 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:2000] "Oprk1" "St18" "3110035E14Rik" "1700034P13Rik" ...
  .. .. .. .. .. ..$ : chr [1:1769] "AAATGCCTCTCTGAGAAM" "AACCGCGTCGCCATAAAM" "AACGTTGAGTGAAGTTAM" "AACTCCCCAGGGAGAGAM" ...
  .. .. .. ..@ key : chr "rna_"
  .. .. .. ..@ assay.orig : NULL
  .. .. .. ..@ var.features : chr [1:2000] "Sst" "Trh" "Plp1" "Avp" ...
  .. .. .. ..@ meta.features:'data.frame': 18462 obs. of 5 variables:
  .. .. .. .. ..$ vst.mean : num [1:18462] 0.056716 0 0.000299 0.165075 0.204776 ...
  .. .. .. .. ..$ vst.variance : num [1:18462] 0.060085 0 0.000299 0.170115 0.220819 ...
  .. .. .. .. ..$ vst.variance.expected : num [1:18462] 0.062604 0 0.000298 0.185673 0.232861 ...
  .. .. .. .. ..$ vst.variance.standardized: num [1:18462] 0.96 0 1.001 0.916 0.948 ...
  .. .. .. .. ..$ vst.variable : logi [1:18462] FALSE FALSE FALSE FALSE FALSE FALSE ...
  .. .. .. ..@ misc : list()
  ..@ meta.data :'data.frame': 1769 obs. of 8 variables:
  .. ..$ orig.ident : Factor w/ 1 level "10X_MPOA": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ nCount_RNA : num [1:1769] 1074 860 1305 1165 1329 ...
  .. ..$ nFeature_RNA : int [1:1769] 695 567 877 812 823 1441 625 1068 767 826 ...
  .. ..$ stim : chr [1:1769] "AM" "AM" "AM" "AM" ...
  .. ..$ percent.mito : num [1:1769] 0.0903 0.036 0.1234 0.1073 0.158 ...
  .. ..$ RNA_snn_res.0.8: Factor w/ 19 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ seurat_clusters: Factor w/ 19 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ celltype : Factor w/ 36 levels "Mix1","Vgat1",..: 2 29 12 2 2 2 12 5 29 2 ...
  ..@ active.assay: chr "RNA"
  ..@ active.ident: Factor w/ 20 levels "Vgat1","Vgat2",..: 1 17 6 1 1 1 6 3 17 1 ...
  .. ..- attr(*, "names")= chr [1:1769] "AAATGCCTCTCTGAGAAM" "AACCGCGTCGCCATAAAM" "AACGTTGAGTGAAGTTAM" "AACTCCCCAGGGAGAGAM" ...
  ..@ graphs : list()
  ..@ neighbors : list()
  ..@ reductions :List of 2
  .. ..$ pca :Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
  .. .. .. ..@ cell.embeddings : num [1:1769, 1:40] 14.8 12.9 15.8 15.3 14.9 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:1769] "AAATGCCTCTCTGAGAAM" "AACCGCGTCGCCATAAAM" "AACGTTGAGTGAAGTTAM" "AACTCCCCAGGGAGAGAM" ...
  .. .. .. .. .. ..$ : chr [1:40] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..@ feature.loadings : num [1:2000, 1:40] -0.002071 -0.000383 -0.00658 0.002928 -0.001845 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:2000] "Sst" "Trh" "Plp1" "Avp" ...
  .. .. .. .. .. ..$ : chr [1:40] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..@ feature.loadings.projected: num[0 , 0 ]
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ global : logi FALSE
  .. .. .. ..@ stdev : num [1:40] 9.11 4.93 4.09 3.8 3.28 ...
  .. .. .. ..@ key : chr "PC_"
  .. .. .. ..@ jackstraw :Formal class 'JackStrawData' [package "SeuratObject"] with 4 slots
  .. .. .. .. .. ..@ empirical.p.values : num[0 , 0 ]
  .. .. .. .. .. ..@ fake.reduction.scores : num[0 , 0 ]
  .. .. .. .. .. ..@ empirical.p.values.full: num[0 , 0 ]
  .. .. .. .. .. ..@ overall.p.values : num[0 , 0 ]
  .. .. .. ..@ misc :List of 1
  .. .. .. .. ..$ total.variance: num 1606
  .. ..$ umap:Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
  .. .. .. ..@ cell.embeddings : num [1:1769, 1:2] -13.8 -13 -13.9 -13.3 -12.3 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:1769] "AAATGCCTCTCTGAGAAM" "AACCGCGTCGCCATAAAM" "AACGTTGAGTGAAGTTAM" "AACTCCCCAGGGAGAGAM" ...
  .. .. .. .. .. ..$ : chr [1:2] "UMAP_1" "UMAP_2"
  .. .. .. ..@ feature.loadings : num[0 , 0 ]
  .. .. .. ..@ feature.loadings.projected: num[0 , 0 ]
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ global : logi TRUE
  .. .. .. ..@ stdev : num(0)
  .. .. .. ..@ key : chr "UMAP_"
  .. .. .. ..@ jackstraw :Formal class 'JackStrawData' [package "SeuratObject"] with 4 slots
  .. .. .. .. .. ..@ empirical.p.values : num[0 , 0 ]
  .. .. .. .. .. ..@ fake.reduction.scores : num[0 , 0 ]
  .. .. .. .. .. ..@ empirical.p.values.full: num[0 , 0 ]
  .. .. .. .. .. ..@ overall.p.values : num[0 , 0 ]
  .. .. .. ..@ misc : list()
  ..@ images : list()
  ..@ project.name: chr "10X_MPOA"
  ..@ misc : list()
  ..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
  .. ..$ : int [1:3] 4 0 4
  ..@ commands :List of 7
  .. ..$ NormalizeData.RNA :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "NormalizeData.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:25"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "NormalizeData(object = AM, verbose = FALSE)"
  .. .. .. ..@ params :List of 5
  .. .. .. .. ..$ assay : chr "RNA"
  .. .. .. .. ..$ normalization.method: chr "LogNormalize"
  .. .. .. .. ..$ scale.factor : num 10000
  .. .. .. .. ..$ margin : num 1
  .. .. .. .. ..$ verbose : logi FALSE
  .. ..$ FindVariableFeatures.RNA:Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "FindVariableFeatures.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:25"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "FindVariableFeatures(object = AM, selection.method = \"vst\", " " nfeatures = 2000, verbose = FALSE)"
  .. .. .. ..@ params :List of 12
  .. .. .. .. ..$ assay : chr "RNA"
  .. .. .. .. ..$ selection.method : chr "vst"
  .. .. .. .. ..$ loess.span : num 0.3
  .. .. .. .. ..$ clip.max : chr "auto"
  .. .. .. .. ..$ mean.function :function (mat, display_progress)
  .. .. .. .. ..$ dispersion.function:function (mat, display_progress)
  .. .. .. .. ..$ num.bin : num 20
  .. .. .. .. ..$ binning.method : chr "equal_width"
  .. .. .. .. ..$ nfeatures : num 2000
  .. .. .. .. ..$ mean.cutoff : num [1:2] 0.1 8
  .. .. .. .. ..$ dispersion.cutoff : num [1:2] 1 Inf
  .. .. .. .. ..$ verbose : logi FALSE
  .. ..$ ScaleData.RNA :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "ScaleData.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:26"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "ScaleData(object = AM, verbose = FALSE)"
  .. .. .. ..@ params :List of 10
  .. .. .. .. ..$ features : chr [1:2000] "Sst" "Trh" "Plp1" "Avp" ...
  .. .. .. .. ..$ assay : chr "RNA"
  .. .. .. .. ..$ model.use : chr "linear"
  .. .. .. .. ..$ use.umi : logi FALSE
  .. .. .. .. ..$ do.scale : logi TRUE
  .. .. .. .. ..$ do.center : logi TRUE
  .. .. .. .. ..$ scale.max : num 10
  .. .. .. .. ..$ block.size : num 1000
  .. .. .. .. ..$ min.cells.to.block: num 3000
  .. .. .. .. ..$ verbose : logi FALSE
  .. ..$ RunPCA.RNA :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "RunPCA.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:26"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "RunPCA(object = AM, npcs = 40, verbose = FALSE)"
  .. .. .. ..@ params :List of 10
  .. .. .. .. ..$ assay : chr "RNA"
  .. .. .. .. ..$ npcs : num 40
  .. .. .. .. ..$ rev.pca : logi FALSE
  .. .. .. .. ..$ weight.by.var : logi TRUE
  .. .. .. .. ..$ verbose : logi FALSE
  .. .. .. .. ..$ ndims.print : int [1:5] 1 2 3 4 5
  .. .. .. .. ..$ nfeatures.print: num 30
  .. .. .. .. ..$ reduction.name : chr "pca"
  .. .. .. .. ..$ reduction.key : chr "PC_"
  .. .. .. .. ..$ seed.use : num 42
  .. ..$ RunUMAP.RNA.pca :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "RunUMAP.RNA.pca"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:32"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "RunUMAP(object = AM, reduction = \"pca\", dims = 1:40)"
  .. .. .. ..@ params :List of 26
  .. .. .. .. ..$ dims : int [1:40] 1 2 3 4 5 6 7 8 9 10 ...
  .. .. .. .. ..$ reduction : chr "pca"
  .. .. .. .. ..$ assay : chr "RNA"
  .. .. .. .. ..$ slot : chr "data"
  .. .. .. .. ..$ umap.method : chr "uwot"
  .. .. .. .. ..$ return.model : logi FALSE
  .. .. .. .. ..$ n.neighbors : int 30
  .. .. .. .. ..$ n.components : int 2
  .. .. .. .. ..$ metric : chr "cosine"
  .. .. .. .. ..$ learning.rate : num 1
  .. .. .. .. ..$ min.dist : num 0.3
  .. .. .. .. ..$ spread : num 1
  .. .. .. .. ..$ set.op.mix.ratio : num 1
  .. .. .. .. ..$ local.connectivity : int 1
  .. .. .. .. ..$ repulsion.strength : num 1
  .. .. .. .. ..$ negative.sample.rate: int 5
  .. .. .. .. ..$ uwot.sgd : logi FALSE
  .. .. .. .. ..$ seed.use : int 42
  .. .. .. .. ..$ angular.rp.forest : logi FALSE
  .. .. .. .. ..$ densmap : logi FALSE
  .. .. .. .. ..$ dens.lambda : num 2
  .. .. .. .. ..$ dens.frac : num 0.3
  .. .. .. .. ..$ dens.var.shift : num 0.1
  .. .. .. .. ..$ verbose : logi TRUE
  .. .. .. .. ..$ reduction.name : chr "umap"
  .. .. .. .. ..$ reduction.key : chr "UMAP_"
  .. ..$ FindNeighbors.RNA.pca :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "FindNeighbors.RNA.pca"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:32"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "FindNeighbors(object = AM)"
  .. .. .. ..@ params :List of 17
  .. .. .. .. ..$ reduction : chr "pca"
  .. .. .. .. ..$ dims : int [1:10] 1 2 3 4 5 6 7 8 9 10
  .. .. .. .. ..$ assay : chr "RNA"
  .. .. .. .. ..$ k.param : num 20
  .. .. .. .. ..$ return.neighbor: logi FALSE
  .. .. .. .. ..$ compute.SNN : logi TRUE
  .. .. .. .. ..$ prune.SNN : num 0.0667
  .. .. .. .. ..$ nn.method : chr "annoy"
  .. .. .. .. ..$ n.trees : num 50
  .. .. .. .. ..$ annoy.metric : chr "euclidean"
  .. .. .. .. ..$ nn.eps : num 0
  .. .. .. .. ..$ verbose : logi TRUE
  .. .. .. .. ..$ force.recalc : logi FALSE
  .. .. .. .. ..$ do.plot : logi FALSE
  .. .. .. .. ..$ graph.name : chr [1:2] "RNA_nn" "RNA_snn"
  .. .. .. .. ..$ l2.norm : logi FALSE
  .. .. .. .. ..$ cache.index : logi FALSE
  .. ..$ FindClusters :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name : chr "FindClusters"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2022-04-06 07:01:32"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "FindClusters(AM, resolution = 0.8)"
  .. .. .. ..@ params :List of 10
  .. .. .. .. ..$ graph.name : chr "RNA_snn"
  .. .. .. .. ..$ modularity.fxn : num 1
  .. .. .. .. ..$ resolution : num 0.8
  .. .. .. .. ..$ method : chr "matrix"
  .. .. .. .. ..$ algorithm : num 1
  .. .. .. .. ..$ n.start : num 10
  .. .. .. .. ..$ n.iter : num 10
  .. .. .. .. ..$ random.seed : num 0
  .. .. .. .. ..$ group.singletons: logi TRUE
  .. .. .. .. ..$ verbose : logi TRUE
  ..@ tools : list()

smorabit commented 2 years ago

Hi Koichi,

I ran into a similar problem with SetupForWGCNA using gene_select='fraction' a couple of days ago, and I think I fixed the function (at least on my data). Could you please try re-installing scWGCNA and running that part again?

For your second issue, I think the problem is in how you are running SetDatExpr. Make sure that whatever you pass as group_name is one of the values in the group.by column of your Seurat meta data. For example, in the tutorial on the Zhou dataset we call SetDatExpr like this:

seurat_obj <- SetDatExpr(
  seurat_obj,
  group_name = "INH", 
  group.by='cell_type'
)

group.by should be the name of a meta data column in your Seurat object that you also passed to the group.by argument in the MetacellsByGroups function. group_name should be one of the groups in this column.
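
A quick way to check this before calling SetDatExpr is sketched below. It is only an illustration: the data frame stands in for seurat_obj@meta.data, the column values are made up, and check_group is a hypothetical helper, not an hdWGCNA function.

```r
# Stand-in for seurat_obj@meta.data (made-up values for illustration).
meta <- data.frame(
  cell_type = c("INH", "INH", "EX", "ASC"),
  row.names = paste0("cell", 1:4)
)

# Hypothetical helper: stops with a message if the column or any group is missing.
check_group <- function(meta, group.by, group_name) {
  if (!group.by %in% colnames(meta)) {
    stop("'", group.by, "' is not a column in the meta data")
  }
  missing <- setdiff(group_name, unique(as.character(meta[[group.by]])))
  if (length(missing) > 0) {
    stop("group_name not found in '", group.by, "': ",
         paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

# A group that exists in the column passes silently.
check_group(meta, "cell_type", "INH")

# A group that does not exist is reported by name.
msg <- tryCatch(
  check_group(meta, "cell_type", "Vgat1"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

When the group is missing, no cells are selected, which would be consistent with the "cells:" character(0) line in the output above.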

Hopefully this helps, let me know if this resolves your issue.

KoichiHashikawa commented 2 years ago

Thanks so much! After updating scWGCNA, it all worked out.

smorabit commented 2 years ago

I just added a small update to SetDatExpr that will now throw a more informative error for this sort of issue.

Tianqi-Ma commented 1 year ago

Hi, Sam

I got the same error today.

Error in cut.default(1:nrow(expr_mat), n_chunks) :
  invalid number of intervals
Calls: SetupForWGCNA -> SelectNetworkGenes -> cut -> cut.default
Execution halted

It's weird because I ran hdWGCNA on another dataset successfully. The only difference between these two datasets is that one was integrated with Seurat (failed) and the other was simply merged (succeeded). Both are made up of 6 individual datasets.
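
For reference, the message itself comes from base R's cut(): when breaks is a single number it is treated as an interval count, and a count below 2 is rejected. The sketch below reproduces that in isolation. The value 2000 is just an example row count, and how n_chunks is computed inside hdWGCNA is not shown in this thread, so this only illustrates the condition that triggers the error.

```r
# Example row indices; 2000 is an arbitrary illustrative size.
rows <- seq_len(2000)

# Two or more intervals: cut() returns a factor with that many levels.
chunks_ok <- cut(rows, 4)
print(nlevels(chunks_ok))

# A single interval: base R refuses and raises the error seen in the traceback.
msg <- tryCatch(
  {
    cut(rows, 1)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```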

Yakun-Pang commented 1 year ago

Hi Tianqi-Ma, maybe this is too late, but I ran into exactly the same issue and just fixed it. I figured out that Seurat's 'IntegrateData' function sets the default assay of the Seurat object to 'integrated', which only contains the feature genes (2000 genes in my case). So simply reset the default assay of your Seurat object:

DefaultAssay(object = your_seurat_obj) <- "RNA"
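
To confirm which assay is active and how many genes each assay holds, something like the following may help. It is a sketch on a toy object built with the SeuratObject package: the counts, gene names, and the reduced "integrated" assay are made up to mimic an integration result, and your own object would take the place of toy.

```r
library(SeuratObject)

# Toy count matrix: 20 genes x 6 cells (made-up data for illustration).
set.seed(1)
counts <- Matrix::Matrix(
  matrix(rpois(120, lambda = 5), nrow = 20,
         dimnames = list(paste0("Gene", 1:20), paste0("Cell", 1:6))),
  sparse = TRUE
)

toy <- CreateSeuratObject(counts = counts)

# Mimic an integration result: a second assay holding only a subset of genes.
toy[["integrated"]] <- CreateAssayObject(counts = counts[1:5, ])
DefaultAssay(toy) <- "integrated"

# Inspect the active assay and the number of genes in each assay.
print(DefaultAssay(toy))
genes_per_assay <- vapply(Assays(toy), function(a) nrow(toy[[a]]), numeric(1))
print(genes_per_assay)

# Switch back to the full assay before running the network setup.
DefaultAssay(toy) <- "RNA"
```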