Closed GettyScience closed 11 months ago
Hi,
Thank you for your interest in multiWGCNA. You're absolutely right to follow the vignettes to check the format of the input data.
For the error, it appears to be an issue with your ExperimentHub version. I have solved this in the past by updating to the latest version of Bioconductor and reinstalling ExperimentHub. You may need to update R/RStudio in order to update Bioconductor.
To answer your other question about the input data: it can be either a SummarizedExperiment object (the first assay in the object is used, so make sure this is your expression data) or a data.frame with genes as rows and samples as columns. Make sure the sample names match those in the sample table.
Thank you for getting back to me so quickly.
Will my data.frame include the genes as rows and all samples of interest (regardless of treatment, named accordingly) as columns? What will the Sample Table look like? Will it be samples as rows and sample type as columns?
I have checked all my versions and everything on my end is up to date, but the error is not resolved.
Interesting. What is your snapshot date for ExperimentHub when you do:
library(ExperimentHub)
eh = ExperimentHub()
I get a snapshot date of 2023-07-18, which is after when multiWGCNAdata was added to ExperimentHub on May 20th, 2023.
Right, the datExpr input will need to be a data.frame with genes as rows and samples as columns. The sampleTable will need to be a data.frame whose first column holds the samples (i.e. the column names of datExpr), followed by two other columns with your variables of interest (e.g. disease and region). The rows of this sampleTable do not need to be in any particular order.
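As a rough illustration of the layout described above, here is a minimal sketch using made-up data. The gene IDs, sample names, and the `disease`/`region` levels are all hypothetical. The first column is named `Sample` because a later reply in this thread notes that this exact name is required.

```r
# Toy expression data: genes as rows, samples as columns (values are random).
set.seed(1)
samples <- paste0("sample", 1:8)
datExpr <- as.data.frame(matrix(rnorm(5 * 8), nrow = 5,
                                dimnames = list(paste0("gene", 1:5), samples)))

# Sample table: first column holds the sample names, then two trait columns.
sampleTable <- data.frame(
  Sample  = samples,
  disease = rep(c("control", "case"), each = 4),
  region  = rep(c("A", "B"), times = 4),
  stringsAsFactors = FALSE
)

# Every sample in the table should be a column of datExpr, and vice versa.
names_match <- setequal(sampleTable$Sample, colnames(datExpr))
print(names_match)
```

Row order in `sampleTable` does not matter for this check, since `setequal()` ignores ordering.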
Hello, I have hit a new roadblock. In trying to use my data with the constructNetworks() function, I get the error:
Error in constructNetworks(MWGCNA_Data, SamplesTable, conditions1, conditions2, : inherits(datExpr, "SummarizedExperiment") | inherits(datExpr, .... is not TRUE
I have the data.frame set up as you described:
genes | sample1 | sample 2 | sample 3 | ... | sample 16
g1
g2
g3
g4
...
g27000
The sampleTable is set up as:
samples | genotype | gravity
---|---|---
sample 1 | pgm | vertical
sample 2 | pgm | vertical
sample 3 | pgm | vertical
... | ... | ...
sample 16 | col | treated
Would the error have anything to do with the fact the data.frames are imported as excel files?
Thank you,
Sorry, I found out about this bug only today. Please re-install multiWGCNA:
devtools::install_github("fogellab/multiWGCNA", force = TRUE)
That should fix it.
I do encourage you to go through the vignettes, so try to get those files from ExperimentHub if you can!
Are 2 options okay for each treatment? I have a new error:
Error in if (x == trait) { : the condition has length > 1
Hello, I am again checking on this error. Thank you,
Hi again!
Let's tackle one error at a time. I have a feeling solving the first one will help us with this new error. Can you try these lines and show me the message?
library(ExperimentHub)
eh = ExperimentHub()
While we're at it, can you also print your full sampleTable for me and paste it here?
Lastly, please show me your sessionInfo, with the ExperimentHub package loaded of course. Like this:
> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.4.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] multiWGCNA_0.99.2 ggalluvial_0.12.5 ggplot2_3.4.2 SummarizedExperiment_1.31.1 Biobase_2.61.0 GenomicRanges_1.53.1
[7] GenomeInfoDb_1.37.1 IRanges_2.35.1 S4Vectors_0.39.1 MatrixGenerics_1.13.0 matrixStats_0.63.0 multiWGCNAdata_0.99.1
[13] ExperimentHub_2.9.0 AnnotationHub_3.9.1 BiocFileCache_2.9.0 dbplyr_2.3.2 BiocGenerics_0.47.0
loaded via a namespace (and not attached):
[1] rstudioapi_0.14 magrittr_2.0.3 rmarkdown_2.21 zlibbioc_1.47.0 vctrs_0.6.2
[6] memoise_2.0.1 RCurl_1.98-1.12 base64enc_0.1-3 htmltools_0.5.5 S4Arrays_1.1.4
[11] dynamicTreeCut_1.63-1 curl_5.0.0 SparseArray_1.1.6 Formula_1.2-5 htmlwidgets_1.6.2
[16] plyr_1.8.8 impute_1.75.1 cachem_1.0.8 igraph_1.4.3 mime_0.12
[21] lifecycle_1.0.3 iterators_1.0.14 pkgconfig_2.0.3 Matrix_1.5-4.1 R6_2.5.1
[26] fastmap_1.1.1 GenomeInfoDbData_1.2.10 shiny_1.7.4 digest_0.6.31 colorspace_2.1-0
[31] patchwork_1.1.2 AnnotationDbi_1.63.1 Hmisc_5.1-0 RSQLite_2.3.1 vegan_2.6-4
[36] filelock_1.0.2 fansi_1.0.4 httr_1.4.6 mgcv_1.8-42 compiler_4.3.0
[41] rngtools_1.5.2 bit64_4.0.5 withr_2.5.0 doParallel_1.0.17 htmlTable_2.4.1
[46] backports_1.4.1 DBI_1.1.3 MASS_7.3-60 rappdirs_0.3.3 DelayedArray_0.27.3
[51] permute_0.9-7 flashClust_1.01-2 tools_4.3.0 foreign_0.8-84 interactiveDisplayBase_1.39.0
[56] httpuv_1.6.11 nnet_7.3-19 glue_1.6.2 nlme_3.1-162 promises_1.2.0.1
[61] grid_4.3.0 checkmate_2.2.0 cluster_2.1.4 reshape2_1.4.4 generics_0.1.3
[66] gtable_0.3.3 tzdb_0.4.0 preprocessCore_1.63.1 hms_1.1.3 data.table_1.14.8
[71] WGCNA_1.72-1 utf8_1.2.3 XVector_0.41.1 ggrepel_0.9.3 BiocVersion_3.18.0
[76] foreach_1.5.2 pillar_1.9.0 stringr_1.5.0 later_1.3.1 splines_4.3.0
[81] dplyr_1.1.2 lattice_0.21-8 survival_3.5-5 bit_4.0.5 tidyselect_1.2.0
[86] GO.db_3.17.0 Biostrings_2.69.1 knitr_1.42 gridExtra_2.3 xfun_0.39
[91] stringi_1.7.12 yaml_2.3.7 evaluate_0.21 codetools_0.2-19 tibble_3.2.1
[96] BiocManager_1.30.20 cli_3.6.1 rpart_4.1.19 xtable_1.8-4 munsell_0.5.0
[101] Rcpp_1.0.10 png_0.1-8 fastcluster_1.2.3 parallel_4.3.0 ellipsis_0.3.2
[106] readr_2.1.4 blob_1.2.4 dcanr_1.17.0 doRNG_1.8.6 bitops_1.0-7
[111] scales_1.2.1 purrr_1.0.1 crayon_1.5.2 rlang_1.1.1 cowplot_1.1.1
[116] KEGGREST_1.41.0
Thanks!
COLTRT_1 | col | treatment
---|---|---
COLTRT_2 | col | treatment
COLTRT_3 | col | treatment
COLTRT_4 | col | treatment
COLVRT_1 | col | vertical
COLVRT_2 | col | vertical
COLVRT_3 | col | vertical
COLVRT_4 | col | vertical
PGMTRT_1 | pgm | treatment
PGMTRT_2 | pgm | treatment
PGMTRT_3 | pgm | treatment
PGMTRT_4 | pgm | treatment
PGMVRT_1 | pgm | vertical
PGMVRT_2 | pgm | vertical
PGMVRT_3 | pgm | vertical
PGMVRT_4 | pgm | vertical
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.4.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: US/Pacific
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ExperimentHub_2.8.1 AnnotationHub_3.8.0 BiocFileCache_2.8.0 dbplyr_2.3.3
[5] BiocGenerics_0.46.0 BiocManager_1.30.21.1 multiWGCNA_0.99.2 ggalluvial_0.12.5
[9] ggplot2_3.4.2
loaded via a namespace (and not attached):
[1] rstudioapi_0.15.0 magrittr_2.0.3 rmarkdown_2.23
[4] fs_1.6.3 zlibbioc_1.46.0 vctrs_0.6.3
[7] memoise_2.0.1 RCurl_1.98-1.12 base64enc_0.1-3
[10] htmltools_0.5.5 S4Arrays_1.0.4 usethis_2.2.2
[13] curl_5.0.1 dynamicTreeCut_1.63-1 Formula_1.2-5
[16] htmlwidgets_1.6.2 impute_1.74.1 cachem_1.0.8
[19] igraph_1.5.0 mime_0.12 lifecycle_1.0.3
[22] iterators_1.0.14 pkgconfig_2.0.3 Matrix_1.6-0
[25] R6_2.5.1 fastmap_1.1.1 GenomeInfoDbData_1.2.10
[28] MatrixGenerics_1.12.2 shiny_1.7.4.1 digest_0.6.33
[31] colorspace_2.1-0 patchwork_1.1.2 AnnotationDbi_1.62.2
[34] S4Vectors_0.38.1 ps_1.7.5 pkgload_1.3.2.1
[37] Hmisc_5.1-0 GenomicRanges_1.52.0 RSQLite_2.3.1
[40] filelock_1.0.2 fansi_1.0.4 httr_1.4.6
[43] compiler_4.3.1 rngtools_1.5.2 remotes_2.4.2.1
[46] bit64_4.0.5 withr_2.5.0 doParallel_1.0.17
[49] htmlTable_2.4.1 backports_1.4.1 DBI_1.1.3
[52] pkgbuild_1.4.2 rappdirs_0.3.3 DelayedArray_0.26.6
[55] sessioninfo_1.2.2 flashClust_1.01-2 tools_4.3.1
[58] foreign_0.8-84 interactiveDisplayBase_1.38.0 httpuv_1.6.11
[61] nnet_7.3-19 glue_1.6.2 callr_3.7.3
[64] promises_1.2.0.1 grid_4.3.1 checkmate_2.2.0
[67] cluster_2.1.4 generics_0.1.3 gtable_0.3.3
[70] tzdb_0.4.0 preprocessCore_1.62.1 data.table_1.14.8
[73] hms_1.1.3 WGCNA_1.72-1 utf8_1.2.3
[76] XVector_0.40.0 BiocVersion_3.17.1 ggrepel_0.9.3
[79] foreach_1.5.2 pillar_1.9.0 stringr_1.5.0
[82] later_1.3.1 splines_4.3.1 dplyr_1.1.2
[85] lattice_0.21-8 survival_3.5-5 bit_4.0.5
[88] tidyselect_1.2.0 GO.db_3.17.0 Biostrings_2.68.1
[91] miniUI_0.1.1.1 knitr_1.43 gridExtra_2.3
[94] IRanges_2.34.1 SummarizedExperiment_1.30.2 stats4_4.3.1
[97] xfun_0.39 Biobase_2.60.0 devtools_2.4.5
[100] matrixStats_1.0.0 stringi_1.7.12 yaml_2.3.7
[103] evaluate_0.21 codetools_0.2-19 tibble_3.2.1
[106] cli_3.6.1 rpart_4.1.19 xtable_1.8-4
[109] munsell_0.5.0 processx_3.8.2 Rcpp_1.0.11
[112] GenomeInfoDb_1.36.1 png_0.1-8 fastcluster_1.2.3
[115] parallel_4.3.1 ellipsis_0.3.2 readr_2.1.4
[118] blob_1.2.4 prettyunits_1.1.1 dcanr_1.16.0
[121] doRNG_1.8.6 profvis_0.3.8 urlchecker_1.0.1
[124] bitops_1.0-7 scales_1.2.1 purrr_1.0.1
[127] crayon_1.5.2 rlang_1.1.1 cowplot_1.1.1
[130] KEGGREST_1.40.0
Okay, this should be easy. Your sessionInfo shows ExperimentHub 2.8.1; you need a newer version (>= 2.9.0):
BiocManager::install("ExperimentHub", force = TRUE)
Then retry accessing the ExperimentHub, and you should get a snapshot dated today, i.e.:
> library(ExperimentHub)
> eh = ExperimentHub()
snapshotDate(): 2023-07-21
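Before reinstalling, it can help to confirm which version is actually installed. The following is a minimal sketch, not part of ExperimentHub or multiWGCNA: `is_outdated` is a hypothetical helper name, and treating 2.9.0 as sufficient is an assumption based on the version shown in the maintainer's sessionInfo earlier in this thread.

```r
# Hypothetical helper: TRUE when the installed version is older than the required one.
is_outdated <- function(installed, required) {
  package_version(installed) < package_version(required)
}

# Versions taken from the two sessionInfo() outputs in this thread.
user_outdated <- is_outdated("2.8.1", "2.9.0")
maintainer_outdated <- is_outdated("2.9.0", "2.9.0")
print(c(user = user_outdated, maintainer = maintainer_outdated))

# On a real installation, pass the installed version instead (skipped if not installed).
if (requireNamespace("ExperimentHub", quietly = TRUE)) {
  print(is_outdated(as.character(utils::packageVersion("ExperimentHub")), "2.9.0"))
}
```

Comparing with `package_version()` rather than plain strings avoids mistakes such as "2.10.0" sorting before "2.9.0".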
Still unsuccessful
I'm guessing that's still ExperimentHub version 2.8.1?
Let's try installing the development version of Bioconductor:
BiocManager::install(version = "3.18")
If that works, you can re-try installing ExperimentHub, and it should be the newest version:
BiocManager::install("ExperimentHub", force = TRUE)
Unsuccessful. Could the issue have something to do with using one of the new M2 Macs rather than an Intel processor?
Which part was unsuccessful? Can I see the error message?
There is no error message, it just returns the snapshot shown in the previous photo each time.
Running on PositCloud, it is now successful. It must be an issue with my version (somehow) or processor.
Huh, very weird. Happy it's working now! Let me know if you get that error with the sampleTable again.
I have not tried my own data as of yet, just going through the vignette. I will try my own soon. Your help is much appreciated, thank you so much.
Running my data through results in this error
Error in constructNetworks(multiWGCNAdata, sampleTable, conditions1, conditions2, : inherits(datExpr, "SummarizedExperiment") | inherits(datExpr, .... is not TRUE
What class is your datExpr object?
class(datExpr)
It should be either a SummarizedExperiment object or a data.frame object. Anything else will hit that stopifnot clause.
You might also want to check the version of multiWGCNA since this was a bug that I fixed earlier this week if you recall.
[1] "tbl_df" "tbl" "data.frame"
I forced a download from devtools. Is there a version I could specifically feed it?
This is the full error script
3. stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
2. stopifnot(inherits(datExpr, "SummarizedExperiment") | inherits(datExpr, "SummarizedExperiment"))
1. constructNetworks(multiWGCNAdata, SampleTable, conditions1, conditions2, networkType = "signed", power = 18, minModuleSize = 40, maxBlockSize = 25000, reassignThreshold = 0, minKMEtoStay = 0.7, mergeCutHeight = 0.1, numericLabels = TRUE, pamRespectsDendro = FALSE, verbose = 3)
Yep, looks like it's the old buggy version for some reason. It should be multiWGCNA version 0.99.2. You already tried the force = TRUE from devtools? I think that should update it to the newest development version.
For now, the workaround can be to make your datExpr a SummarizedExperiment object. Something like this:
se = SummarizedExperiment(assays=list(counts=as.matrix(datExpr)))
That helped. I know I keep coming to you for help and it has been very appreciated.
New error:
Check that your SummarizedExperiment object has colnames and rownames. Looks like it's complaining that it cannot find rownames for it.
For example, I made a dummy example:
> temp
sample1 sample2 sample3 sample4 sample5
op1 1 1 1 1 1
op2 1 1 1 1 1
op3 1 1 1 1 1
op4 1 1 1 1 1
> se = SummarizedExperiment(assays=list(counts = temp))
> se
class: SummarizedExperiment
dim: 4 5
metadata(0):
assays(1): counts
rownames(4): op1 op2 op3 op4
rowData names(0):
colnames(5): sample1 sample2 sample3 sample4 sample5
colData names(0):
Yours should also have rownames and colnames.
It was my error: the datExpr did not have the row names column set as actual row names.
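For anyone hitting the same problem, this is one way to turn a gene ID column into real row names using base R only. It is a minimal sketch with toy values; the column name `genes` is an assumption, so substitute whatever your imported file calls it.

```r
# Toy table as it might arrive from a spreadsheet: gene IDs sit in an ordinary column.
raw <- data.frame(
  genes   = c("g1", "g2", "g3"),
  sample1 = c(5.1, 3.2, 8.8),
  sample2 = c(4.9, 3.0, 9.1),
  stringsAsFactors = FALSE
)

# as.data.frame() also drops tibble classes if the table was imported as a tibble.
datExpr <- as.data.frame(raw)

# Row names must be unique, so fail early on duplicated gene IDs.
stopifnot(!anyDuplicated(datExpr$genes))

# Promote the ID column to row names, then remove it so only numeric columns remain.
rownames(datExpr) <- datExpr$genes
datExpr$genes <- NULL

str(datExpr)
```

After this step, `as.matrix(datExpr)` yields a numeric matrix with gene row names, which is what the SummarizedExperiment workaround above expects.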
Going further I hit the error:
Error: subscript contains out-of-bounds indices
Just to be sure, did the vignettes work for you? Because if that's the case then it's just a matter of putting all the data in the same format as the vignettes.
Yes! And thanks to you, I was able to troubleshoot it and get the program to run. I did hit a problem, however:
softConnectivity: FYI: connecitivty of genes with less than 6 valid samples will be returned as NA.
..calculating connectivities....100%
Error in datExpr[, !colnames(datExpr) %in% c("X", "kTotal", "kWithin", : incorrect number of dimensions
Our data are derived from only 4 samples per genotype/tissue/treatment. I could potentially add other plant tissue data to increase the number of samples for this analysis of genotype/treatment, but I would prefer to avoid that noise at the moment.
Is there a way to change the requirement of 6 valid samples to 3 or 4?
Thank you,
The 6 sample remark is just a warning.
Please email me your datExpr and sampleTable in .csv format to dtommasini0@gmail.com and I'll take a look. This error has already been reported, but the thread has fallen silent and I don't know if it was resolved.
I have reproduced the error as well. Taking a look right now.
The issue was that your first column in the sampleTable is "sample" and not "Sample". It ran fine after:
colnames(sampleTable)[1] = "Sample"
I actually forgot that this was required, but I've updated the documentation to reflect this. Future versions might not be so picky with formatting, but for now try this easy fix.
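Until the formatting requirements are relaxed, a small pre-flight check can catch this kind of mismatch before calling constructNetworks(). The sketch below is an illustration only: `check_sample_table` is a hypothetical helper, not a multiWGCNA function, and the toy data reuse two sample names from the table posted earlier in this thread.

```r
# Hypothetical pre-flight check; not part of multiWGCNA.
check_sample_table <- function(sampleTable, datExpr) {
  problems <- character(0)
  # The first column must be named exactly "Sample" (case-sensitive).
  if (colnames(sampleTable)[1] != "Sample") {
    problems <- c(problems, "first column must be named 'Sample'")
  }
  # Every listed sample must exist as a column of the expression data.
  missing <- setdiff(as.character(sampleTable[[1]]), colnames(datExpr))
  if (length(missing) > 0) {
    problems <- c(problems, paste("samples not found in datExpr:",
                                  paste(missing, collapse = ", ")))
  }
  problems
}

# Toy inputs with the lowercase "sample" column name that caused the error.
datExpr <- data.frame(COLTRT_1 = c(1, 2), PGMVRT_1 = c(3, 4),
                      row.names = c("g1", "g2"))
sampleTable <- data.frame(sample   = c("COLTRT_1", "PGMVRT_1"),
                          genotype = c("col", "pgm"),
                          gravity  = c("treatment", "vertical"))

before <- check_sample_table(sampleTable, datExpr)
colnames(sampleTable)[1] <- "Sample"
after <- check_sample_table(sampleTable, datExpr)
print(before)
print(after)
```

The function returns a character vector of problems, so an empty result means both checks passed.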
Also, please do work through both vignettes, as the astrocyte vignette has some analyses not covered by the autism workflow.
Hello,
I am trying to run through your vignette on autism brain samples and am unable to load the autism_se data. I get an error at
"autism_se = eh_query[["EH8219"]]".
The error message is:
"Error: EH8219 added after current Hub snapshot date. added: 2023-05-15 snapshot date: 2023-04-24"
How do you recommend getting around this issue? I would like to know how the data is supposed to be set up within the data.frame so that I may add my data.
We have expression data from plants in two genomes and under vertical or gravity stimulation.
Thank you,