smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/
Other
316 stars 31 forks source link

Error running SCTransform #55

Closed ChristinaSteyn closed 1 year ago

ChristinaSteyn commented 1 year ago

Hi Sam, I have run into a different error trying to run the SCTransform pipeline (Option 1). The error I am getting is: Error: 'SCT' is not an assay Execution halted

It is coming up at the SetDatExpr() step and the error persists even when I change the default assay, specify a different assay, or don't specify any assay. I will happily provide more information and can send you the code I ran too but it is very similar to what is in the tutorial other than that I ran SCT on each sample separately as follows but I don't think this should make a difference:

split_seurat <- SplitObject(filtered_seurat, split.by = "sample")

split_seurat <- split_seurat[c("A", "B", "C", "D")]

for (i in 1:length(split_seurat)) { split_seurat[[i]] <- NormalizeData(split_seurat[[i]], verbose = TRUE) split_seurat[[i]] <- CellCycleScoring(split_seurat[[i]], g2m.features=g2m_genes, s.features=s_genes) split_seurat[[i]] <- SCTransform(split_seurat[[i]], vars.to.regress = c("mitoRatio"), variable.features.n = 3000) }

Do you perhaps have any advice?

smorabit commented 1 year ago

Hi,

The error message is indicating that you don't have an assay called SCT in your Seurat object. The error is coming from the Seurat function GetAssayData. For example, you'll get the same message if you try this:

GetAssayData(seurat_obj, assay = 'Definitely not a real assay')
Error: 'Definitely not a real assay' is not an assay

You can see which assays are in your Seurat object by running print(seurat_obj)

ChristinaSteyn commented 1 year ago

Hi there,

I have now tried running option 2 in the SCT tutorial:

When I look at the object I get this output: An object of class Seurat 65261 features across 164637 samples within 3 assays Active assay: SCT (30224 features, 0 variable features) 2 other assays present: RNA, integrated 2 dimensional reductions calculated: pca, umap

So as far as I can tell the SCT assay is there and when I run GetAssayData(seu, assay = 'SCT') it gives me a matrix. However, at the SetDatExpr step _[seurat_obj <- SetDatExpr(seurat_obj, assay = 'SCT', slot='data', groupname = 'Inh', group.by='clusters')], I am now getting the following error "Error in WGCNA::goodGenes(datExpr, ...) : unused argument (assay = "SCT") Calls: SetDatExpr Execution halted " I'm not sure if this is because of running SCT on multiple different samples individually and then merging. I will try running it on one sample and let you know if the problem persists but if you have any advice that would be great!

smorabit commented 1 year ago

I tried to reproduce your error on the tutorial data but it seems to run just fine, even when running SCTransform separately on each sample. Here's the code that I ran.


# starting from the Zhou et al tutorial dataset
seurat_test <- subset(seurat_obj, cell_type == 'ASC')

# splitting into one seurat object per sample
split_seurat <- SplitObject(seurat_test, split.by = "Sample")

# just take 4 samples so it runs faster
split_seurat <- split_seurat[1:4]

# loop through the list of seurat objects to run SCTransform
for (i in 1:length(split_seurat)) {
split_seurat[[i]] <- NormalizeData(split_seurat[[i]], verbose = TRUE)
#split_seurat[[i]] <- CellCycleScoring(split_seurat[[i]], g2m.features=g2m_genes, s.features=s_genes)
split_seurat[[i]] <- SCTransform(split_seurat[[i]], vars.to.regress = c("nCount_RNA"), variable.features.n = 3000)
}

# merge back into one seurat object
seurat_merged <- merge(split_seurat[[1]], split_seurat[2:4])

# note: I am not sure what the best way to find variable features is when you run SCT like this so I am just taking them from the RNA assay
seurat_merged <- FindVariableFeatures(seurat_merged, assay='RNA')
VariableFeatures(seurat_merged) <- VariableFeatures(seurat_merged, assay='RNA')
seurat_merged <- RunPCA(seurat_merged)

seurat_merged <- SetupForWGCNA(
  seurat_merged,
  features = rownames(seurat_merged), # using the genes present in the SCT assay
  wgcna_name = "SCT"
)

seurat_merged <- MetacellsByGroups(
  seurat_obj = seurat_merged,
  group.by = c("Sample"),
  k = 25,
  max_shared=12,
  min_cells = 50,
  reduction = 'pca',
  ident.group = 'Sample',
  slot = 'counts',
  assay = 'SCT'
)

# set expression matrix for hdWGCNA
seurat_merged <- SetDatExpr(seurat_merged)

# test different soft power thresholds
seurat_merged <- TestSoftPowers(seurat_merged)
plot_list <- PlotSoftPowers(seurat_merged)

seurat_merged <- ConstructNetwork(
  seurat_merged,
  tom_name = "SCT_issue",
  overwrite_tom = TRUE
)

# compute module eigengenes and connectivity
seurat_merged <- ModuleEigengenes(seurat_merged)
seurat_merged <- ModuleConnectivity(seurat_merged)

Wondering if this is specific to your dataset or can you reproduce the issue on the tutorial data?

ChristinaSteyn commented 1 year ago

Thanks for trying this out for me. I have tried to run the exact same code you have put above and I'm encountering a few errors. The first is at the MetacellsByGroups() function where I get the following error: Error in MetacellsByGroups(seurat_obj = seurat_merged, group.by = c("Sample"), : unused arguments (max_shared = 12, min_cells = 50). If I comment out the max_shared and min_cells the code runs but I then get an error at the SetDatExpr() function which says Error in if (!(group.by %in% colnames(s_obj@meta.data))) { : argument is of length zero. To get round this I specified group.by='Sample' and then got a different error in the same function: Error in get(group.by) %in% group_name : argument "group_name" is missing, with no default . I then tried saying group_name = 'ASC' which resulted in the same error that I get when trying to run this on my data: Error: 'SCT' is not an assay.

So it seems this error is not specific to my data as I am reproducing it with the tutorial data. Really not sure what the issue is. I checked the packages and I have loaded all the libraries in the tutorial.

smorabit commented 1 year ago

Could you please run sessionInfo() and show me the output?

ChristinaSteyn commented 1 year ago

Hi Sam, this is the output of sessionInfo():

sessionInfo() R version 4.2.0 (2022-04-22) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)

Matrix products: default BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale: [1] LC_CTYPE=en_ZA.utf-8 LC_NUMERIC=C [3] LC_TIME=en_ZA.utf-8 LC_COLLATE=en_ZA.utf-8 [5] LC_MONETARY=en_ZA.utf-8 LC_MESSAGES=en_ZA.utf-8 [7] LC_PAPER=en_ZA.utf-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_ZA.utf-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] hdWGCNA_0.1.1.9001 WGCNA_1.71 fastcluster_1.2.3 [4] dynamicTreeCut_1.63-1 patchwork_1.1.1 cowplot_1.1.1 [7] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.9 [10] purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 [13] tibble_3.1.7 ggplot2_3.3.6 tidyverse_1.3.1 [16] sp_1.4-7 SeuratObject_4.1.0 Seurat_4.1.1

loaded via a namespace (and not attached): [1] readxl_1.4.0 backports_1.4.1 Hmisc_4.7-0 [4] plyr_1.8.7 igraph_1.3.1 lazyeval_0.2.2 [7] splines_4.2.0 listenv_0.8.0 scattermore_0.8 [10] GenomeInfoDb_1.32.2 digest_0.6.29 foreach_1.5.2 [13] htmltools_0.5.3 GO.db_3.15.0 fansi_1.0.3 [16] checkmate_2.1.0 memoise_2.0.1 magrittr_2.0.3 [19] tensor_1.5 cluster_2.1.3 doParallel_1.0.17 [22] ROCR_1.0-11 tzdb_0.3.0 Biostrings_2.64.0 [25] globals_0.15.0 modelr_0.1.8 matrixStats_0.62.0 [28] spatstat.sparse_2.1-1 jpeg_0.1-9 colorspace_2.0-3 [31] blob_1.2.3 rvest_1.0.2 ggrepel_0.9.1 [34] xfun_0.31 haven_2.5.0 RCurl_1.98-1.6 [37] crayon_1.5.1 jsonlite_1.8.0 progressr_0.10.0 [40] spatstat.data_2.2-0 impute_1.70.0 survival_3.3-1 [43] zoo_1.8-10 iterators_1.0.14 glue_1.6.2 [46] polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.42.0 [49] XVector_0.36.0 leiden_0.4.2 future.apply_1.9.0 [52] BiocGenerics_0.42.0 abind_1.4-5 scales_1.2.0 [55] DBI_1.1.2 spatstat.random_2.2-0 miniUI_0.1.1.1 [58] Rcpp_1.0.8.3 htmlTable_2.4.0 viridisLite_0.4.0 [61] xtable_1.8-4 reticulate_1.25 spatstat.core_2.4-2 [64] foreign_0.8-82 bit_4.0.4 preprocessCore_1.58.0 [67] Formula_1.2-4 stats4_4.2.0 htmlwidgets_1.5.4 [70] httr_1.4.3 FNN_1.1.3 RColorBrewer_1.1-3 [73] ellipsis_0.3.2 ica_1.0-2 pkgconfig_2.0.3 [76] nnet_7.3-17 uwot_0.1.11 dbplyr_2.1.1 [79] deldir_1.0-6 utf8_1.2.2 tidyselect_1.1.2 [82] rlang_1.0.4 reshape2_1.4.4 later_1.3.0 [85] AnnotationDbi_1.58.0 cachem_1.0.6 munsell_0.5.0 [88] cellranger_1.1.0 tools_4.2.0 cli_3.3.0 [91] generics_0.1.2 RSQLite_2.2.14 broom_0.8.0 [94] ggridges_0.5.3 fastmap_1.1.0 goftest_1.2-3 [97] knitr_1.39 bit64_4.0.5 fs_1.5.2 [100] fitdistrplus_1.1-8 RANN_2.6.1 KEGGREST_1.36.0 [103] pbapply_1.5-0 future_1.25.0 nlme_3.1-157 [106] mime_0.12 xml2_1.3.3 compiler_4.2.0 [109] rstudioapi_0.13 plotly_4.10.0 png_0.1-7 [112] spatstat.utils_2.3-1 reprex_2.0.1 stringi_1.7.8 [115] rgeos_0.5-9 lattice_0.20-45 Matrix_1.4-1 [118] vctrs_0.4.1 pillar_1.7.0 lifecycle_1.0.1 [121] spatstat.geom_2.4-0 lmtest_0.9-40 RcppAnnoy_0.0.19 [124] bitops_1.0-7 data.table_1.14.2 irlba_2.3.5 [127] httpuv_1.6.5 latticeExtra_0.6-29 R6_2.5.1 [130] promises_1.2.0.1 KernSmooth_2.23-20 gridExtra_2.3 [133] IRanges_2.30.0 parallelly_1.31.1 codetools_0.2-18 [136] MASS_7.3-57 assertthat_0.2.1 withr_2.5.0 [139] sctransform_0.3.3 GenomeInfoDbData_1.2.8 S4Vectors_0.34.0 [142] mgcv_1.8-40 parallel_4.2.0 hms_1.1.1 [145] grid_4.2.0 rpart_4.1.16 Rtsne_0.16 [148] base64enc_0.1-3 Biobase_2.56.0 shiny_1.7.1 [151] lubridate_1.8.0

smorabit commented 1 year ago

Could you try updating hdWGCNA to the newest version and trying to run your code again? There have been many updates since the version that you were running (0.1.1.9001).

ChristinaSteyn commented 1 year ago

Oh my word, I can't believe I didn't realise that was the problem - thank you so much!! This has finally worked. One final question, for now, do you perhaps have any advice regarding the difference between using Option 1 or Option 2 from the SCT tutorial https://smorabit.github.io/hdWGCNA/articles/sctransform.html? Thanks again for all your help.

ChristinaSteyn commented 1 year ago

Sorry also wanted to ask what you think about subsetting a particular cluster before running SCTransform or only doing this at the SetDatExpr() function?

smorabit commented 1 year ago

Glad that your issue was solved! Unfortunately I don't have advice for whether option 1 or 2 would be better for your dataset, or whether or not you should subset. Worth trying multiple options and comparing the results.