Closed drropadope closed 10 months ago
Hi, thanks for your question. There had been an issue with metadata and `AggregateExpression` in the initial Seurat 5.0.0 release, which was fixed in Seurat 5.0.1. I would recommend upgrading Seurat and seeing if the issue is resolved now. I tried to reproduce this issue with the `ifnb` data using Seurat 5.0.1, and I get the expected behavior, e.g.:
> pseudo_ifnb <- AggregateExpression(ifnb, assays = "RNA", return.seurat = T, group.by = c("stim","seurat_annotations"))
> head(pseudo_ifnb)
                   orig.ident         stim  seurat_annotations
CTRL_CD14 Mono     CTRL_CD14 Mono     CTRL  CD14 Mono
CTRL_CD4 Naive T   CTRL_CD4 Naive T   CTRL  CD4 Naive T
CTRL_CD4 Memory T  CTRL_CD4 Memory T  CTRL  CD4 Memory T
CTRL_CD16 Mono     CTRL_CD16 Mono     CTRL  CD16 Mono
CTRL_B             CTRL_B             CTRL  B
CTRL_CD8 T         CTRL_CD8 T         CTRL  CD8 T
CTRL_T activated   CTRL_T activated   CTRL  T activated
CTRL_NK            CTRL_NK            CTRL  NK
CTRL_DC            CTRL_DC            CTRL  DC
CTRL_B Activated   CTRL_B Activated   CTRL  B Activated
I will close this issue for now, but please feel free to re-open if upgrading Seurat does not address this problem!
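For readers unsure what the "expected behavior" above amounts to, here is a minimal base-R sketch of pseudobulking by two grouping variables. It does not call Seurat and is not Seurat's implementation; the toy `counts` matrix, the `meta` table, and the `annot` column name are made up for illustration. The point is that each aggregated sample keeps one metadata column per `group.by` variable in addition to the combined label.

```r
# Toy counts: 3 genes x 6 cells (values are made up for illustration).
counts <- matrix(
  c(1, 0, 2,
    3, 1, 0,
    0, 4, 1,
    2, 2, 2,
    5, 0, 1,
    1, 1, 1),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6))
)

# Per-cell metadata, analogous to the `stim` and `seurat_annotations` columns.
meta <- data.frame(
  stim  = c("CTRL", "CTRL", "CTRL", "STIM", "STIM", "STIM"),
  annot = c("B", "NK", "B", "B", "NK", "NK"),
  row.names = colnames(counts)
)

# One label per cell, built from every grouping variable.
group <- paste(meta$stim, meta$annot, sep = "_")

# Sum counts over the cells in each group; rowsum() groups rows, so transpose.
pseudobulk <- t(rowsum(t(counts), group))

# Rebuild one metadata row per aggregated sample, keeping each grouping
# variable as its own column next to the combined label.
pb_meta <- unique(data.frame(orig.ident = group, meta))
rownames(pb_meta) <- pb_meta$orig.ident
pb_meta <- pb_meta[colnames(pseudobulk), ]
```

Here `pb_meta` has three columns (`orig.ident`, `stim`, `annot`), which mirrors the three-column table in the maintainer's output.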
Upgrading did fix this issue. Thanks igrabski!
I still have the same issue:
packageVersion('Seurat')
[1] ‘5.1.0’
pseudo_sc.a2 <- AggregateExpression(sc.a2, assays = "RNA", return.seurat = T, group.by = c("celltype4","ttg2","replicate"))
Warning message: In PseudobulkExpression(object = object, pb.method = "aggregate", : Exponentiation yielded infinite values. `data` may not be log-normed.
head(pseudo_sc.a2)
orig.ident nCount_RNA nFeature_RNA
Mesophyll_Mock_Col0_1 Mesophyll Inf 20859
Mesophyll_Mock_Col0_2 Mesophyll Inf 22513
Mesophyll_Mock_Col0_3 Mesophyll Inf 22200
Mesophyll_A2_3hpi_Col0_1 Mesophyll Inf 19651
Mesophyll_A2_3hpi_Col0_2 Mesophyll Inf 23995
Mesophyll_A2_5hpi_Col0_1 Mesophyll Inf 21676
Mesophyll_A2_5hpi_Col0_2 Mesophyll Inf 24309
Vascular_Mock_Col0_1 Vascular Inf 18589
Vascular_Mock_Col0_2 Vascular 8.31063e+264 18484
Vascular_Mock_Col0_3 Vascular Inf 20089
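The warning text hints at one possible reading of the `Inf` totals: if a step exponentiates the `data` values to undo a log transform, and those values are actually raw counts, the result overflows. That is an inference from the message, not something confirmed in this thread. The base-R sketch below only illustrates the arithmetic, with made-up values.

```r
# Made-up values standing in for one gene across four cells.
raw_counts <- c(0, 3, 250, 12000)

# Log-normalized-style values are small, so undoing log1p stays finite.
log_normed <- log1p(raw_counts)
recovered <- expm1(log_normed)
all(is.finite(recovered))

# Raw counts passed through the same step overflow once a value exceeds
# roughly 709, the largest exponent a double can represent.
back <- expm1(raw_counts)
is.infinite(back)

# A single infinite entry makes the per-sample total infinite as well.
sum(back)
```

Under that reading, a mostly-`Inf` `nCount_RNA` column with an occasional very large finite value would be what one sees when the largest entries sit on either side of the overflow threshold.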
Hi, I'm unable to replicate this issue -- could you please show the full code you used to create and process this object, including what `head(sc.a2)` looks like prior to running `AggregateExpression`, and can you also show the output of `sessionInfo()`?
Hello, thank you for your reply. I hope this information helps.
> Hi, I'm unable to replicate this issue -- could you please show the full code you used to create and process this object, including what `head(sc.a2)` looks like prior to running `AggregateExpression`, and can you also show the output of `sessionInfo()`?
#sc.a2 is a subset from an integrated sc object
head(sc.a2)
Assays(sc.a2)
pseudo_sc.a2 <- AggregateExpression(sc.a2, assays = "RNA", return.seurat = T, group.by = c("celltype4","ttg2","replicate"))
head(pseudo_sc.a2)
sessionInfo()
> head(sc.a2)
                                orig.ident nCount_RNA nFeature_RNA treatment time replicate genetype
Mock_3hpi_1_AAACCCAAGTGCTCAT-1 Mock_3hpi_1       3291         1456      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACCCAAGTTGGACG-1 Mock_3hpi_1       3147         1333      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACCCACACCGTGAC-1 Mock_3hpi_1       3762         1628      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACCCAGTCCCTAAA-1 Mock_3hpi_1       5569         1986      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACCCATCCGGACGT-1 Mock_3hpi_1       3634         1480      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACCCATCGTTCAGA-1 Mock_3hpi_1       3086         1443      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACCCATCTAGCCAA-1 Mock_3hpi_1       4444         1776      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACGAAAGAAGTGTT-1 Mock_3hpi_1       5661         2094      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACGAAAGGAAGTGA-1 Mock_3hpi_1       5266         2053      Mock 3hpi         1     Col0
Mock_3hpi_1_AAACGAACAACACAGG-1 Mock_3hpi_1       3143         1463      Mock 3hpi         1     Col0
                               percent.mt percent.chl percent.rRNA nCount_SCT nFeature_SCT
Mock_3hpi_1_AAACCCAAGTGCTCAT-1  0.7292616    3.524765    1.3065937       2967         1455
Mock_3hpi_1_AAACCCAAGTTGGACG-1  0.8261837    4.766444    1.8112488       2913         1333
Mock_3hpi_1_AAACCCACACCGTGAC-1  0.5316321    3.216374    1.1164274       3095         1618
Mock_3hpi_1_AAACCCAGTCCCTAAA-1  0.2334351   14.185671    0.5027833       3291         1730
Mock_3hpi_1_AAACCCATCCGGACGT-1  0.8805724   10.897083    1.5410017       3067         1474
Mock_3hpi_1_AAACCCATCGTTCAGA-1  0.2916397    3.985742    1.1665587       2897         1443
Mock_3hpi_1_AAACCCATCTAGCCAA-1  0.3825383    3.600360    0.9000900       3200         1721
Mock_3hpi_1_AAACGAAAGAAGTGTT-1  0.6006006    9.644939    1.0422187       3344         1819
Mock_3hpi_1_AAACGAAAGGAAGTGA-1  0.3608052    2.392708    0.7026206       3285         1864
Mock_3hpi_1_AAACGAACAACACAGG-1  0.2545339    3.372574    0.8908686       2911         1463
                               log10GenesPerUMI treat_time treat_time2 genotype      ttg2
Mock_3hpi_1_AAACCCAAGTGCTCAT-1        0.8993081  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACCCAAGTTGGACG-1        0.8933455  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACCCACACCGTGAC-1        0.8982596  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACCCAGTCCCTAAA-1        0.8804526  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACCCATCCGGACGT-1        0.8904267  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACCCATCGTTCAGA-1        0.9053906  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACCCATCTAGCCAA-1        0.8908016  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACGAAAGAAGTGTT-1        0.8849111  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACGAAAGGAAGTGA-1        0.8900728  Mock_3hpi        Mock     Col0 Mock_Col0
Mock_3hpi_1_AAACGAACAACACAGG-1        0.9050422  Mock_3hpi        Mock     Col0 Mock_Col0
                               integrated_snn_res.0.8 seurat_clusters            ttg       celltype1
Mock_3hpi_1_AAACCCAAGTGCTCAT-1                     25              25 Mock_3hpi_Col0 Companion cells
Mock_3hpi_1_AAACCCAAGTTGGACG-1                      1               1 Mock_3hpi_Col0     Mesophyll_2
Mock_3hpi_1_AAACCCACACCGTGAC-1                     14              14 Mock_3hpi_Col0    Mesophyll_14
Mock_3hpi_1_AAACCCAGTCCCTAAA-1                      6               6 Mock_3hpi_Col0     Mesophyll_7
Mock_3hpi_1_AAACCCATCCGGACGT-1                     12              12 Mock_3hpi_Col0    Mesophyll_13
Mock_3hpi_1_AAACCCATCGTTCAGA-1                     14              14 Mock_3hpi_Col0    Mesophyll_14
Mock_3hpi_1_AAACCCATCTAGCCAA-1                      3               3 Mock_3hpi_Col0     Mesophyll_4
Mock_3hpi_1_AAACGAAAGAAGTGTT-1                     11              11 Mock_3hpi_Col0    Mesophyll_12
Mock_3hpi_1_AAACGAAAGGAAGTGA-1                      7               7 Mock_3hpi_Col0     Mesophyll_8
Mock_3hpi_1_AAACGAACAACACAGG-1                      9               9 Mock_3hpi_Col0    Mesophyll_10
                                     celltype2       celltype3 celltype4            celltype1.ttg2
Mock_3hpi_1_AAACCCAAGTGCTCAT-1 Companion cells Companion cells  Vascular Companion cells_Mock_Col0
Mock_3hpi_1_AAACCCAAGTTGGACG-1       Mesophyll       Mesophyll Mesophyll     Mesophyll_2_Mock_Col0
Mock_3hpi_1_AAACCCACACCGTGAC-1       Mesophyll       Mesophyll Mesophyll    Mesophyll_14_Mock_Col0
Mock_3hpi_1_AAACCCAGTCCCTAAA-1       Mesophyll       Mesophyll Mesophyll     Mesophyll_7_Mock_Col0
Mock_3hpi_1_AAACCCATCCGGACGT-1       Mesophyll       Mesophyll Mesophyll    Mesophyll_13_Mock_Col0
Mock_3hpi_1_AAACCCATCGTTCAGA-1       Mesophyll       Mesophyll Mesophyll    Mesophyll_14_Mock_Col0
Mock_3hpi_1_AAACCCATCTAGCCAA-1       Mesophyll       Mesophyll Mesophyll     Mesophyll_4_Mock_Col0
Mock_3hpi_1_AAACGAAAGAAGTGTT-1       Mesophyll       Mesophyll Mesophyll    Mesophyll_12_Mock_Col0
Mock_3hpi_1_AAACGAAAGGAAGTGA-1       Mesophyll       Mesophyll Mesophyll     Mesophyll_8_Mock_Col0
Mock_3hpi_1_AAACGAACAACACAGG-1       Mesophyll       Mesophyll Mesophyll    Mesophyll_10_Mock_Col0
                                    celltype1a      celltype4.ttg2    ttg2_rep    celltype4.ttg2.rep
Mock_3hpi_1_AAACCCAAGTGCTCAT-1 Companion-cells  Vascular_Mock_Col0 Mock_Col0_1  Vascular_Mock_Col0_1
Mock_3hpi_1_AAACCCAAGTTGGACG-1     Mesophyll-2 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACCCACACCGTGAC-1    Mesophyll-14 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACCCAGTCCCTAAA-1     Mesophyll-7 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACCCATCCGGACGT-1    Mesophyll-13 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACCCATCGTTCAGA-1    Mesophyll-14 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACCCATCTAGCCAA-1     Mesophyll-4 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACGAAAGAAGTGTT-1    Mesophyll-12 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACGAAAGGAAGTGA-1     Mesophyll-8 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
Mock_3hpi_1_AAACGAACAACACAGG-1    Mesophyll-10 Mesophyll_Mock_Col0 Mock_Col0_1 Mesophyll_Mock_Col0_1
> pseudo_sc.a2 <- AggregateExpression(sc.a2, assays = "RNA", return.seurat = T, group.by = c("celltype4","ttg2","replicate"))
Warning message: In PseudobulkExpression(object = object, pb.method = "aggregate", : Exponentiation yielded infinite values. `data` may not be log-normed.
> head(pseudo_sc.a2)
orig.ident nCount_RNA nFeature_RNA
Mesophyll_Mock_Col0_1 Mesophyll Inf 20859
Mesophyll_Mock_Col0_2 Mesophyll Inf 22513
Mesophyll_Mock_Col0_3 Mesophyll Inf 22200
Mesophyll_A2_3hpi_Col0_1 Mesophyll Inf 19651
Mesophyll_A2_3hpi_Col0_2 Mesophyll Inf 23995
Mesophyll_A2_5hpi_Col0_1 Mesophyll Inf 21676
Mesophyll_A2_5hpi_Col0_2 Mesophyll Inf 24309
Vascular_Mock_Col0_1 Vascular Inf 18589
Vascular_Mock_Col0_2 Vascular 8.31063e+264 18484
Vascular_Mock_Col0_3 Vascular Inf 20089
> sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] magick_2.8.4 readxl_1.4.3 clusterProfiler_4.12.2
[4] UpSetR_1.4.0 Matrix.utils_0.9.7 data.table_1.15.4
[7] RColorBrewer_1.1-3 DEGreport_1.40.1 DESeq2_1.44.0
[10] apeglm_1.26.1 png_0.1-8 pheatmap_1.0.12
[13] reshape2_1.4.4 edgeR_4.2.1 limma_3.60.4
[16] colorspace_2.1-1 harmony_1.2.0 Rcpp_1.0.13
[19] org.At.tair.db_3.19.1 AnnotationDbi_1.66.0 viridis_0.6.5
[22] viridisLite_0.4.2 scater_1.32.1 scuttle_1.14.0
[25] dittoSeq_1.16.0 scCustomize_2.1.2 cowplot_1.1.3
[28] scales_1.3.0 Matrix_1.7-0 monocle3_1.3.7
[31] SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0 GenomicRanges_1.56.1
[34] GenomeInfoDb_1.40.1 IRanges_2.38.1 S4Vectors_0.42.1
[37] MatrixGenerics_1.16.0 matrixStats_1.3.0 Biobase_2.64.0
[40] BiocGenerics_0.50.0 lubridate_1.9.3 forcats_1.0.0
[43] stringr_1.5.1 purrr_1.0.2 readr_2.1.5
[46] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
[49] tidyverse_2.0.0 patchwork_1.2.0 SeuratWrappers_0.2.0
[52] R.utils_2.12.3 R.oo_1.26.0 R.methodsS3_1.8.2
[55] dplyr_1.1.4 SeuratObject_5.0.2 Seurat_5.1.0
[58] remotes_2.5.0
loaded via a namespace (and not attached):
[1] vroom_1.6.5 urlchecker_1.0.1 goftest_1.2-3
[4] Biostrings_2.72.1 vctrs_0.6.5 spatstat.random_3.3-1
[7] digest_0.6.36 shape_1.4.6.1 ggrepel_0.9.5
[10] deldir_2.0-4 parallelly_1.38.0 MASS_7.3-60.2
[13] reshape_0.8.9 qvalue_2.36.0 httpuv_1.6.15
[16] foreach_1.5.2 withr_3.0.0 ggrastr_1.0.2
[19] ggfun_0.1.5 psych_2.4.6.26 xfun_0.46
[22] ellipsis_0.3.2 survival_3.5-8 memoise_2.0.1
[25] ggbeeswarm_0.7.2 gson_0.1.0 janitor_2.2.0
[28] profvis_0.3.8 systemfonts_1.1.0 tidytree_0.4.6
[31] ragg_1.3.2 zoo_1.8-12 GlobalOptions_0.1.2
[34] pbapply_1.7-2 logging_0.10-108 rematch2_2.1.2
[37] KEGGREST_1.44.1 promises_1.3.0 httr_1.4.7
[40] globals_0.16.3 fitdistrplus_1.2-1 ps_1.7.7
[43] rstudioapi_0.16.0 UCSC.utils_1.0.0 miniUI_0.1.1.1
[46] generics_0.1.3 DOSE_3.30.2 processx_3.8.4
[49] curl_5.2.1 zlibbioc_1.50.0 ggraph_2.2.1
[52] ScaledMatrix_1.12.0 polyclip_1.10-7 GenomeInfoDbData_1.2.12
[55] SparseArray_1.4.8 xtable_1.8-4 desc_1.4.3
[58] doParallel_1.0.17 evaluate_0.24.0 S4Arrays_1.4.1
[61] hms_1.1.3 irlba_2.3.5.1 ROCR_1.0-11
[64] reticulate_1.38.0 spatstat.data_3.1-2 magrittr_2.0.3
[67] lmtest_0.9-40 snakecase_0.11.1 ggtree_3.12.0
[70] later_1.3.2 lattice_0.22-6 spatstat.geom_3.3-2
[73] future.apply_1.11.2 shadowtext_0.1.4 scattermore_1.2
[76] RcppAnnoy_0.0.22 pillar_1.9.0 nlme_3.1-164
[79] iterators_1.0.14 compiler_4.4.0 beachmat_2.20.0
[82] stringi_1.8.4 tensor_1.5 minqa_1.2.7
[85] devtools_2.4.5 plyr_1.8.9 crayon_1.5.3
[88] abind_1.4-5 ggdendro_0.2.0 gridGraphics_0.5-1
[91] emdbook_1.3.13 locfit_1.5-9.10 sp_2.1-4
[94] graphlayouts_1.1.1 bit_4.0.5 fastmatch_1.1-4
[97] codetools_0.2-20 textshaping_0.4.0 BiocSingular_1.20.0
[100] paletteer_1.6.0 GetoptLong_1.0.5 plotly_4.10.4
[103] mime_0.12 splines_4.4.0 circlize_0.4.16
[106] sparseMatrixStats_1.16.0 HDO.db_0.99.1 cellranger_1.1.0
[109] grr_0.9.5 knitr_1.48 blob_1.2.4
[112] utf8_1.2.4 clue_0.3-65 lme4_1.1-35.5
[115] fs_1.6.4 listenv_0.9.1 DelayedMatrixStats_1.26.0
[118] pkgbuild_1.4.4 ggplotify_0.1.2 callr_3.7.6
[121] statmod_1.5.0 tzdb_0.4.0 tweenr_2.0.3
[124] pkgconfig_2.0.3 tools_4.4.0 cachem_1.1.0
[127] RSQLite_2.3.7 DBI_1.2.3 numDeriv_2016.8-1.1
[130] fastmap_1.2.0 grid_4.4.0 usethis_3.0.0
[133] ica_1.0-3 broom_1.0.6 coda_0.19-4.1
[136] ggprism_1.0.5 BiocManager_1.30.23 dotCall64_1.1-1
[139] RANN_2.6.1 farver_2.1.2 scatterpie_0.2.3
[142] tidygraph_1.3.1 mgcv_1.9-1 cli_3.6.3
[145] leiden_0.4.3.1 lifecycle_1.0.4 uwot_0.2.2
[148] mvtnorm_1.2-5 sessioninfo_1.2.2 backports_1.5.0
[151] BiocParallel_1.38.0 timechange_0.3.0 gtable_0.3.5
[154] rjson_0.2.21 ggridges_0.5.6 progressr_0.14.0
[157] ape_5.8 parallel_4.4.0 jsonlite_1.8.8
[160] bit64_4.0.5 Rtsne_0.17 yulab.utils_0.1.5
[163] spatstat.utils_3.0-5 BiocNeighbors_1.22.0 bdsmatrix_1.3-7
[166] highr_0.11 GOSemSim_2.30.0 spatstat.univar_3.0-0
[169] lazyeval_0.2.2 shiny_1.9.0 ConsensusClusterPlus_1.68.0
[172] enrichplot_1.24.2 htmltools_0.5.8.1 GO.db_3.19.1
[175] sctransform_0.4.1 glue_1.7.0 spam_2.10-0
[178] XVector_0.44.0 treeio_1.28.0 mnormt_2.1.1
[181] gridExtra_2.3 boot_1.3-30 igraph_2.0.3
[184] R6_2.5.1 labeling_0.4.3 cluster_2.1.6
[187] bbmle_1.0.25.1 pkgload_1.4.0 aplot_0.2.3
[190] nloptr_2.1.1 DelayedArray_0.30.1 tidyselect_1.2.1
[193] vipor_0.4.7 ggforce_0.4.2 future_1.34.0
[196] rsvd_1.0.5 munsell_0.5.1 KernSmooth_2.23-22
[199] fgsea_1.30.0 htmlwidgets_1.6.4 ComplexHeatmap_2.20.0
[202] rlang_1.1.4 spatstat.sparse_3.1-0 spatstat.explore_3.3-1
[205] fansi_1.0.6 beeswarm_0.4.0
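Since the warning says `data` may not be log-normed, a quick sanity check on that matrix before aggregating could help narrow things down. The helper below is hypothetical (it is not part of Seurat) and is only a rough heuristic with an arbitrary `max_expected` threshold: raw counts are whole numbers and often large, while log1p-scale values are neither.

```r
# Hypothetical helper, not a Seurat function: heuristic check on a numeric matrix.
looks_log_normalized <- function(m, max_expected = 20) {
  vals <- m[m != 0]
  if (length(vals) == 0) return(TRUE)  # nothing to judge in an all-zero matrix
  all_integer <- all(vals == round(vals))
  # Whole-number or very large entries suggest raw counts rather than log values.
  !all_integer && max(vals) <= max_expected
}

# Made-up matrices for illustration.
raw <- matrix(c(0, 3, 250, 12000), nrow = 2)
normed <- log1p(raw / 10)  # stand-in for a log-normalized layer

looks_log_normalized(raw)
looks_log_normalized(normed)
```

With SeuratObject 5, something like `LayerData(sc.a2, assay = "RNA", layer = "data")` should return the matrix to pass in, though that call has not been run against the object in this thread.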
I am trying to use `AggregateExpression` to do DE analysis on pseudobulked data, but I am having a lot of difficulty getting the output I think I should be getting. The metadata columns I pass to the `group.by` argument clearly exist in the object (see the tables generated in the code below), and `AggregateExpression` also seems to find that metadata, because the group names are what I would expect. For some reason, though, it isn't grouping the counts the way it should: it creates only one column in the metadata matrix instead of the three I think it should create.
In the code below, `cb` is a Seurat object. It contains two biological replicates (stored in `orig.ident`) run as separate libraries. For each biological replicate, there are also 5 conditions that were multiplexed using hashtag labelling. The `cb` object has been integrated using Harmony and the 5 conditions have been demultiplexed using `HTODemux`, so the condition labels are stored in `HTO_classification` in the metadata.
I'm sure I'm missing something, but I can't figure out what could be wrong with my input. Any help would be greatly appreciated!
As a quick addendum, I just tried running this command using the Seurat vignette with the `ifnb` data, and I get the same issue. So it doesn't seem to be specific to my object.
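Because the problem also reproduces on the vignette data, the installed Seurat version is a reasonable first thing to check. A minimal sketch of that comparison follows; the `installed` value is a stand-in so the snippet runs without Seurat, and in a real session it would come from `packageVersion("Seurat")`.

```r
# Stand-in for packageVersion("Seurat"); replace with the real call in a session.
installed <- package_version("5.0.0")
# Release named in this thread as containing the metadata fix.
fixed_in <- package_version("5.0.1")

needs_upgrade <- installed < fixed_in
if (needs_upgrade) {
  message("Seurat ", installed, " predates the fix; upgrade to ", fixed_in, " or later.")
}
```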