SingleR-inc / SingleR

Clone of the Bioconductor repository for the SingleR package.
https://bioconductor.org/packages/devel/bioc/html/SingleR.html
GNU General Public License v3.0

SingleR() metadata is mangled when using multiple references #122

Closed PeteHaitch closed 4 years ago

PeteHaitch commented 4 years ago

There's a change of behaviour between BioC 3.10 and 3.11, but neither seems correct.

Desired behaviour

  1. Retain the metadata when using multiple references
  2. Retain the names so users can do things like metadata(pred_list)$de.genes[["MI.Basophils"]] (#121 is related, I think).

BioC 3.10

suppressPackageStartupMessages(library(SingleR))
test <- HumanPrimaryCellAtlasData()
#> snapshotDate(): 2019-10-22
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
ref <- list(BE = BlueprintEncodeData(), MI = MonacoImmuneData())
#> snapshotDate(): 2019-10-22
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> snapshotDate(): 2019-10-22
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache

labels_main <- split(
  paste0(
    rep(names(ref), sapply(ref, ncol)),
    ".",
    unlist(lapply(ref, function(x) x$label.main), use.names = FALSE)),
  factor(rep(names(ref), sapply(ref, ncol)), names(ref)))

# Single ref
pred1 <- SingleR(
  test = test,
  ref = ref[[1]],
  labels = labels_main[[1]])
# metadata is present
str(metadata(pred1), 1)
#> List of 2
#>  $ common.genes: chr [1:3323] "FCGR3B" "CXCR1" "CXCR2" "MME" ...
#>  $ de.genes    :List of 24
# And de.genes metadata has useful names
str(metadata(pred1)$de.genes, 1)
#> List of 24
#>  $ BE.Neutrophils      :List of 24
#>  $ BE.Monocytes        :List of 24
#>  $ BE.HSC              :List of 24
#>  $ BE.CD4+ T-cells     :List of 24
#>  $ BE.CD8+ T-cells     :List of 24
#>  $ BE.NK cells         :List of 24
#>  $ BE.B-cells          :List of 24
#>  $ BE.Macrophages      :List of 24
#>  $ BE.Erythrocytes     :List of 24
#>  $ BE.Endothelial cells:List of 24
#>  $ BE.DC               :List of 24
#>  $ BE.Eosinophils      :List of 24
#>  $ BE.Chondrocytes     :List of 24
#>  $ BE.Fibroblasts      :List of 24
#>  $ BE.Smooth muscle    :List of 24
#>  $ BE.Epithelial cells :List of 24
#>  $ BE.Melanocytes      :List of 24
#>  $ BE.Skeletal muscle  :List of 24
#>  $ BE.Keratinocytes    :List of 24
#>  $ BE.Myocytes         :List of 24
#>  $ BE.Adipocytes       :List of 24
#>  $ BE.Neurons          :List of 24
#>  $ BE.Pericytes        :List of 24
#>  $ BE.Mesangial cells  :List of 24

# Multiple refs
pred_list <- SingleR(
  test = test,
  ref = ref,
  labels = labels_main)
# metadata is present
str(metadata(pred_list), 1)
#> List of 2
#>  $ common.genes: chr [1:3949] "FCGR3B" "CXCR1" "CXCR2" "MME" ...
#>  $ de.genes    :List of 34
# but de.genes metadata doesn't have useful names
str(metadata(pred_list)$de.genes, 1)
#> List of 34
#>  $ c("BE.Neutrophils", "BE.Monocytes", "BE.HSC", "BE.CD4+ T-cells", "BE.CD8+ T-cells", "BE.NK cells", "BE.B-cells", "BE.Macrophages", "BE.Erythrocytes", "BE.Endothelial cells", "BE.DC", "BE.Eosinophils", "BE.Chondrocytes", "BE.Fibroblasts", "BE.Smooth muscle", "BE.Epithelial cells", "BE.Melanocytes", "BE.Skeletal muscle", "BE.Keratinocytes", "BE.Myocytes", "BE.Adipocytes", "BE.Neurons", "BE.Pericytes", "BE.Mesangial cells"):List of 24
#>  $ c("MI.CD8+ T cells", "MI.T cells", "MI.CD4+ T cells", "MI.Progenitors", "MI.B cells", "MI.Monocytes", "MI.NK cells", "MI.Dendritic cells", "MI.Neutrophils", "MI.Basophils")                                                                                                                                                                                                                                                            :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 24
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
#>  $ NA                                                                                                                                                                                                                                                                                                                                                                                                                                      :List of 10
# Can add them back
names(metadata(pred_list)$de.genes) <- unique(unlist(labels_main))
str(metadata(pred_list)$de.genes, 1)
#> List of 34
#>  $ BE.Neutrophils      :List of 24
#>  $ BE.Monocytes        :List of 24
#>  $ BE.HSC              :List of 24
#>  $ BE.CD4+ T-cells     :List of 24
#>  $ BE.CD8+ T-cells     :List of 24
#>  $ BE.NK cells         :List of 24
#>  $ BE.B-cells          :List of 24
#>  $ BE.Macrophages      :List of 24
#>  $ BE.Erythrocytes     :List of 24
#>  $ BE.Endothelial cells:List of 24
#>  $ BE.DC               :List of 24
#>  $ BE.Eosinophils      :List of 24
#>  $ BE.Chondrocytes     :List of 24
#>  $ BE.Fibroblasts      :List of 24
#>  $ BE.Smooth muscle    :List of 24
#>  $ BE.Epithelial cells :List of 24
#>  $ BE.Melanocytes      :List of 24
#>  $ BE.Skeletal muscle  :List of 24
#>  $ BE.Keratinocytes    :List of 24
#>  $ BE.Myocytes         :List of 24
#>  $ BE.Adipocytes       :List of 24
#>  $ BE.Neurons          :List of 24
#>  $ BE.Pericytes        :List of 24
#>  $ BE.Mesangial cells  :List of 24
#>  $ MI.CD8+ T cells     :List of 10
#>  $ MI.T cells          :List of 10
#>  $ MI.CD4+ T cells     :List of 10
#>  $ MI.Progenitors      :List of 10
#>  $ MI.B cells          :List of 10
#>  $ MI.Monocytes        :List of 10
#>  $ MI.NK cells         :List of 10
#>  $ MI.Dendritic cells  :List of 10
#>  $ MI.Neutrophils      :List of 10
#>  $ MI.Basophils        :List of 10

Created on 2020-05-15 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.2 (2019-12-12)
#> os CentOS Linux 7 (Core)
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Australia/Melbourne
#> date 2020-05-15
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> AnnotationDbi 1.48.0 2019-10-29 [1] Bioconductor
#> AnnotationHub 2.18.0 2019-10-29 [1] Bioconductor
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
#> backports 1.1.7 2020-05-13 [1] CRAN (R 3.6.2)
#> Biobase * 2.46.0 2019-10-29 [1] Bioconductor
#> BiocFileCache 1.10.2 2019-11-08 [1] Bioconductor
#> BiocGenerics * 0.32.0 2019-10-29 [1] Bioconductor
#> BiocManager 1.30.10 2019-11-16 [1] CRAN (R 3.6.1)
#> BiocNeighbors 1.4.2 2020-02-29 [1] Bioconductor
#> BiocParallel * 1.20.1 2019-12-21 [1] Bioconductor
#> BiocVersion 3.10.1 2019-06-06 [1] Bioconductor
#> bit 1.1-15.2 2020-02-10 [1] CRAN (R 3.6.2)
#> bit64 0.9-7 2017-05-08 [1] CRAN (R 3.6.1)
#> bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.1)
#> blob 1.2.1 2020-01-20 [1] CRAN (R 3.6.2)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 3.6.2)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 3.6.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
#> curl 4.3 2019-12-02 [1] CRAN (R 3.6.1)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.2)
#> dbplyr 1.4.3 2020-04-19 [1] CRAN (R 3.6.2)
#> DelayedArray * 0.12.3 2020-04-09 [1] Bioconductor
#> DelayedMatrixStats 1.8.0 2019-10-29 [1] Bioconductor
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
#> devtools 2.3.0 2020-04-10 [1] CRAN (R 3.6.2)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 3.6.2)
#> dplyr 0.8.5 2020-03-07 [1] CRAN (R 3.6.2)
#> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.1)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#> ExperimentHub 1.12.0 2019-10-29 [1] Bioconductor
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.2)
#> fastmap 1.0.1 2019-10-08 [1] CRAN (R 3.6.1)
#> fs 1.4.1 2020-04-04 [1] CRAN (R 3.6.2)
#> GenomeInfoDb * 1.22.1 2020-03-27 [1] Bioconductor
#> GenomeInfoDbData 1.2.2 2020-01-01 [1] Bioconductor
#> GenomicRanges * 1.38.0 2019-10-29 [1] Bioconductor
#> glue 1.4.1 2020-05-13 [1] CRAN (R 3.6.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.1)
#> httpuv 1.5.2 2019-09-11 [1] CRAN (R 3.6.1)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.1)
#> interactiveDisplayBase 1.24.0 2019-10-29 [1] Bioconductor
#> IRanges * 2.20.2 2020-01-13 [1] Bioconductor
#> knitr 1.28 2020-02-06 [1] CRAN (R 3.6.2)
#> later 1.0.0 2019-10-04 [1] CRAN (R 3.6.1)
#> lattice 0.20-41 2020-04-02 [1] CRAN (R 3.6.2)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 3.6.2)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
#> Matrix 1.2-18 2019-11-27 [2] CRAN (R 3.6.2)
#> matrixStats * 0.56.0 2020-03-13 [1] CRAN (R 3.6.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
#> mime 0.9 2020-02-04 [1] CRAN (R 3.6.2)
#> pillar 1.4.4 2020-05-05 [1] CRAN (R 3.6.2)
#> pkgbuild 1.0.8 2020-05-07 [1] CRAN (R 3.6.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.2)
#> processx 3.4.2 2020-02-09 [1] CRAN (R 3.6.2)
#> promises 1.1.0 2019-10-04 [1] CRAN (R 3.6.1)
#> ps 1.3.3 2020-05-08 [1] CRAN (R 3.6.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 3.6.2)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
#> rappdirs 0.3.1 2016-03-28 [1] CRAN (R 3.6.1)
#> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 3.6.2)
#> RCurl 1.98-1.2 2020-04-18 [1] CRAN (R 3.6.2)
#> remotes 2.1.1 2020-02-15 [1] CRAN (R 3.6.2)
#> rlang 0.4.6 2020-05-02 [1] CRAN (R 3.6.2)
#> rmarkdown 2.1 2020-01-20 [1] CRAN (R 3.6.2)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
#> RSQLite 2.2.0 2020-01-07 [1] CRAN (R 3.6.2)
#> S4Vectors * 0.24.4 2020-04-09 [1] Bioconductor
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#> shiny 1.4.0.2 2020-03-13 [1] CRAN (R 3.6.2)
#> SingleR * 1.0.6 2020-04-08 [1] Bioconductor
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 3.6.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#> SummarizedExperiment * 1.16.1 2019-12-19 [1] Bioconductor
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 3.6.2)
#> tibble 3.0.1 2020-04-20 [1] CRAN (R 3.6.2)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 3.6.2)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 3.6.2)
#> vctrs 0.3.0 2020-05-11 [1] CRAN (R 3.6.2)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 3.6.2)
#> xfun 0.13 2020-04-13 [1] CRAN (R 3.6.2)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.1)
#> XVector 0.26.0 2019-10-29 [1] Bioconductor
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.2)
#> zlibbioc 1.32.0 2019-10-29 [1] Bioconductor
#>
#> [1] /stornext/Home/data/allstaff/h/hickey/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /stornext/System/data/apps/R/openBLAS/R-3.6.2/lib64/R/library
```

BioC 3.11

suppressPackageStartupMessages(library(SingleR))
test <- HumanPrimaryCellAtlasData()
#> snapshotDate(): 2020-04-27
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
ref <- list(BE = BlueprintEncodeData(), MI = MonacoImmuneData())
#> snapshotDate(): 2020-04-27
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> snapshotDate(): 2020-04-27
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache
#> see ?SingleR and browseVignettes('SingleR') for documentation
#> loading from cache

labels_main <- split(
  paste0(
    rep(names(ref), sapply(ref, ncol)),
    ".",
    unlist(lapply(ref, function(x) x$label.main), use.names = FALSE)),
  factor(rep(names(ref), sapply(ref, ncol)), names(ref)))

# Single ref
pred1 <- SingleR(
  test = test,
  ref = ref[[1]],
  labels = labels_main[[1]])
# metadata is present
str(metadata(pred1), 1)
#> List of 2
#>  $ common.genes: chr [1:3347] "FABP4" "ADH1B" "GPD1" "CD36" ...
#>  $ de.genes    :List of 25
# And de.genes metadata has useful names
str(metadata(pred1)$de.genes, 1)
#> List of 25
#>  $ BE.Adipocytes       :List of 25
#>  $ BE.Astrocytes       :List of 25
#>  $ BE.B-cells          :List of 25
#>  $ BE.CD4+ T-cells     :List of 25
#>  $ BE.CD8+ T-cells     :List of 25
#>  $ BE.Chondrocytes     :List of 25
#>  $ BE.DC               :List of 25
#>  $ BE.Endothelial cells:List of 25
#>  $ BE.Eosinophils      :List of 25
#>  $ BE.Epithelial cells :List of 25
#>  $ BE.Erythrocytes     :List of 25
#>  $ BE.Fibroblasts      :List of 25
#>  $ BE.HSC              :List of 25
#>  $ BE.Keratinocytes    :List of 25
#>  $ BE.Macrophages      :List of 25
#>  $ BE.Melanocytes      :List of 25
#>  $ BE.Mesangial cells  :List of 25
#>  $ BE.Monocytes        :List of 25
#>  $ BE.Myocytes         :List of 25
#>  $ BE.Neurons          :List of 25
#>  $ BE.Neutrophils      :List of 25
#>  $ BE.NK cells         :List of 25
#>  $ BE.Pericytes        :List of 25
#>  $ BE.Skeletal muscle  :List of 25
#>  $ BE.Smooth muscle    :List of 25

# Multiple refs
pred_list <- SingleR(
  test = test,
  ref = ref,
  labels = labels_main)
# metadata is lost
str(metadata(pred_list), 1)
#>  list()

Created on 2020-05-15 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.0 (2020-04-24)
#> os Ubuntu 18.04.4 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language en_AU:en
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2020-05-15
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> AnnotationDbi 1.50.0 2020-04-27 [1] Bioconductor
#> AnnotationHub 2.20.0 2020-04-27 [1] Bioconductor
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
#> backports 1.1.7 2020-05-13 [1] CRAN (R 4.0.0)
#> Biobase * 2.48.0 2020-04-27 [1] Bioconductor
#> BiocFileCache 1.12.0 2020-04-27 [1] Bioconductor
#> BiocGenerics * 0.34.0 2020-04-27 [1] Bioconductor
#> BiocManager 1.30.10 2019-11-16 [1] CRAN (R 4.0.0)
#> BiocNeighbors 1.6.0 2020-04-27 [1] Bioconductor
#> BiocParallel 1.22.0 2020-04-27 [1] Bioconductor
#> BiocSingular 1.4.0 2020-04-27 [1] Bioconductor
#> BiocVersion 3.11.1 2019-11-13 [1] Bioconductor
#> bit 1.1-15.2 2020-02-10 [1] CRAN (R 4.0.0)
#> bit64 0.9-7 2017-05-08 [1] CRAN (R 4.0.0)
#> bitops 1.0-6 2013-08-17 [1] CRAN (R 4.0.0)
#> blob 1.2.1 2020-01-20 [1] CRAN (R 4.0.0)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.0)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
#> curl 4.3 2019-12-02 [1] CRAN (R 4.0.0)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.0)
#> dbplyr 1.4.3 2020-04-19 [1] CRAN (R 4.0.0)
#> DelayedArray * 0.14.0 2020-04-27 [1] Bioconductor
#> DelayedMatrixStats 1.10.0 2020-04-27 [1] Bioconductor
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
#> devtools 2.3.0 2020-04-10 [1] CRAN (R 4.0.0)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
#> dplyr 0.8.5 2020-03-07 [1] CRAN (R 4.0.0)
#> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 4.0.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
#> ExperimentHub 1.14.0 2020-04-27 [1] Bioconductor
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
#> fastmap 1.0.1 2019-10-08 [1] CRAN (R 4.0.0)
#> fs 1.4.1 2020-04-04 [1] CRAN (R 4.0.0)
#> GenomeInfoDb * 1.24.0 2020-04-27 [1] Bioconductor
#> GenomeInfoDbData 1.2.3 2020-04-30 [1] Bioconductor
#> GenomicRanges * 1.40.0 2020-04-27 [1] Bioconductor
#> glue 1.4.1 2020-05-13 [1] CRAN (R 4.0.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 4.0.0)
#> httpuv 1.5.2 2019-09-11 [1] CRAN (R 4.0.0)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 4.0.0)
#> interactiveDisplayBase 1.26.0 2020-04-27 [1] Bioconductor
#> IRanges * 2.22.1 2020-04-28 [1] Bioconductor
#> irlba 2.3.3 2019-02-05 [1] CRAN (R 4.0.0)
#> knitr 1.28 2020-02-06 [1] CRAN (R 4.0.0)
#> later 1.0.0 2019-10-04 [1] CRAN (R 4.0.0)
#> lattice 0.20-41 2020-04-02 [4] CRAN (R 4.0.0)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0)
#> Matrix 1.2-18 2019-11-27 [4] CRAN (R 4.0.0)
#> matrixStats * 0.56.0 2020-03-13 [1] CRAN (R 4.0.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
#> mime 0.9 2020-02-04 [1] CRAN (R 4.0.0)
#> pillar 1.4.4 2020-05-05 [1] CRAN (R 4.0.0)
#> pkgbuild 1.0.8 2020-05-07 [1] CRAN (R 4.0.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 4.0.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
#> processx 3.4.2 2020-02-09 [1] CRAN (R 4.0.0)
#> promises 1.1.0 2019-10-04 [1] CRAN (R 4.0.0)
#> ps 1.3.3 2020-05-08 [1] CRAN (R 4.0.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
#> rappdirs 0.3.1 2016-03-28 [1] CRAN (R 4.0.0)
#> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0)
#> RCurl 1.98-1.2 2020-04-18 [1] CRAN (R 4.0.0)
#> remotes 2.1.1 2020-02-15 [1] CRAN (R 4.0.0)
#> rlang 0.4.6 2020-05-02 [1] CRAN (R 4.0.0)
#> rmarkdown 2.1 2020-01-20 [1] CRAN (R 4.0.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0)
#> RSQLite 2.2.0 2020-01-07 [1] CRAN (R 4.0.0)
#> rsvd 1.0.3 2020-02-17 [1] CRAN (R 4.0.0)
#> S4Vectors * 0.26.0 2020-04-27 [1] Bioconductor
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
#> shiny 1.4.0.2 2020-03-13 [1] CRAN (R 4.0.0)
#> SingleR * 1.2.2 2020-05-08 [1] Bioconductor
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
#> SummarizedExperiment * 1.18.1 2020-04-30 [1] Bioconductor
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0)
#> tibble 3.0.1 2020-04-20 [1] CRAN (R 4.0.0)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.0)
#> vctrs 0.3.0 2020-05-11 [1] CRAN (R 4.0.0)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0)
#> xfun 0.13 2020-04-13 [1] CRAN (R 4.0.0)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.0.0)
#> XVector 0.28.0 2020-04-27 [1] Bioconductor
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
#> zlibbioc 1.34.0 2020-04-27 [1] Bioconductor
#>
#> [1] /home/peter/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
```

LTLA commented 4 years ago

This is the result of an intended code change that was not accompanied by a corresponding documentation change.

When you supply multiple references, SingleR() is performing annotation on each individual reference and then combining the results across references. There is never a separate marker estimation step involving all of the references at once; the markers in use are literally the markers computed from the individual references.

As such, if you want to get the markers for the individual labels, you need to think in terms of the individual references. Fortunately, we also report all the results for each individual reference, so you can just fetch them directly:

metadata(pred_list$orig.results$BE)$de.genes
# blah blah blah
metadata(pred_list$orig.results$MI)$de.genes
# etc. etc. etc.

As you've noticed, the previous version of SingleR didn't quite do the right thing with respect to combining the metadata, but as we were fixing it, we decided that we didn't need it at all; you can just go to the orig.results to fetch the de.genes, which is even better because it avoids any potential ambiguities from identical labels across multiple references.
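For readers who want a concrete picture of the per-reference lookup, here is a minimal base-R sketch. It does not call SingleR; `de_by_ref` is a hand-made stand-in for what `metadata(pred_list$orig.results$BE)$de.genes` and its `MI` counterpart hold (a list per label, each holding one gene vector per other label), the gene names are toy values, and `markers_for()` is a hypothetical helper, not part of the package.

```r
# Toy stand-in for the per-reference de.genes lists (NOT real SingleR output).
# de_by_ref[[ref]][[label]][[other]] = genes upregulated in `label` vs `other`.
de_by_ref <- list(
  BE = list(
    Monocytes = list(Monocytes = character(0), `B-cells` = c("CD14", "LYZ")),
    `B-cells` = list(Monocytes = c("CD79A", "MS4A1"), `B-cells` = character(0))
  ),
  MI = list(
    Monocytes = list(Monocytes = character(0), Basophils = c("CD14", "FCN1")),
    Basophils = list(Monocytes = c("CPA3", "HDC"), Basophils = character(0))
  )
)

# Hypothetical helper: pool the markers of one label within one reference.
markers_for <- function(de_by_ref, ref, label) {
  unique(unlist(de_by_ref[[ref]][[label]], use.names = FALSE))
}

# "Monocytes" exists in both references; naming the reference removes the ambiguity.
be_mono <- markers_for(de_by_ref, "BE", "Monocytes")
mi_mono <- markers_for(de_by_ref, "MI", "Monocytes")
mi_baso <- markers_for(de_by_ref, "MI", "Basophils")
```

Because the lookup is keyed by reference first and label second, the labels themselves no longer need a `BE.` or `MI.` prefix to stay distinguishable.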

PeteHaitch commented 4 years ago

Sweet, that corrects my mental model of the results and makes sense. And also means I can simplify the labels stuff since it preserves the reference name.

akramdi commented 4 years ago

Hello,

I am here after asking #121, thanks for the clarification!
What if we need the de.genes of the highest scoring labels? Is there a way to know which reference scored best for which label?

dtm2451 commented 4 years ago

Currently, the way to identify which reference scored best for each label is to manually check which column has higher scores within the top level $scores matrix -- the columns for ref#1 come before those of ref#2 and so on.
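A rough base-R sketch of that manual check, on a toy matrix rather than real SingleR output: it assumes the columns of the combined `$scores` matrix are ordered by reference as described, and the `n_labels` bookkeeping vector and the score values are made up for illustration.

```r
# Toy combined scores: 2 cells x (2 BE labels + 2 MI labels). Values are invented.
scores <- matrix(
  c(0.60, 0.20, 0.55, 0.10,
    0.15, 0.30, 0.25, 0.70),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("cell1", "cell2"),
    c("Monocytes", "B-cells", "Monocytes", "Basophils")
  )
)

# Assumed bookkeeping: how many label columns each reference contributed, in order.
n_labels <- c(BE = 2, MI = 2)
col_ref <- rep(names(n_labels), n_labels)

# For each cell, find the highest-scoring column and map it back to its reference.
best_col <- max.col(scores, ties.method = "first")
best <- data.frame(
  cell = rownames(scores),
  label = colnames(scores)[best_col],
  reference = col_ref[best_col],
  stringsAsFactors = FALSE
)
best
```

The toy matrix has no missing values; a real score matrix may need `NA` handling before taking the row-wise maximum.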

But this may become easier shortly: we've been discussing updating the plotScore functions in ways that would make reference-origin more apparent. @LTLA, something that I think could make such plotScore functionality simpler, and that I think would make this task easier for @akramdi, would be the addition of a $labels.ref (name?) column, containing the names of the source references, to combined results DataFrames.

LTLA commented 4 years ago

I thought we already did this.

> Is there a way to know which reference scored best for which label?

See the reference column in the top-level DataFrame.

Here's a little trick to split your DataFrame by the combination of label and reference:

grouping <- paste0(pred$labels, ".", pred$reference)
split(pred, grouping)

And then you can just loop over that to make heatmaps of the relevant markers.
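To make the split-and-loop idea concrete, here is a small sketch on a plain `data.frame` standing in for the combined results. The rows and values are invented, and the loop only collects cell names per group; the actual heatmap call is left out because its arguments depend on the SingleR version in use.

```r
# Toy stand-in for the top-level results; only the two columns used above.
pred <- data.frame(
  labels = c("Monocytes", "Monocytes", "B-cells", "Monocytes"),
  reference = c(1L, 2L, 1L, 1L),
  row.names = paste0("cell", 1:4),
  stringsAsFactors = FALSE
)

# Same trick as above: one group per (label, reference) combination.
grouping <- paste0(pred$labels, ".", pred$reference)
by_group <- split(pred, grouping)

# Loop over the groups; here we just record which cells fall into each one.
cells_per_group <- lapply(by_group, rownames)
cells_per_group
```

Each element of `by_group` keeps its original row names, so the cells in a group can be used to subset the test data before plotting the markers for that label and reference.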

akramdi commented 4 years ago

Perfect, thank you!