PeteHaitch closed this issue 4 years ago
This is the result of an intended code change that was made without a corresponding documentation change.
When you supply multiple references, `SingleR()` performs annotation on each individual reference and then combines the results across references. There is never a separate marker estimation step involving all of the references at once; the markers in use are literally the markers computed from the individual references.
As such, if you want to get the markers for the individual labels, you need to think in terms of the individual references. Fortunately, we also report all the results for each individual reference, so you can just fetch them directly:
```r
metadata(pred_list$orig.results$BE)$de.genes
# blah blah blah
metadata(pred_list$orig.results$MI)$de.genes
# etc. etc. etc.
```
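To pull out the markers for one label from one reference, something like the following sketch may help. It assumes `de.genes` is a nested list indexed first by the label of interest and then by the label it is compared against; the toy list stands in for `metadata(pred_list$orig.results$MI)$de.genes`, and `markers_for()`, the gene names and the label names are all made up for illustration.

```r
# Toy stand-in for metadata(pred_list$orig.results$MI)$de.genes.
# Assumed structure: de_genes[[label]][[other_label]] is a character vector
# of genes upregulated in `label` relative to `other_label`.
de_genes <- list(
    Basophils = list(
        Basophils = character(0),
        Monocytes = c("GeneA", "GeneB"),
        Neutrophils = c("GeneB", "GeneC")
    ),
    Monocytes = list(
        Basophils = c("GeneD"),
        Monocytes = character(0),
        Neutrophils = c("GeneD", "GeneE")
    )
)

# Hypothetical helper: all markers for one label, taken as the union
# across that label's pairwise comparisons.
markers_for <- function(de_genes, label) {
    unique(unlist(de_genes[[label]], use.names = FALSE))
}

basophil_markers <- markers_for(de_genes, "Basophils")
print(basophil_markers)
```

Because the lookup starts from a specific reference, a label that appears in several references never gets mixed up.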
As you've noticed, the previous version of `SingleR` didn't quite do the right thing with respect to combining the `metadata`, but as we were fixing it, we decided that we didn't need it at all; you can just go to the `orig.results` to fetch the `de.genes`, which is even better because it avoids any potential ambiguities from identical labels across multiple references.
Sweet, that corrects my mental model of the results and makes sense.
And also means I can simplify the `labels` stuff since it preserves the reference name.
Hello,
I am here after asking #121, thanks for the clarification!
What if we need the `de.genes` of the highest scoring labels? Is there a way to know which reference scored best for which label?
Currently, the way to identify which reference scored best for each label is to manually check which column has higher scores within the top-level `$scores` matrix; the columns for ref#1 come before those of ref#2 and so on.
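As a rough sketch of that manual check, the column-to-reference mapping can be built from the number of labels in each reference. Everything below is a toy: the score values, the label names and the `labels_per_ref` counts are assumptions for illustration, not output from `SingleR`.

```r
# Toy combined score matrix: one row per cell, one column per label.
# Assumption: the first two columns come from reference "BE" and the
# last three from reference "MI"; all names and values are made up.
scores <- rbind(
    cell1 = c(0.10, 0.30, 0.20, 0.80, 0.40),
    cell2 = c(0.70, 0.20, 0.10, 0.30, 0.50),
    cell3 = c(0.20, 0.25, 0.90, 0.10, 0.30)
)
colnames(scores) <- c("B cells", "T cells", "Basophils", "Monocytes", "T cells")
labels_per_ref <- c(BE = 2, MI = 3)

# Map every column to its reference of origin.
col_ref <- rep(names(labels_per_ref), times = labels_per_ref)

# Find the best-scoring column per cell, then look up its reference.
best_col <- max.col(scores, ties.method = "first")
best <- data.frame(
    label = colnames(scores)[best_col],
    reference = col_ref[best_col],
    row.names = rownames(scores),
    stringsAsFactors = FALSE
)
print(best)
```

Note that "T cells" appears in both toy references, which is exactly the case where tracking the column position matters.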
But this may become easier shortly: we've been discussing updating the `plotScore` functions in ways that would make reference-origin more apparent. @LTLA, something that I think could make such `plotScore` functionality simpler, and that I think would make this task easier for @akramdi, would be the addition of a `$labels.ref` (name?) column, containing the names of the source references, to combined results DataFrames.
I thought we already did this.
> Is there a way to know which reference scored best for which label?
See the `reference` column in the top-level `DataFrame`.
Here's a little trick to split your DataFrame by the combination of label and reference:
```r
grouping <- paste0(pred$labels, ".", pred$reference)
split(pred, grouping)
```
And then you can just loop over that to make heatmaps of the relevant markers.
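A minimal sketch of that loop, using base R stand-ins so that it runs on its own. Here `pred` is a plain `data.frame` rather than the real `DataFrame`, `de_by_ref[[i]]` plays the role of `metadata(pred$orig.results[[i]])$de.genes`, and all cell, gene and label names are made up; the nested `de.genes[[label]][[other_label]]` layout is an assumption.

```r
# Toy stand-in for the combined result; `reference` indexes the reference
# that supplied each cell's label.
pred <- data.frame(
    labels = c("Basophils", "T cells", "Basophils", "T cells"),
    reference = c(2L, 1L, 2L, 2L),
    row.names = paste0("cell", 1:4),
    stringsAsFactors = FALSE
)

# Toy per-reference marker lists, one entry per reference.
de_by_ref <- list(
    list(`T cells` = list(`B cells` = c("GeneA", "GeneB"))),
    list(
        Basophils = list(`T cells` = c("GeneC", "GeneD")),
        `T cells` = list(Basophils = c("GeneE"))
    )
)

grouping <- paste0(pred$labels, ".", pred$reference)
by_group <- split(pred, grouping)

# For each label/reference combination, collect the cells and the markers
# that the chosen reference used for that label.
marker_sets <- lapply(by_group, function(df) {
    ref <- df$reference[1]
    label <- df$labels[1]
    list(
        cells = rownames(df),
        markers = unique(unlist(de_by_ref[[ref]][[label]], use.names = FALSE))
    )
})
str(marker_sets)
```

Each element of `marker_sets` then holds what a heatmap for that group would need: the cells to plot and the genes to show.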
Perfect, thank you!
There's a change of behaviour between BioC 3.10 and 3.11, but neither seems correct.
Desired behaviour
```r
metadata(pred_list)$de.genes[["MI.Basophils"]]
```
(#121 is related, I think).
BioC 3.10
Created on 2020-05-15 by the reprex package (v0.3.0)
Session info
``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.2 (2019-12-12)
#>  os       CentOS Linux 7 (Core)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Melbourne
#>  date     2020-05-15
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package                * version  date       lib source
#>  AnnotationDbi            1.48.0   2019-10-29 [1] Bioconductor
#>  AnnotationHub            2.18.0   2019-10-29 [1] Bioconductor
#>  assertthat               0.2.1    2019-03-21 [1] CRAN (R 3.6.1)
#>  backports                1.1.7    2020-05-13 [1] CRAN (R 3.6.2)
#>  Biobase                * 2.46.0   2019-10-29 [1] Bioconductor
#>  BiocFileCache            1.10.2   2019-11-08 [1] Bioconductor
#>  BiocGenerics           * 0.32.0   2019-10-29 [1] Bioconductor
#>  BiocManager              1.30.10  2019-11-16 [1] CRAN (R 3.6.1)
#>  BiocNeighbors            1.4.2    2020-02-29 [1] Bioconductor
#>  BiocParallel           * 1.20.1   2019-12-21 [1] Bioconductor
#>  BiocVersion              3.10.1   2019-06-06 [1] Bioconductor
#>  bit                      1.1-15.2 2020-02-10 [1] CRAN (R 3.6.2)
#>  bit64                    0.9-7    2017-05-08 [1] CRAN (R 3.6.1)
#>  bitops                   1.0-6    2013-08-17 [1] CRAN (R 3.6.1)
#>  blob                     1.2.1    2020-01-20 [1] CRAN (R 3.6.2)
#>  callr                    3.4.3    2020-03-28 [1] CRAN (R 3.6.2)
#>  cli                      2.0.2    2020-02-28 [1] CRAN (R 3.6.2)
#>  crayon                   1.3.4    2017-09-16 [1] CRAN (R 3.6.1)
#>  curl                     4.3      2019-12-02 [1] CRAN (R 3.6.1)
#>  DBI                      1.1.0    2019-12-15 [1] CRAN (R 3.6.2)
#>  dbplyr                   1.4.3    2020-04-19 [1] CRAN (R 3.6.2)
#>  DelayedArray           * 0.12.3   2020-04-09 [1] Bioconductor
#>  DelayedMatrixStats       1.8.0    2019-10-29 [1] Bioconductor
#>  desc                     1.2.0    2018-05-01 [1] CRAN (R 3.6.1)
#>  devtools                 2.3.0    2020-04-10 [1] CRAN (R 3.6.2)
#>  digest                   0.6.25   2020-02-23 [1] CRAN (R 3.6.2)
#>  dplyr                    0.8.5    2020-03-07 [1] CRAN (R 3.6.2)
#>  ellipsis                 0.3.0    2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate                 0.14     2019-05-28 [1] CRAN (R 3.6.1)
#>  ExperimentHub            1.12.0   2019-10-29 [1] Bioconductor
#>  fansi                    0.4.1    2020-01-08 [1] CRAN (R 3.6.2)
#>  fastmap                  1.0.1    2019-10-08 [1] CRAN (R 3.6.1)
#>  fs                       1.4.1    2020-04-04 [1] CRAN (R 3.6.2)
#>  GenomeInfoDb           * 1.22.1   2020-03-27 [1] Bioconductor
#>  GenomeInfoDbData         1.2.2    2020-01-01 [1] Bioconductor
#>  GenomicRanges          * 1.38.0   2019-10-29 [1] Bioconductor
#>  glue                     1.4.1    2020-05-13 [1] CRAN (R 3.6.2)
#>  highr                    0.8      2019-03-20 [1] CRAN (R 3.6.1)
#>  htmltools                0.4.0    2019-10-04 [1] CRAN (R 3.6.1)
#>  httpuv                   1.5.2    2019-09-11 [1] CRAN (R 3.6.1)
#>  httr                     1.4.1    2019-08-05 [1] CRAN (R 3.6.1)
#>  interactiveDisplayBase   1.24.0   2019-10-29 [1] Bioconductor
#>  IRanges                * 2.20.2   2020-01-13 [1] Bioconductor
#>  knitr                    1.28     2020-02-06 [1] CRAN (R 3.6.2)
#>  later                    1.0.0    2019-10-04 [1] CRAN (R 3.6.1)
#>  lattice                  0.20-41  2020-04-02 [1] CRAN (R 3.6.2)
#>  lifecycle                0.2.0    2020-03-06 [1] CRAN (R 3.6.2)
#>  magrittr                 1.5      2014-11-22 [1] CRAN (R 3.6.1)
#>  Matrix                   1.2-18   2019-11-27 [2] CRAN (R 3.6.2)
#>  matrixStats            * 0.56.0   2020-03-13 [1] CRAN (R 3.6.2)
#>  memoise                  1.1.0    2017-04-21 [1] CRAN (R 3.6.1)
#>  mime                     0.9      2020-02-04 [1] CRAN (R 3.6.2)
#>  pillar                   1.4.4    2020-05-05 [1] CRAN (R 3.6.2)
#>  pkgbuild                 1.0.8    2020-05-07 [1] CRAN (R 3.6.2)
#>  pkgconfig                2.0.3    2019-09-22 [1] CRAN (R 3.6.1)
#>  pkgload                  1.0.2    2018-10-29 [1] CRAN (R 3.6.1)
#>  prettyunits              1.1.1    2020-01-24 [1] CRAN (R 3.6.2)
#>  processx                 3.4.2    2020-02-09 [1] CRAN (R 3.6.2)
#>  promises                 1.1.0    2019-10-04 [1] CRAN (R 3.6.1)
#>  ps                       1.3.3    2020-05-08 [1] CRAN (R 3.6.2)
#>  purrr                    0.3.4    2020-04-17 [1] CRAN (R 3.6.2)
#>  R6                       2.4.1    2019-11-12 [1] CRAN (R 3.6.1)
#>  rappdirs                 0.3.1    2016-03-28 [1] CRAN (R 3.6.1)
#>  Rcpp                     1.0.4.6  2020-04-09 [1] CRAN (R 3.6.2)
#>  RCurl                    1.98-1.2 2020-04-18 [1] CRAN (R 3.6.2)
#>  remotes                  2.1.1    2020-02-15 [1] CRAN (R 3.6.2)
#>  rlang                    0.4.6    2020-05-02 [1] CRAN (R 3.6.2)
#>  rmarkdown                2.1      2020-01-20 [1] CRAN (R 3.6.2)
#>  rprojroot                1.3-2    2018-01-03 [1] CRAN (R 3.6.1)
#>  RSQLite                  2.2.0    2020-01-07 [1] CRAN (R 3.6.2)
#>  S4Vectors              * 0.24.4   2020-04-09 [1] Bioconductor
#>  sessioninfo              1.1.1    2018-11-05 [1] CRAN (R 3.6.1)
#>  shiny                    1.4.0.2  2020-03-13 [1] CRAN (R 3.6.2)
#>  SingleR                * 1.0.6    2020-04-08 [1] Bioconductor
#>  stringi                  1.4.6    2020-02-17 [1] CRAN (R 3.6.2)
#>  stringr                  1.4.0    2019-02-10 [1] CRAN (R 3.6.1)
#>  SummarizedExperiment   * 1.16.1   2019-12-19 [1] Bioconductor
#>  testthat                 2.3.2    2020-03-02 [1] CRAN (R 3.6.2)
#>  tibble                   3.0.1    2020-04-20 [1] CRAN (R 3.6.2)
#>  tidyselect               1.1.0    2020-05-11 [1] CRAN (R 3.6.2)
#>  usethis                  1.6.1    2020-04-29 [1] CRAN (R 3.6.2)
#>  vctrs                    0.3.0    2020-05-11 [1] CRAN (R 3.6.2)
#>  withr                    2.2.0    2020-04-20 [1] CRAN (R 3.6.2)
#>  xfun                     0.13     2020-04-13 [1] CRAN (R 3.6.2)
#>  xtable                   1.8-4    2019-04-21 [1] CRAN (R 3.6.1)
#>  XVector                  0.26.0   2019-10-29 [1] Bioconductor
#>  yaml                     2.2.1    2020-02-01 [1] CRAN (R 3.6.2)
#>  zlibbioc                 1.32.0   2019-10-29 [1] Bioconductor
#>
#> [1] /stornext/Home/data/allstaff/h/hickey/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /stornext/System/data/apps/R/openBLAS/R-3.6.2/lib64/R/library
```
BioC 3.11
Created on 2020-05-15 by the reprex package (v0.3.0)
Session info
``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.0.0 (2020-04-24)
#>  os       Ubuntu 18.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_AU:en
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2020-05-15
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package                * version  date       lib source
#>  AnnotationDbi            1.50.0   2020-04-27 [1] Bioconductor
#>  AnnotationHub            2.20.0   2020-04-27 [1] Bioconductor
#>  assertthat               0.2.1    2019-03-21 [1] CRAN (R 4.0.0)
#>  backports                1.1.7    2020-05-13 [1] CRAN (R 4.0.0)
#>  Biobase                * 2.48.0   2020-04-27 [1] Bioconductor
#>  BiocFileCache            1.12.0   2020-04-27 [1] Bioconductor
#>  BiocGenerics           * 0.34.0   2020-04-27 [1] Bioconductor
#>  BiocManager              1.30.10  2019-11-16 [1] CRAN (R 4.0.0)
#>  BiocNeighbors            1.6.0    2020-04-27 [1] Bioconductor
#>  BiocParallel             1.22.0   2020-04-27 [1] Bioconductor
#>  BiocSingular             1.4.0    2020-04-27 [1] Bioconductor
#>  BiocVersion              3.11.1   2019-11-13 [1] Bioconductor
#>  bit                      1.1-15.2 2020-02-10 [1] CRAN (R 4.0.0)
#>  bit64                    0.9-7    2017-05-08 [1] CRAN (R 4.0.0)
#>  bitops                   1.0-6    2013-08-17 [1] CRAN (R 4.0.0)
#>  blob                     1.2.1    2020-01-20 [1] CRAN (R 4.0.0)
#>  callr                    3.4.3    2020-03-28 [1] CRAN (R 4.0.0)
#>  cli                      2.0.2    2020-02-28 [1] CRAN (R 4.0.0)
#>  crayon                   1.3.4    2017-09-16 [1] CRAN (R 4.0.0)
#>  curl                     4.3      2019-12-02 [1] CRAN (R 4.0.0)
#>  DBI                      1.1.0    2019-12-15 [1] CRAN (R 4.0.0)
#>  dbplyr                   1.4.3    2020-04-19 [1] CRAN (R 4.0.0)
#>  DelayedArray           * 0.14.0   2020-04-27 [1] Bioconductor
#>  DelayedMatrixStats       1.10.0   2020-04-27 [1] Bioconductor
#>  desc                     1.2.0    2018-05-01 [1] CRAN (R 4.0.0)
#>  devtools                 2.3.0    2020-04-10 [1] CRAN (R 4.0.0)
#>  digest                   0.6.25   2020-02-23 [1] CRAN (R 4.0.0)
#>  dplyr                    0.8.5    2020-03-07 [1] CRAN (R 4.0.0)
#>  ellipsis                 0.3.0    2019-09-20 [1] CRAN (R 4.0.0)
#>  evaluate                 0.14     2019-05-28 [1] CRAN (R 4.0.0)
#>  ExperimentHub            1.14.0   2020-04-27 [1] Bioconductor
#>  fansi                    0.4.1    2020-01-08 [1] CRAN (R 4.0.0)
#>  fastmap                  1.0.1    2019-10-08 [1] CRAN (R 4.0.0)
#>  fs                       1.4.1    2020-04-04 [1] CRAN (R 4.0.0)
#>  GenomeInfoDb           * 1.24.0   2020-04-27 [1] Bioconductor
#>  GenomeInfoDbData         1.2.3    2020-04-30 [1] Bioconductor
#>  GenomicRanges          * 1.40.0   2020-04-27 [1] Bioconductor
#>  glue                     1.4.1    2020-05-13 [1] CRAN (R 4.0.0)
#>  highr                    0.8      2019-03-20 [1] CRAN (R 4.0.0)
#>  htmltools                0.4.0    2019-10-04 [1] CRAN (R 4.0.0)
#>  httpuv                   1.5.2    2019-09-11 [1] CRAN (R 4.0.0)
#>  httr                     1.4.1    2019-08-05 [1] CRAN (R 4.0.0)
#>  interactiveDisplayBase   1.26.0   2020-04-27 [1] Bioconductor
#>  IRanges                * 2.22.1   2020-04-28 [1] Bioconductor
#>  irlba                    2.3.3    2019-02-05 [1] CRAN (R 4.0.0)
#>  knitr                    1.28     2020-02-06 [1] CRAN (R 4.0.0)
#>  later                    1.0.0    2019-10-04 [1] CRAN (R 4.0.0)
#>  lattice                  0.20-41  2020-04-02 [4] CRAN (R 4.0.0)
#>  lifecycle                0.2.0    2020-03-06 [1] CRAN (R 4.0.0)
#>  magrittr                 1.5      2014-11-22 [1] CRAN (R 4.0.0)
#>  Matrix                   1.2-18   2019-11-27 [4] CRAN (R 4.0.0)
#>  matrixStats            * 0.56.0   2020-03-13 [1] CRAN (R 4.0.0)
#>  memoise                  1.1.0    2017-04-21 [1] CRAN (R 4.0.0)
#>  mime                     0.9      2020-02-04 [1] CRAN (R 4.0.0)
#>  pillar                   1.4.4    2020-05-05 [1] CRAN (R 4.0.0)
#>  pkgbuild                 1.0.8    2020-05-07 [1] CRAN (R 4.0.0)
#>  pkgconfig                2.0.3    2019-09-22 [1] CRAN (R 4.0.0)
#>  pkgload                  1.0.2    2018-10-29 [1] CRAN (R 4.0.0)
#>  prettyunits              1.1.1    2020-01-24 [1] CRAN (R 4.0.0)
#>  processx                 3.4.2    2020-02-09 [1] CRAN (R 4.0.0)
#>  promises                 1.1.0    2019-10-04 [1] CRAN (R 4.0.0)
#>  ps                       1.3.3    2020-05-08 [1] CRAN (R 4.0.0)
#>  purrr                    0.3.4    2020-04-17 [1] CRAN (R 4.0.0)
#>  R6                       2.4.1    2019-11-12 [1] CRAN (R 4.0.0)
#>  rappdirs                 0.3.1    2016-03-28 [1] CRAN (R 4.0.0)
#>  Rcpp                     1.0.4.6  2020-04-09 [1] CRAN (R 4.0.0)
#>  RCurl                    1.98-1.2 2020-04-18 [1] CRAN (R 4.0.0)
#>  remotes                  2.1.1    2020-02-15 [1] CRAN (R 4.0.0)
#>  rlang                    0.4.6    2020-05-02 [1] CRAN (R 4.0.0)
#>  rmarkdown                2.1      2020-01-20 [1] CRAN (R 4.0.0)
#>  rprojroot                1.3-2    2018-01-03 [1] CRAN (R 4.0.0)
#>  RSQLite                  2.2.0    2020-01-07 [1] CRAN (R 4.0.0)
#>  rsvd                     1.0.3    2020-02-17 [1] CRAN (R 4.0.0)
#>  S4Vectors              * 0.26.0   2020-04-27 [1] Bioconductor
#>  sessioninfo              1.1.1    2018-11-05 [1] CRAN (R 4.0.0)
#>  shiny                    1.4.0.2  2020-03-13 [1] CRAN (R 4.0.0)
#>  SingleR                * 1.2.2    2020-05-08 [1] Bioconductor
#>  stringi                  1.4.6    2020-02-17 [1] CRAN (R 4.0.0)
#>  stringr                  1.4.0    2019-02-10 [1] CRAN (R 4.0.0)
#>  SummarizedExperiment   * 1.18.1   2020-04-30 [1] Bioconductor
#>  testthat                 2.3.2    2020-03-02 [1] CRAN (R 4.0.0)
#>  tibble                   3.0.1    2020-04-20 [1] CRAN (R 4.0.0)
#>  tidyselect               1.1.0    2020-05-11 [1] CRAN (R 4.0.0)
#>  usethis                  1.6.1    2020-04-29 [1] CRAN (R 4.0.0)
#>  vctrs                    0.3.0    2020-05-11 [1] CRAN (R 4.0.0)
#>  withr                    2.2.0    2020-04-20 [1] CRAN (R 4.0.0)
#>  xfun                     0.13     2020-04-13 [1] CRAN (R 4.0.0)
#>  xtable                   1.8-4    2019-04-21 [1] CRAN (R 4.0.0)
#>  XVector                  0.28.0   2020-04-27 [1] Bioconductor
#>  yaml                     2.2.1    2020-02-01 [1] CRAN (R 4.0.0)
#>  zlibbioc                 1.34.0   2020-04-27 [1] Bioconductor
#>
#> [1] /home/peter/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
```