drisso / SingleCellExperiment

Clone of the Bioconductor repository for the SingleCellExperiment package, see https://bioconductor.org/packages/devel/bioc/html/SingleCellExperiment.html for the official development version.
65 stars 18 forks source link

Directly storing size factors in the colData #45

Closed LTLA closed 4 years ago

LTLA commented 4 years ago

This PR directly stores the size factors in the colData as a "sizeFactor" column, thus bringing us in line with how DESeq2 handles size factors. Not that I would consider DESeq2 to be a standard or anything, but I will grudgingly admit that is convenient to have the size factors as a regular field that people can operate on in various scater or ggplot2 calls.

The reducedDims and altExps do not get the same treatment with respect to colData exposure. These are internal fields with complex structures (especially the altExps!) and the user really shouldn't be touching them with anything but the appropriate getters and setters.

No action is required from users or developers, provided I have written updateObject() correctly.

Incidentally, this PR also removes the deprecated spike-in handlers, fully replacing them with altExps.

codecov-io commented 4 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (master@432da5a). Click here to learn what that means. The diff coverage is 70.27%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #45   +/-   ##
=========================================
  Coverage          ?   92.54%           
=========================================
  Files             ?       19           
  Lines             ?      550           
  Branches          ?        0           
=========================================
  Hits              ?      509           
  Misses            ?       41           
  Partials          ?        0
Impacted Files Coverage Δ
R/miscellaneous.R 100% <ø> (ø)
R/validity.R 100% <ø> (ø)
R/colLabels.R 100% <100%> (ø)
R/AllGenerics.R 94.44% <100%> (ø)
R/sizeFactors.R 100% <100%> (ø)
R/utils.R 100% <100%> (ø)
R/updateObject.R 24.13% <31.25%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 432da5a...50f4d38. Read the comment docs.

LTLA commented 4 years ago

One thing that downstream users have to be aware of is a new "sizeFactor" field appearing in their colData. This is mostly harmless unless (i) people are referring to columns of the colData by integer indices rather than by name (which is probably ill-advised in anything other than function internals that auto-discover the appropriate index) or (ii) developers have hard-coded column names in their tests.

LTLA commented 4 years ago

While I'm here, I also added a colLabels() getter and setter to give a default location for cell labels, e.g., from clusters or classification algorithms. This can be considered to be a less opinionated counterpart to Seurat's Ident function, and aims to provide downstream functions with a "default" location for cell labels without compromising their programmatic flexibility.