Open davismcc opened 7 years ago
wow, that's big...
Probably something to hold off on until BioC 3.6, I think. In many respects, it's a pretty straightforward conversion, based on my knowledge of both classes; but we've already made a whole heap of interface changes, and putting this monster in the same release would annoy downstream users/developers.
There are also a bunch of other issues that should be addressed in this conversion:
- Consolidate `fpkmData`, `tpmData` and `cpmData` into a single entry, probably called `cpmData`. They are effectively the same thing: normalized, unlogged expression values.
- Remove `.exprs_hunter`. Implicit choice of expression values is opaque; functions (or the people who call them) should define the desired expression type explicitly.
- Do we still need `is_exprs`? After QC, there is no compelling reason to keep such a large matrix around, especially when it can be quickly regenerated.

On a related note, we should consider whether we should actually generate `exprs` in `newSCESet`. Many people are forgetting to call `normalize` after running `computeSumFactors` in the scran workflow, and this does not cause obvious problems downstream because `exprs` values (normalized by library size) are already available. It may be better to ask users to explicitly call `normalize` on the constructed `SCESet`, with an appropriately shouty warning if size factors are not available.
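To make the proposal concrete, here is a rough base-R sketch of the behaviour I have in mind. It is not scater code: the list-based container and the names `new_toy_set`, `normalize_toy` and `get_values` are all hypothetical stand-ins, and the library-size fallback is just one possible choice. The constructor does not precompute log-expression values, normalization warns loudly when size factors are missing, and the accessor has no default expression type.

```r
# Toy stand-in for an SCESet: a plain list (hypothetical, not the real class).
new_toy_set <- function(counts) {
  # Deliberately does NOT compute log-expression values at construction time.
  list(counts = counts, size_factors = NULL, exprs = NULL)
}

# Hypothetical analogue of an explicit normalization step.
normalize_toy <- function(x) {
  sf <- x$size_factors
  if (is.null(sf)) {
    warning("NO SIZE FACTORS FOUND: falling back to library sizes")
    lib <- colSums(x$counts)
    sf <- lib / mean(lib)
  }
  # Divide each column (cell) by its size factor, then log-transform.
  x$exprs <- log2(sweep(x$counts, 2, sf, "/") + 1)
  x
}

# Hypothetical accessor: no default, so the caller must name the type.
get_values <- function(x, exprs_values) {
  exprs_values <- match.arg(exprs_values, c("counts", "exprs"))
  vals <- x[[exprs_values]]
  if (is.null(vals)) {
    stop("'", exprs_values, "' is not available; call normalize_toy() first")
  }
  vals
}

# Usage: two cells, three genes.
counts <- matrix(c(10, 0, 5, 20, 2, 8), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("cellA", "cellB")))
toy <- new_toy_set(counts)
toy$size_factors <- c(0.5, 1.5)  # e.g. estimated by a separate step
toy <- normalize_toy(toy)        # silent, because size factors are present
get_values(toy, "exprs")
```

With this arrangement, forgetting the normalization step fails at the first call to `get_values(toy, "exprs")` instead of silently using library-size-normalized values.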
Recommended after speaking with Martin Morgan.