Open ivirshup opened 5 years ago
SingleCellExperiment is more specific (e.g. `reducedDims` exists while we have the more generic `obsm`), so concepts that are conventions in AnnData aren’t in SCE.
Is there any point here you’d like to hear my opinion on specifically? :smiley:
I was wondering if you had thoughts on dealing with round-trip conversions when there isn't a clear one-to-one mapping. For example, going R -> Python -> R with a `SingleCellExperiment` with nested dataframes. It's not obvious to me (from here) how you could deal with that. If you flatten, how do you know what to unflatten? If you move them to `obsm`, how do you know what to move back to `colData`? Another example would be the `SingleCellExperiment` `LinearEmbeddingMatrix`, where the variable loadings never get subset, so it doesn't quite map to `varm`.
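One possible way to make flattening reversible is to record, at flatten time, which flat columns came from which nested frame. The sketch below uses plain dicts as stand-ins for `colData` and its nested data frames; `flatten`, `unflatten`, and the manifest are hypothetical names, not part of either library, and only one level of nesting is handled.

```python
def flatten(col_data, sep="."):
    """Flatten one level of nesting and record where each flat column came from."""
    flat, manifest = {}, {}
    for name, value in col_data.items():
        if isinstance(value, dict):  # stand-in for a nested data frame
            for sub_name, column in value.items():
                flat_name = f"{name}{sep}{sub_name}"
                flat[flat_name] = column
                manifest[flat_name] = (name, sub_name)
        else:
            flat[name] = value
    return flat, manifest


def unflatten(flat, manifest):
    """Rebuild the nested layout using only the manifest, never the column names."""
    col_data = {}
    for flat_name, column in flat.items():
        if flat_name in manifest:
            parent, sub_name = manifest[flat_name]
            col_data.setdefault(parent, {})[sub_name] = column
        else:
            col_data[flat_name] = column
    return col_data


original = {
    "cluster": ["a", "b"],
    "qc": {"n_genes": [10, 20], "pct_mito": [0.1, 0.2]},
    "qc.flag": [True, False],  # top-level column whose name merely contains the separator
}
flat, manifest = flatten(original)
restored = unflatten(flat, manifest)
```

The `qc.flag` column shows why splitting on the separator alone would be ambiguous: without the manifest it would wrongly be nested under `qc`. The manifest would have to travel with the converted object for this to work.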
I don’t handle anything tricky yet :sweat_smile: Almost everything I do is round-trippable (except for the name conversion, which changes capitalization and would canonicalize the obsm/reducedDims name of diffusion maps: `ad.obsm['X_dm']` → `reducedDim(sce, 'DM')` → `ad.obsm['X_diffmap']`)
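To illustrate why that name conversion is not round-trippable, here is a minimal sketch with hypothetical lookup tables. Only the `X_dm` / `DM` / `X_diffmap` names come from the example above; the `X_pca` entry and the function names are assumptions for illustration and may not match what the converter actually does.

```python
# Hypothetical name tables; the real converter's rules may differ.
OBSM_TO_REDUCED_DIM = {
    "X_pca": "PCA",
    "X_dm": "DM",       # two AnnData spellings collapse ...
    "X_diffmap": "DM",  # ... onto one reducedDims name
}
REDUCED_DIM_TO_OBSM = {
    "PCA": "X_pca",
    "DM": "X_diffmap",  # only one spelling can be chosen on the way back
}


def to_sce_name(obsm_key):
    """Map an obsm key to a reducedDims name, leaving unknown keys unchanged."""
    return OBSM_TO_REDUCED_DIM.get(obsm_key, obsm_key)


def to_anndata_name(reduced_dim_name):
    """Map a reducedDims name back to an obsm key, leaving unknown names unchanged."""
    return REDUCED_DIM_TO_OBSM.get(reduced_dim_name, reduced_dim_name)


def round_trip(obsm_key):
    return to_anndata_name(to_sce_name(obsm_key))
```

Because the forward table is many-to-one, `round_trip("X_dm")` yields `"X_diffmap"`, while keys with a single spelling such as `"X_pca"` come back unchanged.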
What do you mean by flattening? Are there nested data.frames in SCE? What for?
I think it would be good to scope out the requirements of an interchange file format. This could probably start with some ideas of what the use cases are (basic user stories).
Some questions I have about what is reasonably achievable:
`loom`, not provide?
A little expansion on "conventions v. generality":
In an `AnnData` object we don't have nested data frames, so I would imagine any nested dataframes could just be used as elements of `obsm`. This is probably also where we'd put `reducedDims`. How do we keep this information around? We could just know what kinds of names are reduced dimensions, or we'd have to "tag" the arrays.
@flying-sheep, from your work with in-memory exchange, do you have any thoughts on this?