Open tomsing1 opened 1 year ago
P.S.: To take this thought a little further: when objects get very large and the assayData is stored in HDF5 files, we could support reading the data with the `HDF5Array::loadHDF5SummarizedExperiment()` function. That is just another example where the hardcoded `readRDS()` call might not be sufficient.
Hey @tomsing1, this is probably not the only way to address your idea.
But have a look at https://github.com/iSEE/iSEEindex/pull/62, which is a first implementation that does not just take a path to a file (and then `readSCE` that, no matter what), but also allows pretty much any call in R code that already returns an S(C)E object.
That can also cover your case, where you would load an `sce` object from HDF5 file formats or similar, and it can be used to serve full data packages, where each individual dataset can be explored separately.
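To illustrate the general idea of "any R call that returns an S(C)E object", here is a minimal sketch. It is not the implementation in the pull request: `load_from_call()` is a hypothetical helper, and the check only assumes that the returned object inherits from `SummarizedExperiment`.

```r
# Hypothetical helper: evaluate a string of R code and validate its result.
load_from_call <- function(code, envir = new.env(parent = globalenv())) {
  # parse() turns the string into expressions; eval() returns the last value.
  obj <- eval(parse(text = code), envir = envir)
  if (!inherits(obj, "SummarizedExperiment")) {
    stop("The call did not return a SummarizedExperiment-like object.",
         call. = FALSE)
  }
  obj
}
```

A call such as `load_from_call('HDF5Array::loadHDF5SummarizedExperiment("path/to/dir")')` would then cover the HDF5 case. Since this evaluates arbitrary code, it would only be appropriate for configuration files from a trusted source.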
A possible rework of the whole thing could use an additional field in the YAML configuration file: think of a "type"-like attribute of the resource, with the dispatch of what happens to "load that sce" based on that value.
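A rough sketch of what such a type-based dispatch might look like. Everything here is an assumption for illustration: the field names `type` and `uri`, the type values `rds` and `hdf5se`, and the helper `load_resource()` are hypothetical and not part of iSEEindex.

```r
# Hypothetical registry: one loader function per value of the "type" field.
loaders <- list(
  rds = function(uri) readRDS(uri),
  # Assumes `uri` is a directory written by
  # HDF5Array::saveHDF5SummarizedExperiment().
  hdf5se = function(uri) HDF5Array::loadHDF5SummarizedExperiment(dir = uri)
)

# `resource` stands for one parsed YAML entry,
# e.g. list(type = "rds", uri = "path/to/file.rds").
load_resource <- function(resource, registry = loaders) {
  type <- resource$type
  # Assumed default: fall back to RDS when no type is given.
  if (is.null(type)) type <- "rds"
  loader <- registry[[type]]
  if (is.null(loader)) {
    stop("No loader registered for type '", type, "'", call. = FALSE)
  }
  loader(resource$uri)
}
```

Keeping the loaders in a named list means a new format only needs one more entry, without touching the dispatch logic.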
Happy to think more out loud with you if needed, feel free to give the fresh devel branch a spin!
Federico
Right now, SummarizedExperiments are loaded from RDS files with the `readRDS` function. https://github.com/iSEE/iSEEindex/blob/6ee6b8b7e98c0aa2704602893c2073c3f9fea455/R/utils-datasets.R#L43

It might be worth considering supporting other file formats, e.g. `.qs` files generated with the qs package. That might speed up loading large objects. Maybe a simple switch based on file extensions (`.rds` vs. `.qs`) would be sufficient?
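A minimal sketch of such an extension-based switch, assuming a hypothetical helper named `load_dataset()` (not part of iSEEindex) and treating qs as an optional dependency that is only needed when a `.qs` file is requested:

```r
# Hypothetical helper: pick a reader based on the file extension.
load_dataset <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    rds = readRDS(path),
    qs = {
      # qs is optional here; fail with a clear message if it is missing.
      if (!requireNamespace("qs", quietly = TRUE)) {
        stop("Reading .qs files requires the 'qs' package.", call. = FALSE)
      }
      qs::qread(path)
    },
    # Default branch: any other extension is rejected explicitly.
    stop("Unsupported file extension: '", ext, "'", call. = FALSE)
  )
}
```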