Closed LTLA closed 3 years ago
I can't get the tests to run locally but I think that might be my setup. I assume you have checked this?
Should have converted this to a draft, actually; can you see if there's any way to stuff an H5Dataset in there?
Well, bum. This still closes #13 but I failed at the more ambitious task to slide HDF5Arrays into the AnnData as a h5py.Dataset. Even from within Python itself, assigning a h5py.Dataset to adobj.X, for example, will convert the former into a Numpy array. Poking around the source suggests that the AnnData class itself has some special provisions for H5AD files to support HDF5 backing (e.g., adobj.file), with the implication being that we can't just shove arbitrary HDF5-backed arrays in there. Oh well.
My brain is dead from EuroBioc so not sure I followed all of that but checks run locally for me so I'll merge.
Basically, the remaining task is to figure out how to transfer an R-side HDF5Array inside a SCE to a h5py.Dataset inside an AnnData object. The problem is that my attempts to assign a h5py.Dataset to the .X member of an AnnData object have always led to the realization of the former into a Numpy array, which defeats the point of having HDF5-backed data.
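For anyone following along, here is a minimal sketch of what "realization" means on the Python side, using only h5py and NumPy (no AnnData involved). The file path and the dataset name "X" are made up for illustration; this is not code from the package.

```python
import os
import tempfile

import h5py
import numpy as np

# Write a small throwaway HDF5 file (hypothetical path and dataset name).
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(path, "w") as handle:
    handle.create_dataset("X", data=np.arange(20, dtype="float64").reshape(4, 5))

with h5py.File(path, "r") as handle:
    dset = handle["X"]

    # A Dataset is only a handle onto the file; no values have been read yet.
    is_backed = isinstance(dset, h5py.Dataset)

    # Slicing reads just the requested block from disk into a NumPy array.
    first_rows = dset[0:2, :]

    # Coercing to NumPy reads the whole dataset into memory. This is the
    # "realization" that an HDF5-backed workflow is trying to avoid.
    realized = np.asarray(dset)

print(type(dset).__name__, type(first_rows).__name__, type(realized).__name__)
```

Anything that coerces its input to a NumPy array on assignment would take the last path, so the on-disk backing is lost at that point.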
Closes #13. When converting from AnnData to SCE, if the former has HDF5-backed matrices, these automatically cause the creation of HDF5Arrays (but only when those backed matrices are opened with mode="r"). This avoids any need to load the data into memory prior to the transfer into R.
In addition, there is some slightly better support for the reverse process. I couldn't figure out how to get the AnnData() constructor to accept a h5py.Dataset, but I did load the HDF5 file's contents into a Numpy array directly in Python; this avoids the need to load it in R before transferring to Python and saves a copy.
(Ideally, though, we would have AnnData accept a backed Dataset, as this is the direct analogue for a HDF5Array.)
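A rough sketch of the "load directly in Python" idea, assuming the R side only hands over a file path and a dataset name. The helper `load_hdf5_matrix` and the demo file are hypothetical and only illustrate the approach; they are not the code in this PR.

```python
import os
import tempfile

import h5py
import numpy as np


def load_hdf5_matrix(path, name):
    """Read one whole HDF5 dataset into an in-memory NumPy array.

    Hypothetical helper: `path` is the HDF5 file and `name` is the
    dataset inside it.
    """
    # Open read-only so the file on disk cannot be modified by accident.
    with h5py.File(path, "r") as handle:
        # Indexing with an empty tuple reads the full dataset.
        return handle[name][()]


# Build a tiny demo file so the sketch can be run end to end.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as handle:
    handle.create_dataset("counts", data=np.ones((3, 2), dtype="float64"))

mat = load_hdf5_matrix(demo_path, "counts")
print(mat.shape, mat.dtype)
```

The resulting array could then be passed as `X` to the `AnnData()` constructor. The data still ends up in memory, but only once and only on the Python side.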