brainfo opened this issue 1 year ago
A similar issue was brought up on the discourse.

An easy way to work around this is to store your data using the zarr format instead of hdf5 (e.g. anndata.read_zarr, anndata.write_zarr).
A better solution will take some effort. Here's some prior discussion from h5py: https://github.com/h5py/h5py/issues/1053. The maximum size for metadata on an hdf5 object can be increased using the H5Pset_attr_phase_change function in the C API. h5py has wrapped this at the cython level, but has not exposed it from the main API (https://github.com/h5py/h5py/pull/1638).
I believe we would need to:

- get this exposed in h5py's public API
- use it from the anndata Python API

I have the same issue. The zarr workaround works.
Being able to store sparse matrices with a certain chunk size would be great, though. I think that's not possible at the moment, due to this line. Maybe this comment could help.
This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!
@selmanozleyen, this is the h5py issue I was talking about. Do you think you could take a look at this?
For Pearson residuals, would it be feasible to store the pearson_residual_df as a layer and the other parameter values in uns?
@brainfo, oh, for sure. If it's a cells x genes dataframe, I think you could just put it as a numpy array into layers, then call adata.to_df(layer="pearson_residuals") whenever you need the dataframe. I believe this should be zero-copy.
Minimal code sample (that we can copy&paste without having any data)
Write any anndata with pearson residuals in uns
The pearson_residual_df looks like this, with 38291 rows (obs) and 5000 columns (features):
Versions