Closed emdann closed 1 year ago
Hey @emdann,
This seems to relate to the ambiguity of backed containers that have parts that are not backed. So far this case is not specified and should probably be treated as undefined behaviour. It is also somewhat related to https://github.com/scverse/muon/issues/19, which adds to the complexity of backed objects in their current version.
The workflow you describe is supported by the current versions of mudata by loading and writing back individual modalities:
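```python
import mudata
import scanpy as sc

# Load only the RNA modality into memory, process it, and write it back
adata = mudata.read_h5ad('data/pbmc10k_multiome.h5mu', mod='rna', backed=False)
sc.pp.normalize_total(adata, target_sum=10e4)
sc.pp.log1p(adata)
mudata.write_h5ad('data/pbmc10k_multiome.h5mu', 'rna', adata)
```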
That being said, I tried to add checks so that `.X` is also written when a modality is not backed. So your code should work as you expected now.
Hello,

A good use case for loading h5mu objects in backed mode is to handle/preprocess a single modality while keeping the other modalities in backed mode. However, in the current implementation, `write_h5mu` does not allow rewriting a modified/processed `.X` for a single modality if the MuData object is in backed mode.

Example

Now the `.X` of the RNA modality stores the normalized data.

If I save and reload, the `.X` stores the raw counts.

Peeking at the code, it looks like `write_h5mu` only checks whether the full MuData is backed, not whether individual modality objects are backed.

https://github.com/scverse/mudata/blob/83188a3c38d36b7056a98f650010bbb313eca1ba/mudata/_core/mudata.py#L1176-L1181
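
To make the distinction above concrete, here is a minimal toy model of the two kinds of check. It is only a sketch: `ToyModality`, `ToyContainer`, and both functions are hypothetical names invented for illustration, and none of this reproduces mudata's actual code. It only shows why a container-level check can skip a modality that was modified in memory, while a per-modality check would not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModality:
    """Hypothetical stand-in for one modality; not an AnnData object."""
    X: list
    backed: bool = False  # True: X is assumed to already live on disk


@dataclass
class ToyContainer:
    """Hypothetical stand-in for a multimodal container; not a MuData object."""
    mod: dict = field(default_factory=dict)
    backed: bool = False


def matrices_to_write_container_check(container):
    """Container-level check: skip every X as soon as the container is backed."""
    if container.backed:
        return []
    return list(container.mod)


def matrices_to_write_per_modality_check(container):
    """Per-modality check: skip X only for modalities that are themselves backed."""
    return [name for name, m in container.mod.items() if not m.backed]


# A backed container whose 'rna' modality was loaded into memory and modified.
container = ToyContainer(
    mod={
        "rna": ToyModality(X=[0.5, 1.2], backed=False),
        "atac": ToyModality(X=[3, 0], backed=True),
    },
    backed=True,
)

print(matrices_to_write_container_check(container))     # []
print(matrices_to_write_per_modality_check(container))  # ['rna']
```

In this toy setup, the container-level check writes no matrices at all, so the modified `rna` data would be lost on reload, whereas the per-modality check selects `rna` for writing.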

A quick workaround here is to save the processed/modified data matrix in `mdata.mod['rna'].layers`, but the current behaviour can be confusing. Either a fix or an informative warning (i.e. flagging that, since the object is in backed mode, data matrices are not overwritten) would be useful here.