Open DriesSchaumont opened 1 year ago
I was wondering what the effect is on .obs and .var when saving an AnnData object to a modality of an existing MuData file. It seems like they get updated:
>>> import mudata
>>> import pandas as pd
>>> from anndata import AnnData
>>>
>>> def test_mudata():
...     df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=["obs1", "obs2"], columns=["var1", "var2", "var3"])
...     obs = pd.DataFrame([["A", "sample1"], ["B", "sample2"]], index=df.index, columns=["Obs", "sample_id"])
...     var = pd.DataFrame([["a", "sample1"], ["b", "sample2"], ["c", "sample1"]],
...                        index=df.columns, columns=["Feat", "sample_id_var"])
...     obsm = {"obsm_key": pd.DataFrame([["foo", "bar"], ["lorem", "ipsum"]],
...                                      index=obs.index, columns=["obsm_col1", "obsm_col2"])}
...     ad1 = AnnData(df, obs=obs, var=var, obsm=obsm)
...     var2 = pd.DataFrame(["d", "e", "g"], index=df.columns, columns=["Feat"])
...     obs2 = pd.DataFrame(["C", "D"], index=df.index, columns=["Obs"])
...     ad2 = AnnData(df, obs=obs2, var=var2)
...     return mudata.MuData({'mod1': ad1, 'mod2': ad2})
...
>>>
>>> test_data = test_mudata()
/home/di/code/openpipeline/.venv/lib/python3.10/site-packages/mudata/_core/mudata.py:491: UserWarning: Cannot join columns with the same name because var_names are intersecting.
warnings.warn(
>>> test_data.write_h5mu("test.h5mu")
/home/di/code/openpipeline/.venv/lib/python3.10/site-packages/mudata/_core/mudata.py:491: UserWarning: Cannot join columns with the same name because var_names are intersecting.
warnings.warn(
>>>
>>> test_getting_modality = mudata.read("test.h5mu/mod1")
>>> test_getting_modality.obs["test"] = pd.Series(["pekkie", "flip"], name="test_col", index=pd.Index(["obs1", "obs2"]))
>>> mudata.write_h5ad("test.h5mu", mod="mod1", data=test_getting_modality)
>>>
>>> test_result_of_alteration = test_getting_modality = mudata.read("test.h5mu")
/home/di/code/openpipeline/.venv/lib/python3.10/site-packages/mudata/_core/mudata.py:491: UserWarning: Cannot join columns with the same name because var_names are intersecting.
warnings.warn(
>>> test_result_of_alteration.obs
      mod1:Obs  mod1:sample_id  mod1:test  mod2:Obs
obs1         A         sample1     pekkie         C
obs2         B         sample2       flip         D
One caveat is that the compression of the output file cannot be changed without reading in the whole file. This would mean that we render the --compression arguments useless. As an alternative, could a compression component be implemented?
https://mudata.readthedocs.io/en/latest/api/generated/mudata.read.html
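To make the idea of a separate compression component a bit more concrete, here is a minimal sketch of what it might look like. The names build_parser, recompress and main, as well as the --input and --output arguments, are hypothetical and not part of any existing component. It assumes that MuData.write_h5mu forwards a compression keyword to the underlying HDF5 datasets, which should be verified against the installed mudata version. As noted above, it still has to read the whole file into memory.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface of the hypothetical compression component."""
    parser = argparse.ArgumentParser(
        description="Rewrite an .h5mu file with a different compression."
    )
    parser.add_argument("--input", required=True, help="Existing .h5mu file.")
    parser.add_argument("--output", required=True, help="Rewritten .h5mu file.")
    parser.add_argument(
        "--compression",
        choices=["gzip", "lzf", "none"],
        default="gzip",
        help="Compression to apply to the output file.",
    )
    return parser


def recompress(input_path: str, output_path: str, compression: str) -> None:
    """Read the whole MuData object and write it back with new compression."""
    # Imported lazily so that argument parsing works without mudata installed.
    import mudata

    mdata = mudata.read_h5mu(input_path)
    # Assumption: extra keyword arguments are passed on as HDF5 dataset options.
    kwargs = {} if compression == "none" else {"compression": compression}
    mdata.write_h5mu(output_path, **kwargs)


def main(argv=None) -> None:
    """Parse the arguments and run the recompression."""
    args = build_parser().parse_args(argv)
    recompress(args.input, args.output, args.compression)
```

Calling main(["--input", "test.h5mu", "--output", "test_gzip.h5mu", "--compression", "gzip"]) would then rewrite the file from the example above. The components that only touch a single modality could keep using mudata.write_h5ad and leave the compression to this separate step.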