frankligy / scTriangulate

scTriangulate is a Python package to mix-and-match conflicting clustering results in single cell analysis and generate reconciled clustering solutions
MIT License

ValueError: '_index' is a reserved name for dataframe columns. #22

Open frankligy opened 1 year ago

frankligy commented 1 year ago

adata.write('test.h5ad')

Traceback (most recent call last):
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/utils.py", line 214, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/specs/registry.py", line 175, in write_elem
    _REGISTRY.get_writer(dest_type, t, modifiers)(f, k, elem, *args, **kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/specs/registry.py", line 24, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/specs/methods.py", line 497, in write_dataframe
    raise ValueError(f"{reserved!r} is a reserved name for dataframe columns.")
ValueError: '_index' is a reserved name for dataframe columns.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_core/anndata.py", line 1924, in write_h5ad
    as_dense=as_dense,
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/h5ad.py", line 97, in write_h5ad
    write_elem(f, "raw", adata.raw, dataset_kwargs=dataset_kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/utils.py", line 214, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/specs/registry.py", line 175, in write_elem
    _REGISTRY.get_writer(dest_type, t, modifiers)(f, k, elem, *args, **kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/specs/registry.py", line 24, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/specs/methods.py", line 259, in write_raw
    write_elem(g, "var", raw.var, dataset_kwargs=dataset_kwargs)
  File "/opt/anaconda3/envs/sctri_env/lib/python3.7/site-packages/anndata/_io/utils.py", line 224, in func_wrapper
    ) from e
ValueError: '_index' is a reserved name for dataframe columns.

Above error raised while writing key 'var' of <class 'h5py._hl.group.Group'> to /

frankligy commented 1 year ago

So this is an anndata issue, not a scTriangulate issue. It is described in detail in this Stack Overflow post (https://stackoverflow.com/questions/70234014/valueerror-index-is-a-reserved-name-for-dataframe-columns).

I have also run into write errors when using anndata, so I implemented a function called make_sure_adata_writable (https://sctriangulate.readthedocs.io/en/latest/api.html#make-sure-adata-writable). It takes care of two things: (a) it makes sure the var and obs indices have no name, and (b) it makes sure there are no mixed-type columns in either var or obs. Unfortunately, it does not yet handle the issue raised here, and, more importantly, it also needs to handle adata.raw, because raw contains its own copy of var. For now, please use the following code:

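from sctriangulate.preprocessing import make_sure_adata_writable

make_sure_adata_writable(adata, delete=False)
adata.var.rename(columns={'_index': 'index'}, inplace=True)
if adata.raw is not None:  # only if the raw slot exists
    adata.raw.var.rename(columns={'_index': 'index'}, inplace=True)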

I will also update make_sure_adata_writable to handle this in the next release.
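One caveat with the rename above: if var already has a column called 'index', renaming '_index' to 'index' would leave two columns with the same name. Below is a minimal sketch of how the new name could be picked safely. The helper reserved_column_renames is hypothetical (it is not part of scTriangulate or anndata), and the '_1', '_2' suffix scheme is just an assumption for illustration. It only works on column names, so it runs without anndata installed.

```python
def reserved_column_renames(columns, reserved="_index", replacement="index"):
    """Build a {old: new} mapping that renames `reserved` without clobbering
    an existing column. Returns an empty dict when `reserved` is absent.

    Hypothetical helper for illustration; not part of scTriangulate or anndata.
    """
    existing = set(columns)
    if reserved not in existing:
        return {}
    candidate = replacement
    suffix = 1
    # Append _1, _2, ... until the new name is unused.
    while candidate in existing:
        candidate = f"{replacement}_{suffix}"
        suffix += 1
    return {reserved: candidate}


# Possible usage with an AnnData object named `adata` (assumed to exist):
#   adata.var.rename(columns=reserved_column_renames(adata.var.columns), inplace=True)
#   if adata.raw is not None:
#       adata.raw.var.rename(
#           columns=reserved_column_renames(adata.raw.var.columns), inplace=True)

print(reserved_column_renames(["_index", "gene_ids"]))  # {'_index': 'index'}
print(reserved_column_renames(["_index", "index"]))     # {'_index': 'index_1'}
print(reserved_column_renames(["gene_ids"]))            # {}
```

Because the mapping is empty when '_index' is not present, passing it to rename is a no-op in that case, so the same call can be applied to both var and raw.var.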