scverse / anndata

Annotated data.
http://anndata.readthedocs.io
BSD 3-Clause "New" or "Revised" License

ValueError: '_index' is a reserved name for dataframe columns. Above error raised while writing key 'var' of to / #990

Closed: denvercal1234GitHub closed this issue 9 months ago

denvercal1234GitHub commented 1 year ago

Hi there,

I was trying to write my adata with write_h5ad, but it threw the error below.

Would you mind helping me fix this issue?

Thank you.

My AnnData object:

[Screenshot of the AnnData object, 2023-05-04 at 17:13:52]

loomGEX_adata.write_h5ad('...Velocity_scVelo_Objects/loomGEX_adata.h5ad')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/utils.py:246, in report_write_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    245 try:
--> 246     return func(*args, **kwargs)
    247 except Exception as e:

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/registry.py:311, in Writer.write_elem(self, store, k, elem, dataset_kwargs, modifiers)
    310 else:
--> 311     return write_func(store, k, elem, dataset_kwargs=dataset_kwargs)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/registry.py:52, in write_spec.<locals>.decorator.<locals>.wrapper(g, k, *args, **kwargs)
     50 @wraps(func)
     51 def wrapper(g, k, *args, **kwargs):
---> 52     result = func(g, k, *args, **kwargs)
     53     g[k].attrs.setdefault("encoding-type", spec.encoding_type)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/methods.py:560, in write_dataframe(f, key, df, _writer, dataset_kwargs)
    559     if reserved in df.columns:
--> 560         raise ValueError(f"{reserved!r} is a reserved name for dataframe columns.")
    561 group = f.create_group(key)

ValueError: '_index' is a reserved name for dataframe columns.

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[13], line 2
      1 # Save file after running scv.tl.recover_dynamics
----> 2 Tonsil_cd8_loomGEX_adata.write_h5ad('/Users/stillhere/Documents/01_DPhil/scRNAseq/T230T240T246_CXCR5Project/Final_Objects_preAzimuth_2023Feb23/RNA_Velocity_scVelo/RNA_Velocity/DATA/RNA_Velocity_scVelo_Objects/Tonsil_cd8_loomGEX_adata.h5ad')
      3 #adata = scv.read('data/pancreas.h5ad')

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_core/anndata.py:1951, in AnnData.write_h5ad(self, filename, compression, compression_opts, as_dense)
   1948 if filename is None:
   1949     filename = self.filename
-> 1951 _write_h5ad(
   1952     Path(filename),
   1953     self,
   1954     compression=compression,
   1955     compression_opts=compression_opts,
   1956     as_dense=as_dense,
   1957 )
   1959 if self.isbacked:
   1960     self.file.filename = filename

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/h5ad.py:91, in write_h5ad(filepath, adata, as_dense, dataset_kwargs, **kwargs)
     87     write_elem(
     88         f, "raw/varm", dict(adata.raw.varm), dataset_kwargs=dataset_kwargs
     89     )
     90 elif adata.raw is not None:
---> 91     write_elem(f, "raw", adata.raw, dataset_kwargs=dataset_kwargs)
     92 write_elem(f, "obs", adata.obs, dataset_kwargs=dataset_kwargs)
     93 write_elem(f, "var", adata.var, dataset_kwargs=dataset_kwargs)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/registry.py:353, in write_elem(store, k, elem, dataset_kwargs)
    329 def write_elem(
    330     store: GroupStorageType,
    331     k: str,
   (...)
    334     dataset_kwargs: Mapping = MappingProxyType({}),
    335 ) -> None:
    336     """
    337     Write an element to a storage group using anndata encoding.
    338 
   (...)
    351         E.g. for zarr this would be `chunks`, `compressor`.
    352     """
--> 353     Writer(_REGISTRY).write_elem(store, k, elem, dataset_kwargs=dataset_kwargs)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/utils.py:248, in report_write_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    246     return func(*args, **kwargs)
    247 except Exception as e:
--> 248     re_raise_error(e, elem, key)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/utils.py:246, in report_write_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    244         break
    245 try:
--> 246     return func(*args, **kwargs)
    247 except Exception as e:
    248     re_raise_error(e, elem, key)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/registry.py:311, in Writer.write_elem(self, store, k, elem, dataset_kwargs, modifiers)
    302     return self.callback(
    303         write_func,
    304         store,
   (...)
    308         iospec=self.registry.get_spec(elem),
    309     )
    310 else:
--> 311     return write_func(store, k, elem, dataset_kwargs=dataset_kwargs)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/registry.py:52, in write_spec.<locals>.decorator.<locals>.wrapper(g, k, *args, **kwargs)
     50 @wraps(func)
     51 def wrapper(g, k, *args, **kwargs):
---> 52     result = func(g, k, *args, **kwargs)
     53     g[k].attrs.setdefault("encoding-type", spec.encoding_type)
     54     g[k].attrs.setdefault("encoding-version", spec.encoding_version)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/specs/methods.py:261, in write_raw(f, k, raw, _writer, dataset_kwargs)
    259 g = f.create_group(k)
    260 _writer.write_elem(g, "X", raw.X, dataset_kwargs=dataset_kwargs)
--> 261 _writer.write_elem(g, "var", raw.var, dataset_kwargs=dataset_kwargs)
    262 _writer.write_elem(g, "varm", dict(raw.varm), dataset_kwargs=dataset_kwargs)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/utils.py:248, in report_write_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    246     return func(*args, **kwargs)
    247 except Exception as e:
--> 248     re_raise_error(e, elem, key)

File ~/.pyenv/versions/3.10.11/envs/RNAVELO_310/lib/python3.10/site-packages/anndata/_io/utils.py:229, in report_write_key_on_error.<locals>.re_raise_error(e, elem, key)
    227 else:
    228     parent = _get_parent(elem)
--> 229     raise type(e)(
    230         f"{e}\n\n"
    231         f"Above error raised while writing key {key!r} of {type(elem)} "
    232         f"to {parent}"
    233     ) from e

ValueError: '_index' is a reserved name for dataframe columns.

Above error raised while writing key 'var' of  to /

ivirshup commented 1 year ago

It looks like there may be a column named "_index" in adata.raw.var. Removing or renaming it should fix the issue.
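
A minimal sketch of that fix, using a plain pandas DataFrame as a stand-in for adata.raw.var. The helper name rename_reserved_column and the replacement name "features" are illustrative choices, not part of anndata:

```python
import pandas as pd

RESERVED = "_index"  # the column name rejected in the traceback above


def rename_reserved_column(df: pd.DataFrame, new_name: str = "features") -> bool:
    """Rename a reserved '_index' column in place; return True if a rename happened."""
    if RESERVED not in df.columns:
        return False
    if new_name in df.columns:
        raise ValueError(f"{new_name!r} already exists; pick another name")
    df.rename(columns={RESERVED: new_name}, inplace=True)
    return True


# Hypothetical stand-in for adata.raw.var
var = pd.DataFrame(
    {"_index": ["g1", "g2"], "n_cells": [10, 20]},
    index=["g1", "g2"],
)
renamed = rename_reserved_column(var)
print(renamed, list(var.columns))
```

On a real object this would presumably be called as rename_reserved_column(adata.raw.var) when adata.raw is not None, assuming adata.raw.var hands back the underlying DataFrame rather than a copy. Dropping the column with df.drop(columns="_index", inplace=True) is the other option mentioned above.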

weir12 commented 1 year ago

Hi, I believe this is a common issue for anndata users: various tools and operations can inadvertently alter the index of var, so saving fails even though the object works fine in memory. In my opinion, this contradicts most programmers' intuition: if an object is valid in memory, it should be allowed to persist or at least trigger a warning during saving instead of outright failure. This is frustrating, particularly when the fatal error only appears at the very end of an analysis, so that one has to check the object or set breakpoints before every save.

Thank you.
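
The kind of pre-save check described in this comment could look roughly like the sketch below. It only reports problems and does not change anything. The function name find_reserved_columns is hypothetical, the frames are pandas stand-ins for adata.obs, adata.var and adata.raw.var, and the reserved set contains only the one name seen in this thread:

```python
import pandas as pd

RESERVED = {"_index"}  # only the name from this thread; anndata may reserve others


def find_reserved_columns(frames: dict[str, pd.DataFrame]) -> dict[str, list[str]]:
    """Map each frame label to the reserved column names it contains."""
    problems = {}
    for label, df in frames.items():
        hits = [str(c) for c in df.columns if c in RESERVED]
        if hits:
            problems[label] = hits
    return problems


# Hypothetical stand-ins for adata.obs, adata.var and adata.raw.var
frames = {
    "obs": pd.DataFrame({"cluster": ["a", "b"]}),
    "var": pd.DataFrame({"n_cells": [10, 20]}),
    "raw.var": pd.DataFrame({"_index": ["g1", "g2"]}),
}
problems = find_reserved_columns(frames)
print(problems)
```

Running such a check right after loading or converting data, rather than just before write_h5ad, would surface the problem before any long computation starts.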

flying-sheep commented 11 months ago

> if an object is valid in memory, it should be allowed to persist or at least trigger a warning during saving instead of outright failure.

Oh I wish it was that easy. But attempting to make sure that all those nested objects trigger some kind of hook in their parent AnnData object when modified (which would allow us to catch incompatibilities before they end up in the AnnData object) is sadly very hard.

We can certainly work on making messages friendlier, such as the borked second one here:

ValueError: '_index' is a reserved name for dataframe columns.

Above error raised while writing key 'var' of  to /

What does "of  to" mean? Clearly something should be between these spaces. I’m confused how that can be an empty string. Which type returns '' when stringified?
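
One possible explanation, which nothing in this thread confirms, is that the string was never empty: the traceback shows the message being built with {type(elem)}, and a Python type stringifies to text in angle brackets, which a Markdown or HTML renderer may swallow as an unknown tag. The sketch below imitates that with a regular expression. The dict stand-in for elem and the tag-stripping step are assumptions made for illustration:

```python
import re

key, parent = "var", "/"
elem = {}  # stand-in object; its type stringifies as <class 'dict'>

# Same f-string shape as in re_raise_error in the traceback above
message = f"Above error raised while writing key {key!r} of {type(elem)} to {parent}"
print(message)

# Crude imitation of a renderer that drops anything shaped like an HTML tag
stripped = re.sub(r"<[^>]*>", "", message)
print(stripped)
```

The second printed line has the same "of  to" gap as the message quoted above, which would be consistent with the type name being lost when the issue was rendered rather than when the error was raised.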

flying-sheep commented 9 months ago

OK, the poor error messages should be fixed as of https://github.com/scverse/anndata/pull/1273