scverse / anndata

Annotated data.
http://anndata.readthedocs.io
BSD 3-Clause "New" or "Revised" License

0.10.6 - Cannot write files due to implicit conversion to str dtype. #1417

Open jday1 opened 6 months ago

jday1 commented 6 months ago


Report

The new ValueError check introduced in 0.10.6 causes anndata.AnnData to error out on write.

This is because the index is implicitly converted to str by anndata/_core/aligned_df.py:_gen_dataframe_df.
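To illustrate the mismatch without going through anndata, here is a minimal pandas-only sketch. It assumes the implicit conversion behaves like `Index.astype(str)`; the actual code path in `_gen_dataframe_df` may differ in detail.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2, 3]})
df.index = df.column1  # index is named "column1" and holds ints

# Assumption: stand in for the implicit conversion by casting the index to str.
converted = df.copy()
converted.index = converted.index.astype(str)

# Before the cast the index and the column hold the same values;
# afterwards the index holds strings while the column still holds ints.
same_before = df.index.tolist() == df["column1"].tolist()
same_after = converted.index.tolist() == converted["column1"].tolist()

print(same_before, same_after)
```

The index keeps the name `column1` through the cast, so a check that compares the index values against the column of the same name sees different values.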

To prevent this, one needs to also do this implicit conversion before performing this check. I have done this in PR #1418 which currently has failing tests and is in draft.
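One possible reading of this proposal, sketched with pandas only: cast both sides to str before comparing. `index_conflicts_with_column` is a hypothetical helper written for illustration. It is not the check in anndata and not necessarily what PR #1418 does.

```python
import pandas as pd


def index_conflicts_with_column(df: pd.DataFrame) -> bool:
    """Hypothetical helper, not anndata's actual check.

    Returns True when the index name is also a column name and the two
    differ even after both are converted to str.
    """
    name = df.index.name
    if name is None or name not in df.columns:
        return False
    # Apply the same str conversion to both sides before comparing.
    index_as_str = df.index.astype(str).tolist()
    column_as_str = df[name].astype(str).tolist()
    return index_as_str != column_as_str


df = pd.DataFrame({"column1": [1, 2, 3]})
df.index = df.column1.astype(str)  # stand-in for the implicit conversion

conflict = index_conflicts_with_column(df)
print(conflict)
```

With this comparison, an index that is just the str form of the column is not reported as a conflict, while an index with genuinely different values still is.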

Code:


import anndata
import pandas as pd

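df = pd.DataFrame({"column1": [1, 2, 3]})

# Reuse the column as the index; the index inherits the name "column1".
df.index = df.column1

adata = anndata.AnnData(obs=df)

# Raises ValueError on anndata 0.10.6.
adata.write_h5ad("cannot_write.h5ad")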

Traceback:

.../lib/python3.10/site-packages/anndata/_io/specs/methods.py
ValueError: DataFrame.index.name ('column1') is also used by a column whose values are different. This is not supported. Please make sure the values are the same, or use a different name.
Error raised while writing key 'obs' of <class 'h5py._hl.group.Group'> to /
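Until this is resolved, a user-side workaround is to follow the error message's own advice and give the index a different name. A minimal sketch, using pandas only; the anndata calls are left as comments, on the assumption that the write succeeds once the names no longer collide.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2, 3]})
df.index = df.column1

# Clear the index name so it no longer collides with the "column1" column.
df.index.name = None

# Assumed follow-up, not executed here:
# adata = anndata.AnnData(obs=df)
# adata.write_h5ad("can_write.h5ad")
```

The column and the index values are unchanged; only the name clash is removed.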

Versions

-----
anndata             0.10.6
session_info        1.0.0
-----
astunparse          1.6.3
awkward             2.3.2
awkward_cpp         NA
cython_runtime      NA
dateutil            2.8.2
dill                0.3.7
exceptiongroup      1.1.2
h5py                3.10.0
mpmath              1.3.0
natsort             8.4.0
numexpr             2.9.0
numpy               1.26.4
packaging           23.1
pandas              2.2.0
pyarrow             15.0.0
pynvml              11.5.0
pytz                2024.1
regex               2.5.140
scipy               1.12.0
setuptools_scm      NA
six                 1.16.0
sympy               1.12
torch               2.1.0+cu121
torchgen            NA
tqdm                4.66.2
typing_extensions   NA
zope                NA
-----
Python 3.10.4 (main, May  3 2023, 14:08:02) [GCC 11.3.0]
Linux-6.6.10-76060610-generic-x86_64-with-glibc2.35
-----
Session information updated at 2024-03-12 14:21

ivirshup commented 6 months ago

Do you have a use case this is causing to fail? I see how this could work, but it seems error-prone when doing joins or resetting the index.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!