Closed iosonofabio closed 3 years ago
Could you give us a bit more information about your environment? Something like the output of
from sinfo import sinfo
sinfo(dependencies=True)
This looks a lot like some issues we'd been having with h5py 3.x vs 2.x, but to the best of my knowledge those had been fixed in recent releases of anndata.
On my machine, in an environment created with:
conda create -yn python3.9 python=3.9
conda activate python3.9
pip install anndata sinfo
>>> import anndata
>>> from sinfo import sinfo
>>> adata = anndata.read_h5ad("./tmp.h5ad")
>>> adata.var_names
Index(['Rp1', 'Sox17', 'Lypla1', 'Gm37988', 'Tcea1', 'Rgs20', 'Atp6v1h',
       'Rb1cc1', '4732440D04Rik', 'St18',
       ...
       'Uty', 'Ddx3y', 'Dcc', 'Gm960', 'Slc22a12', 'Ptgdr2', 'Slit1', 'Sec31b',
       'E330013P04Rik', 'cdh5-Tdtomato'],
      dtype='object', name='index', length=17132)
>>> sinfo(dependencies=True)
-----
anndata 0.7.5
sinfo 0.3.1
-----
anndata 0.7.5
cython_runtime NA
dateutil 2.8.1
h5py 3.1.0
natsort 7.1.1
numpy 1.20.1
packaging 20.9
pandas 1.2.2
pytz 2021.1
scipy 1.6.0
sinfo 0.3.1
six 1.15.0
-----
Python 3.9.1 (default, Dec 11 2020, 06:28:49) [Clang 10.0.0 ]
macOS-10.15.7-x86_64-i386-64bit
16 logical CPU cores, i386
-----
Session information updated at 2021-02-10 14:26
Thank you:
import anndata
import sinfo
ModuleNotFoundError: No module named 'sinfo'
Not using conda, if that was your question.
Might this be useful?
anndata.__version__
'0.7.4'
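If sinfo is not installed, a rough stand-in for the version report can be put together from the standard library alone. This is only a sketch, not what sinfo does internally: the helper name installed_versions and the package list are illustrative assumptions, and it reports distribution versions rather than the modules actually imported in the session.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}.

    Hypothetical helper, not part of anndata or sinfo.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The package list is an assumption; adjust it to the environment in question.
    report = installed_versions(["anndata", "h5py", "pandas", "numpy"])
    for name, version in report.items():
        print(f"{name:10s} {version or 'not installed'}")
```

Pasting that output alongside the Python version should give roughly the information requested above, in particular the anndata and h5py versions.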
Yep, it's fixed in 0.7.5, closing, thank you.
Thanks for the update, glad the issue is fixed!
It seems to have reappeared when upgrading the h5py to version 3.3.
@liuzj039, I'm not seeing this behavior with h5py 3.3. Could you open a new issue with a replicable example of what you're seeing?
Oh, and when I downgraded my h5py to 3.1 and then upgraded to 3.3, it was fixed. It seems to be caused by my terrible environment. Many thx!
As an aside, I was having a similar issue in a different context, but going from AnnData 0.7.4 to 0.7.6 seems to have fixed it. Thank you.
Hi all,
Thanks for the amazing package!
Just updated to Python 3.9 since numba has fixed their side last month. Most adata strings (e.g. var_names, obs_names, and all column contents in the respective dataframes) are now parsed as bytes.

I skimmed through anndata's code and found there is already some fiddling with string encoding, so I suspect something needs fixing there (read_series or thereabout).

Of note, the names of the columns of both adata.var and adata.obs are correctly parsed as strings, not bytes. Not sure why that would be; one would expect them to undergo the same treatment as the metadata itself?

Thank you in advance, Fabio
edit: that seems related to this change in pandas 1.2:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
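For anyone who hits the bytes labels described in this issue and cannot upgrade anndata right away, a stopgap is to decode the labels after loading. The following is a minimal sketch, not the fix that went into anndata: decode_labels is a hypothetical helper, and UTF-8 is an assumed encoding.

```python
def decode_labels(labels, encoding="utf-8"):
    """Return labels as a list of str, decoding any bytes entries.

    Hypothetical helper. Entries that are already str pass through unchanged.
    numpy.bytes_ subclasses bytes, so such values are decoded as well.
    """
    return [
        label.decode(encoding) if isinstance(label, bytes) else label
        for label in labels
    ]


# A mix of bytes and str, similar to what the issue describes.
raw = [b"Rp1", b"Sox17", "Lypla1"]
fixed = decode_labels(raw)
print(fixed)  # ['Rp1', 'Sox17', 'Lypla1']
```

On a loaded object this might be applied as adata.var_names = decode_labels(adata.var_names), assuming the installed anndata accepts a plain list there. Upgrading to anndata 0.7.5 or later, as reported in this thread, is the cleaner route.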