icbi-lab / infercnvpy

Infer copy number variation (CNV) from scRNA-seq data. Plays nicely with Scanpy.
https://infercnvpy.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

gtfparse dependency causes issues with latest numpy, pandas and pyarrow #143

Open zktuong opened 6 days ago

zktuong commented 6 days ago

Report

Hi @grst,

Just wanted to flag that with a fresh installation of the latest pandas/numpy in the environment, importing infercnvpy fails with AttributeError: _ARRAY_API not found, similar to the problem reported here: https://github.com/spyder-ide/spyder/issues/22187

To reproduce:

mamba create -n testi "python=3.11"
mamba activate testi
pip install infercnvpy
python
import infercnvpy
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/infercnvpy/__init__.py", line 5, in <module>
    from . import datasets, io, pl, pp, tl
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/infercnvpy/datasets/__init__.py", line 5, in <module>
    import scanpy as sc
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/scanpy/__init__.py", line 20, in <module>
    from ._utils import check_versions
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/scanpy/_utils/__init__.py", line 27, in <module>
    from anndata import __version__ as anndata_version
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/anndata/__init__.py", line 24, in <module>
    from ._core.anndata import AnnData
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/anndata/_core/anndata.py", line 18, in <module>
    import pandas as pd
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pandas/__init__.py", line 26, in <module>
    from pandas.compat import (
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pandas/compat/__init__.py", line 27, in <module>
    from pandas.compat.pyarrow import (
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pandas/compat/pyarrow.py", line 8, in <module>
    import pyarrow as pa
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
AttributeError: _ARRAY_API not found

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/infercnvpy/__init__.py", line 5, in <module>
    from . import datasets, io, pl, pp, tl
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/infercnvpy/datasets/__init__.py", line 5, in <module>
    import scanpy as sc
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/scanpy/__init__.py", line 20, in <module>
    from ._utils import check_versions
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/scanpy/_utils/__init__.py", line 27, in <module>
    from anndata import __version__ as anndata_version
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/anndata/__init__.py", line 24, in <module>
    from ._core.anndata import AnnData
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/anndata/_core/anndata.py", line 18, in <module>
    import pandas as pd
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pandas/__init__.py", line 49, in <module>
    from pandas.core.api import (
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pandas/core/api.py", line 9, in <module>
    from pandas.core.dtypes.dtypes import (
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pandas/core/dtypes/dtypes.py", line 24, in <module>
    from pandas._libs import (
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/testi/lib/python3.11/site-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
AttributeError: _ARRAY_API not found

I found that it's because gtfparse has a very strict pyarrow requirement, so pip ends up installing an older pyarrow build compiled against NumPy 1.x next to NumPy 2.0.2. Related to: https://github.com/openvax/gtfparse/issues/47

Forcing installation with pyarrow>=17.0.0 gets around this, but I then have to live with the pip dependency resolver error.

Version information

No response

grst commented 5 days ago

Thanks for reporting. gtfparse again... I should try to get rid of that package somehow.

grst commented 5 days ago

I believe the best solution is to introduce alternative ways of retrieving the genomic positions, see https://github.com/icbi-lab/infercnvpy/issues/144