apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Dev][Archery] archery integration missing numpy #44716

Closed amoeba closed 1 week ago

amoeba commented 1 week ago

Describe the bug, including details regarding any error messages, version, and platform.

The instructions for installing archery to run the integration tests indicate that an appropriate version of archery can be installed with:

pip install -e "dev/archery[integration]"

However, when I try this with a fresh checkout of arrow and a newly created venv, I get an error about missing numpy:

$ gh repo clone apache/arrow
$ cd arrow
$ python3 -m venv .venv
$ source .venv/bin/activate.fish
.venv $ python -m pip install dev/archery[integration]
Processing ./dev/archery
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting click>=7 (from archery==0.1.0)
  Using cached click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting cffi (from archery==0.1.0)
  Using cached cffi-1.17.1-cp313-cp313-macosx_11_0_arm64.whl.metadata (1.5 kB)
Collecting pycparser (from cffi->archery==0.1.0)
  Using cached pycparser-2.22-py3-none-any.whl.metadata (943 bytes)
Using cached click-8.1.7-py3-none-any.whl (97 kB)
Using cached cffi-1.17.1-cp313-cp313-macosx_11_0_arm64.whl (178 kB)
Using cached pycparser-2.22-py3-none-any.whl (117 kB)
Building wheels for collected packages: archery
  Building wheel for archery (pyproject.toml) ... done
  Created wheel for archery: filename=archery-0.1.0-py3-none-any.whl size=149209 sha256=c0e4b22ae01100a0deb82b08d45d8467d0e812d1c93f0c248b56fe371b455530
  Stored in directory: /private/var/folders/hr/w0c05_fn7x5b8nyslfp8tfsw0000gn/T/pip-ephem-wheel-cache-1_lv3wr2/wheels/16/73/6b/43f65db3f3be610e0348b4a725c166c41b7e5cd30c8c2e9f30
Successfully built archery
Installing collected packages: pycparser, click, cffi, archery
Successfully installed archery-0.1.0 cffi-1.17.1 click-8.1.7 pycparser-2.22

[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: pip install --upgrade pip

The install itself succeeds, but I then get an error when I try to run a test:

.venv $ archery integration --run-ipc --with-cpp=1
Traceback (most recent call last):
  File "/Users/bryce/Checkouts/arrow/.venv/bin/archery", line 8, in <module>
    sys.exit(archery())
             ~~~~~~~^^
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/archery/cli.py", line 833, in integration
    from .integration.runner import write_js_test_json, run_all_tests
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/archery/integration/runner.py", line 31, in <module>
    from . import cdata
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/archery/integration/cdata.py", line 24, in <module>
    from .tester import CDataExporter, CDataImporter
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/archery/integration/tester.py", line 25, in <module>
    from .util import log
  File "/Users/bryce/Checkouts/arrow/.venv/lib/python3.13/site-packages/archery/integration/util.py", line 27, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
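
The traceback stops at the first failed import, so it only reports numpy. As a rough sketch (not part of archery), the standard library's `importlib.util.find_spec` can check the active venv for several top-level modules at once. The helper name `missing_modules` is hypothetical, and the list passed to it contains only the module named in the traceback above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import.

    Only top-level names are supported here: find_spec() on a dotted name
    imports the parent package first and may raise instead of returning None.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "numpy" is taken from the traceback above; extend the list as needed.
    print(missing_modules(["numpy"]))
```

Run inside the activated venv, this prints the names that are not importable, or an empty list if all of them are present.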

Component(s)

Developer Tools, Integration

kou commented 1 week ago

Issue resolved by pull request 44717 https://github.com/apache/arrow/pull/44717
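
This thread does not describe what the pull request changed. Regardless, one way to inspect which requirements an installed archery declares for the `integration` extra is to read its package metadata with `importlib.metadata.requires`. The sketch below is a minimal illustration, not the project's own tooling: the helper `requirements_for_extra` is hypothetical, and it matches the `extra == "..."` marker with a simple regular expression rather than a full marker parser.

```python
import re
from importlib import metadata


def requirements_for_extra(requires, extra):
    """Filter requirement strings down to those gated on the given extra.

    `requires` is a list of strings as returned by
    importlib.metadata.requires(), or None if the package declares none.
    """
    pattern = re.compile(r"extra\s*==\s*['\"]" + re.escape(extra) + r"['\"]")
    # Keep only the requirement part before the environment marker.
    return [r.split(";")[0].strip() for r in (requires or []) if pattern.search(r)]


if __name__ == "__main__":
    try:
        declared = metadata.requires("archery")
    except metadata.PackageNotFoundError:
        print("archery is not installed in this environment")
    else:
        print(requirements_for_extra(declared, "integration"))
```

If numpy does not appear in the printed list, `pip install "dev/archery[integration]"` will not pull it in, which would match the install log in the report.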