Closed adrianonobre closed 2 weeks ago
potentially caused by this: https://github.com/numpy/numpy/commit/b3f9fc0ffed50ac437a2f09ecffeb6709a2487c8
same issue
We are not planning on adding numpy 2 support to pandas 1.2.5 since numpy 2 compat is a pretty involved process and pandas 1.2.5 is already 3 years old at this point.
Please upgrade to pandas 2.2.2 as that is the first pandas version to support numpy 2.
Hi @lithomas1, I think there's a misunderstanding (I'll remove the numpy 2.0 mention from the title): I'm not trying to use numpy 2. The issue is rather that the pandas 1.2.5 build no longer works (with numpy 1.26, see notes in the "installation logs" section).
I was just pointing out that a change made for numpy 2 seems to have bled into numpy 1.26 as a breaking change, potentially affecting projects that depend on numpy 1.26.
Thanks for looking into this
Sorry for the misunderstanding.
Did you mention that this started failing after numpy 2.0 was released? (If so, this might be because the build of pandas 2.0 is pulling the newest numpy, not the one you have installed.)
No worries! (and thanks again for your time, @lithomas1 )
Correct. We noticed the pandas build started failing yesterday "out of the blue". We have dependency versions pinned in our requirements file, and we didn't make any changes to them. Here are a couple of them: numpy==1.26.4 and pandas==1.2.5
Investigating a bit, we noticed that the NumPy project did a release this week (2.0), and we found this change, which seems to align with the error message we're getting in the pandas build (i.e. a missing struct member, elsize):
pandas/_libs/src/ujson/python/JSONtoObj.c:260:33: error: no member named 'elsize' in 'struct _PyArray_Descr'
npyarr->elsize = dtype->elsize;
~~~~~ ^
The easiest way a colleague found to repro this is as follows:
# make virtual env (make sure python is 3.11)
python -m venv ./sample-venv
# activate virtual env
. ./sample-venv/bin/activate
# install same packages we do in the project (relevant to pandas/numpy)
pip install --no-cache six==1.16.0
pip install --no-cache pytz==2024.1
pip install --no-cache python-dateutil==2.9.0.post0
pip install --no-cache numpy==1.26.4
pip install --no-cache cython==0.29.21
# blows up
pip install --no-cache pandas==1.2.5
Can you try installing pandas with --no-build-isolation?
(You'll need all dependencies pre-installed, but this should force pip to use your numpy, and not pull its own numpy)
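A quick pre-flight check along these lines may help before retrying. With --no-build-isolation, pip skips installing the build requirements itself, so numpy and Cython need to be importable first. This is only a sketch: the function name check_build_deps is made up for illustration, and it assumes the python on PATH is the interpreter that pip installs into.

```shell
# Hypothetical helper: report whether each build dependency can be imported
# by the interpreter on PATH (assumed to be the one pip installs into).
check_build_deps() {
    for module in numpy Cython; do
        if python -c "import ${module}" >/dev/null 2>&1; then
            echo "${module}: available"
        else
            echo "${module}: MISSING, install it before using --no-build-isolation"
        fi
    done
}

check_build_deps
```

If either line reports MISSING, installing that package first (as in the repro steps above) is probably the thing to try before the pandas install.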
For what it's worth, this problem seems to be due to this line in pyproject.toml and can be fixed by changing it to "numpy<2; python_version>='3.9'"
Yeah, this is maybe something to consider in the future, but for now the --no-build-isolation step should fix it.
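For anyone who would rather keep build isolation and not edit pyproject.toml, one possible alternative is to constrain numpy in the build environment from the outside. This is an untested sketch and not something confirmed in this thread. It assumes pip picks up the PIP_CONSTRAINT environment variable inside the isolated build environment as well; the file name build-constraints.txt is arbitrary.

```shell
# Arbitrary file name (an assumption for this sketch); an absolute path is used
# so it still resolves if pip runs the build from another directory.
constraints_file="$PWD/build-constraints.txt"

# Keep any numpy that pip resolves on the 1.x series.
printf 'numpy<2\n' > "$constraints_file"

# Assumption: PIP_CONSTRAINT is also honoured in the isolated build environment.
# Uncomment to run the real install (needs network access and a C compiler):
# PIP_CONSTRAINT="$constraints_file" pip install --no-cache pandas==1.2.5
```

The intent is the same as the "numpy<2" change suggested for pyproject.toml, just applied without modifying the pandas source distribution.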
Can you try installing pandas with --no-build-isolation
NEWEST EDIT:
Workaround: We were able to get it working by using "--no-build-isolation" and bumping the Cython version to 0.29.37:
# make virtual env (make sure python is 3.11)
python -m venv ./sample-venv
# activate virtual env
. ./sample-venv/bin/activate
# install same packages we do in the project (relevant to pandas/numpy)
pip install --no-cache six==1.16.0
pip install --no-cache pytz==2024.1
pip install --no-cache python-dateutil==2.9.0.post0
pip install --no-cache numpy==1.26.4
pip install --no-cache cython==0.29.37  # IMPORTANT: newer Cython version
pip install --no-cache pandas==1.2.5 --no-build-isolation  # IMPORTANT: --no-build-isolation flag
# WORKS NOW!
OLD:
I tried pip install pandas==1.2.5 --no-build-isolation and got a different error:
(EDIT: fwiw, I was on Python 3.11 when I got the error below; someone reported that it worked for them on Python 3.9)
pandas/_libs/algos.c:235:12: fatal error: 'longintrepr.h' file not found
#include "longintrepr.h"
^~~~~~~~~~~~~~~
Can you try upgrading your Cython?
This looks like https://github.com/aio-libs/aiohttp/issues/6600, which someone reports was fixed in Cython 0.29.5
Thanks for your help @lithomas1 (and @mttr ) ! 🙏
Hi, @lithomas1 and @mttr,
I am facing the same issue: the pandas build started failing a couple of days ago. I tried the solution mentioned above, but using --no-build-isolation gives me the error ModuleNotFoundError: No module named 'numpy'
Some more context on our problem: we are trying to build pandas 1.2.4 with Python 3.11, with the versions locked as below:
cython==0.29.37
numpy==1.26.4
pandas==1.2.4
We install it inside a Docker container with pip install --no-build-isolation $REQUIREMENTS, where $REQUIREMENTS is the path to the requirements.txt file, which contains the name of our internal package that uses pandas and numpy.
Any help is much appreciated. Please advise.
Can you try a newer pandas?
pandas 1.2.4 does not officially support Python 3.11 IIRC, so if something non-trivial is going wrong, I can't help you too much (as the version of pandas you are using is very old).
pandas 1.5 should have official wheels for Python 3.11 (and also should be API compatible with pandas 1.2.4)
Installation check
Platform
macOS-14.5-arm64-arm-64bit
Installation Method
pip install
pandas Version
1.2.5
Python Version
Python 3.11.7
Installation Logs