Open bemoody opened 1 month ago
I can confirm, but it seems to be working on the main branch.
Thanks. But do you mean the bug does not occur with the main branch of numpy, or do you mean the bug does not occur with the main branch of pandas and version 1.26.4 of numpy?
As far as I've seen, this bug doesn't occur with the 2.x releases of numpy, only with the 1.x releases.
I tried doing this:
git clone https://github.com/pandas-dev/pandas
virtualenv v1
./v1/bin/pip install ./pandas
./v1/bin/pip install 'numpy<2'
And I also tried doing this:
virtualenv v2
./v2/bin/pip install --pre --extra-index https://pypi.anaconda.org/scientific-python-nightly-wheels/simple pandas
./v2/bin/pip install 'numpy<2'
Both installations exhibit the bug above.
If you don't see the bug, what platform/interpreter and what versions of pandas and numpy are you using?
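To make answers to that question easy to compare, here is a small sketch that collects the platform, interpreter, and package versions using only the standard library. The helper name `describe_environment` is made up for this thread; `pandas.show_versions()` gives a fuller report if pandas imports cleanly.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("numpy", "pandas")):
    """Return a dict with platform, interpreter, and installed package versions."""
    info = {
        "platform": platform.platform(),
        "implementation": platform.python_implementation(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it with each virtualenv's interpreter (for example ./v1/bin/python) should show which numpy and pandas versions that environment actually resolved to.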
It seems like this is an inconsistency in numpy. Looks like "weak promotion" in 2.x doesn't apply to comparisons, but "weak promotion" in 1.x does apply to comparisons?
numpy 2.x
>>> import numpy
>>> numpy._set_promotion_state('weak')
>>> numpy.int8(1) < 1000
np.True_
numpy 1.x
>>> import numpy
>>> numpy._set_promotion_state('weak')
>>> numpy.int8(1) < 1000
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: Python integer 1000 out of bounds for int8
In pandas, this causes an exception in lib.pyx at either line 1457:
or (oINT64_MIN <= val < 0)
or line 2631:
val > oUINT64_MAX or val < oINT64_MIN):
for example:
>>> pandas._libs.lib.maybe_convert_objects(numpy.array([numpy.int64(1)], dtype=object))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "lib.pyx", line 2631, in pandas._libs.lib.maybe_convert_objects
OverflowError: Python int too large to convert to C long
>>> pandas._libs.lib.maybe_convert_objects(numpy.array([numpy.uint32(1)], dtype=object))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "lib.pyx", line 2628, in pandas._libs.lib.maybe_convert_objects
File "lib.pyx", line 1457, in pandas._libs.lib.Seen.saw_int
OverflowError: Python integer -9223372036854775808 out of bounds for uint32
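One way to picture the int8 and uint32 failures above: under numpy 1.x weak promotion, the Python integer in the comparison appears to be converted to the numpy scalar's dtype first, and that conversion is what raises. The sketch below is a pure-Python model of that behavior, not numpy's actual implementation; `INT_RANGES` and `weak_less_than` are hypothetical names used only for illustration.

```python
# Illustrative model only: this does not call numpy and is not how numpy is implemented.
INT_RANGES = {
    "int8": (-2**7, 2**7 - 1),
    "uint32": (0, 2**32 - 1),
    "int64": (-2**63, 2**63 - 1),
}


def weak_less_than(dtype_name, scalar_value, python_int):
    """Model `numpy_scalar < python_int` as seen with numpy 1.x in weak mode.

    The Python integer must fit in the scalar's dtype before the
    comparison is evaluated; otherwise OverflowError is raised.
    """
    low, high = INT_RANGES[dtype_name]
    if not low <= python_int <= high:
        raise OverflowError(
            f"Python integer {python_int} out of bounds for {dtype_name}"
        )
    return scalar_value < python_int


if __name__ == "__main__":
    print(weak_less_than("int8", 1, 100))
    for dtype_name, bound in (("int8", 1000), ("uint32", -2**63)):
        try:
            weak_less_than(dtype_name, 1, bound)
        except OverflowError as exc:
            print(f"OverflowError: {exc}")
```

In this model, comparing a uint32 value against oINT64_MIN fails because -9223372036854775808 does not fit in uint32, which matches the second traceback. The int64 case at line 2631 fails for the analogous reason (oUINT64_MAX does not fit in int64), although the real error message there is worded differently.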
I cannot reproduce the bug on the main branch of pandas. My environment:
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
pip install numpy==1.26.4 pandas==2.2.3
Issue Description
With numpy 1.26, setting numpy to "weak" or "weak_and_warn" promotion mode (meant to be compatible with the behavior of numpy 2.x) causes internal pandas functions to fail.
For example, the above command to print a trivial DataFrame results in:
This doesn't happen with numpy 1.26 in its default "legacy" mode. It doesn't happen with numpy 2.x in either "legacy" or "weak" mode.
More information about numpy 1.x versus 2.x and promotion modes is documented here: https://numpy.org/devdocs/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion
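Since the failure does not occur in "legacy" mode, a possible stopgap (not a fix) is to switch to legacy promotion just around the pandas call. The sketch below assumes the private functions `numpy._get_promotion_state` and `numpy._set_promotion_state` are available, which is not guaranteed in every numpy release; the context manager name `legacy_promotion` is hypothetical. It takes the numpy module as an argument and does nothing if those functions are missing.

```python
from contextlib import contextmanager


@contextmanager
def legacy_promotion(np_module):
    """Temporarily switch numpy to "legacy" promotion, then restore the old state."""
    getter = getattr(np_module, "_get_promotion_state", None)
    setter = getattr(np_module, "_set_promotion_state", None)
    if getter is None or setter is None:
        # This numpy build has no promotion-state switch; leave it alone.
        yield
        return
    previous = getter()
    setter("legacy")
    try:
        yield
    finally:
        # Restore whatever mode the caller had selected ("weak", "weak_and_warn", ...).
        setter(previous)


# Intended use (assumes numpy 1.26 and pandas are installed):
#
#     import numpy, pandas
#     numpy._set_promotion_state("weak")
#     with legacy_promotion(numpy):
#         print(pandas.DataFrame({"x": [1]}))
```

This only hides the problem for the wrapped call; any numpy arithmetic inside the block also runs with legacy promotion rules.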
Expected Behavior
print(pandas.DataFrame({"x": [1]}))
should not crash. It should work properly regardless of the global numpy promotion setting.
Installed Versions