aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0

New version of NumPy (2.0.0) was released yesterday and Glue jobs are failing on Python 3.9 #2858

Closed: rsingh-821 closed this issue 3 months ago

rsingh-821 commented 3 months ago

Describe the bug

The job fails with this error: ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

How to Reproduce

Run a Python shell job on Python 3.9 in Glue 3.0.

OS

Linux

Python version

3.9

AWS SDK for pandas version

awswrangler 3.8.0

Additional context

The last successful run and the failed run installed the libraries below. The only difference between them is the numpy version (1.26.4 in the successful run, 2.0.0 in the failed run).

Successful: Successfully installed boto3-1.34.127 botocore-1.34.127 etlutils-0.1.0 jmespath-1.0.1 numpy-1.26.4 pandas-1.5.3 python-dateutil-2.9.0.post0 pytz-2024.1 s3transfer-0.10.1 six-1.16.0 tenacity-6.0.0 urllib3-1.26.18

Failed: Successfully installed boto3-1.34.127 botocore-1.34.127 etlutils-0.1.0 jmespath-1.0.1 numpy-2.0.0 pandas-1.5.3 python-dateutil-2.9.0.post0 pytz-2024.1 s3transfer-0.10.1 six-1.16.0 tenacity-6.0.0 urllib3-1.26.18
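For anyone comparing job logs the same way, here is a minimal sketch that diffs two "Successfully installed" lines and reports which packages changed version. The function names `parse_installed` and `changed_packages` are made up for illustration and are not part of pip, Glue, or awswrangler. The sketch assumes every entry has the form name-version, with the version following the last hyphen.

```python
PREFIX = "Successfully installed "

# The two install lines quoted in this issue.
OK_RUN = (
    "Successfully installed boto3-1.34.127 botocore-1.34.127 etlutils-0.1.0 "
    "jmespath-1.0.1 numpy-1.26.4 pandas-1.5.3 python-dateutil-2.9.0.post0 "
    "pytz-2024.1 s3transfer-0.10.1 six-1.16.0 tenacity-6.0.0 urllib3-1.26.18"
)
FAILED_RUN = (
    "Successfully installed boto3-1.34.127 botocore-1.34.127 etlutils-0.1.0 "
    "jmespath-1.0.1 numpy-2.0.0 pandas-1.5.3 python-dateutil-2.9.0.post0 "
    "pytz-2024.1 s3transfer-0.10.1 six-1.16.0 tenacity-6.0.0 urllib3-1.26.18"
)


def parse_installed(line: str) -> dict:
    """Map package name to version for one 'Successfully installed' line."""
    packages = {}
    for item in line.strip().removeprefix(PREFIX).split():
        # Split on the last hyphen so names like python-dateutil stay intact.
        name, version = item.rsplit("-", 1)
        packages[name] = version
    return packages


def changed_packages(ok_line: str, failed_line: str) -> dict:
    """Return {name: (version_in_ok_run, version_in_failed_run)} for differences."""
    ok, failed = parse_installed(ok_line), parse_installed(failed_line)
    changes = {}
    for name in sorted(set(ok) | set(failed)):
        if ok.get(name) != failed.get(name):
            changes[name] = (ok.get(name), failed.get(name))
    return changes


print(changed_packages(OK_RUN, FAILED_RUN))
# {'numpy': ('1.26.4', '2.0.0')}
```

A package that appears in only one of the two runs shows up with `None` on the other side.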

jaidisido commented 3 months ago

awswrangler caps numpy at >=1.x, <2.0, so numpy 2.0 and above is not currently supported. numpy 2.0 was installed in your job only because you explicitly specified that version; otherwise it would have resolved to 1.x.
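One way to make this kind of mismatch easier to spot is a fail-fast version check at the top of the job script. This is only a sketch under the cap described above: `is_supported_numpy` and `check_numpy` are hypothetical helpers, not part of awswrangler, and the check only looks at the major version. The underlying fix is still to pin numpy below 2.0 (for example with the requirement specifier `numpy<2`) wherever the job's modules are declared.

```python
from importlib import metadata


def is_supported_numpy(version: str) -> bool:
    """True when the major version is 1, i.e. inside the >=1.x, <2.0 cap."""
    major = int(version.split(".")[0])
    return major == 1


def check_numpy() -> None:
    """Raise a readable error if the installed numpy is outside the cap."""
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError as exc:
        raise RuntimeError("numpy is not installed in this environment") from exc
    if not is_supported_numpy(installed):
        raise RuntimeError(
            f"numpy {installed} is installed, but a 1.x release is required; "
            "pin numpy below 2.0 in the job's module list"
        )
```

Calling `check_numpy()` before importing pandas or awswrangler would replace the binary-incompatibility ValueError with a message that names the installed version.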

RubTalha commented 2 months ago

https://stackoverflow.com/questions/78650222/valueerror-numpy-dtype-size-changed-may-indicate-binary-incompatibility-expec