aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0

Install Guide of AWS Glue PySpark Jobs Causes unexpected keyword argument Error #2968

Closed. wuyujp closed this issue 1 month ago

wuyujp commented 2 months ago

Is your feature request related to a problem? Please describe.
Following the install guide for AWS Glue PySpark Jobs causes the error below.

  File "/home/spark/.local/lib/python3.10/site-packages/pyarrow/parquet.py", line 670, in __init__
    self.writer = _parquet.ParquetWriter(
  File "pyarrow/_parquet.pyx", line 1430, in pyarrow._parquet.ParquetWriter.__cinit__
TypeError: __cinit__() got an unexpected keyword argument 'encryption_properties'
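
A traceback like this usually points to a mismatch between installed package versions, so a first diagnostic step is to print what the job environment actually resolved. The snippet below is a minimal sketch, not part of the install guide: the helper name `installed_versions` is hypothetical, and it only assumes the distributions are named `pyarrow` and `awswrangler` as in this issue.

```python
from importlib import metadata


def installed_versions(packages=("pyarrow", "awswrangler")):
    """Return {package: version or None} for each distribution name.

    Hypothetical helper for illustration; it only reads installed
    package metadata and does not import the packages themselves.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # Print one line per package so the versions show up in the job logs.
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this at the top of the job script shows in the logs which pyarrow version ended up installed next to awswrangler.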

Describe the solution you'd like
Modify the install guide so that it does not pin the pyarrow version.
Before: pyarrow==7,awswrangler
After: pyarrow,awswrangler
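
For illustration, the before and after values can be written as Glue job arguments. This is a sketch under assumptions: it assumes the comma-separated list is passed through the Glue job parameter `--additional-python-modules` (the issue text does not name the parameter), and `additional_python_modules` is a hypothetical helper, not part of awswrangler.

```python
def additional_python_modules(modules):
    """Join module specs into one comma-separated string.

    Hypothetical helper; assumes the value is consumed by the Glue job
    parameter --additional-python-modules.
    """
    return ",".join(spec.strip() for spec in modules)


# Before: pyarrow pinned to 7, reported in this issue to raise the TypeError.
before = {
    "--additional-python-modules": additional_python_modules(
        ["pyarrow==7", "awswrangler"]
    )
}

# After: no pin, so pip resolves a pyarrow version compatible with awswrangler.
after = {
    "--additional-python-modules": additional_python_modules(
        ["pyarrow", "awswrangler"]
    )
}

if __name__ == "__main__":
    print(before)
    print(after)
```

Leaving pyarrow unpinned lets the resolver pick a version that satisfies awswrangler's own requirements; the trade-off is that the installed version can change between job runs.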

jaidisido commented 1 month ago

Good catch, thanks! Addressed in https://github.com/aws/aws-sdk-pandas/commit/23662a9d386a55dd1ba3ca3598db7c61aeee8778