Requests for Additional Argument `name_function` on `awswrangler.s3.to_parquet()`

aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

https://aws-sdk-pandas.readthedocs.io

Apache License 2.0


Issue #2800

Closed JaeryunYim closed 3 hours ago

JaeryunYim commented 7 months ago

Describe the solution you'd like

I would like `awswrangler.s3.to_parquet()` to accept an argument similar to the `name_function` argument of `dask.dataframe.to_parquet()`, for the case where `dataset=True` and `partition_cols` is set. The existing `filename_prefix` argument is not enough for me, because each partitioned filename always contains a hash value that I don't want.

https://docs.dask.org/en/stable/generated/dask.dataframe.to_parquet.html
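To make the request concrete, here is a minimal sketch of the naming behavior being asked for. In Dask, `name_function` is a callable that takes an integer partition index and returns the file name as a string. Everything else below is an assumption for illustration only: `build_keys` is a hypothetical helper, not part of awswrangler; the bucket and column names are placeholders; and numbering files with one running index across all partition directories is just one possible design.

```python
from typing import Callable, Dict, List


def name_function(index: int) -> str:
    """Dask-style naming callback: partition index -> file name (no hash)."""
    return f"part-{index:05d}.parquet"


def build_keys(
    path: str,
    partitions: List[Dict[str, str]],
    name_function: Callable[[int], str],
) -> List[str]:
    """Hypothetical helper (NOT an awswrangler API).

    Builds one object key per partition using Hive-style `col=value`
    directories and a caller-supplied file name.
    """
    keys = []
    for index, values in enumerate(partitions):
        # e.g. {"year": "2023"} -> "year=2023"
        hive_dir = "/".join(f"{col}={val}" for col, val in values.items())
        keys.append(f"{path.rstrip('/')}/{hive_dir}/{name_function(index)}")
    return keys


# Placeholder bucket and partition values, for illustration only.
keys = build_keys(
    "s3://my-bucket/dataset/",
    [{"year": "2023"}, {"year": "2024"}],
    name_function,
)
for key in keys:
    print(key)
```

With such a callback the resulting file names are fully determined by the caller, which is the property that `filename_prefix` alone does not provide.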

kukushking commented 6 months ago

Thanks for opening this @JaeryunYim. Added to the backlog.

github-actions[bot] commented 3 months ago

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed.

github-actions[bot] commented 1 week ago

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed.