awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

Spark 3.3.0 is not preserving sort order when spills occur. Update to 3.3.2+ #197

Open AngeloCa opened 10 months ago

AngeloCa commented 10 months ago

As explained in this post, the sort order within a partition might not be preserved when a spill occurs.

To avoid adding extra transformation steps or moving to instances with more RAM, it would be nice to have Spark updated to 3.3.2+ so that this issue is fixed.
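
For anyone who wants to check whether their own job is affected, here is a minimal sketch of a per-partition order check. It is plain Python and does not reproduce the spill itself. The function name `partition_is_sorted` and the `key` parameter are my own, not part of Glue or Spark.

```python
from typing import Any, Callable, Iterable, Iterator


def partition_is_sorted(
    rows: Iterable[Any],
    key: Callable[[Any], Any] = lambda r: r,
) -> Iterator[bool]:
    """Yield one bool: True if rows arrive in non-decreasing key order.

    Hypothetical helper. It is written as a generator so it has the
    iterator-in, iterable-out shape that a per-partition function needs.
    """
    previous = None
    first = True
    for row in rows:
        current = key(row)
        # Any element smaller than its predecessor means the order is broken.
        if not first and current < previous:
            yield False
            return
        previous = current
        first = False
    yield True


# Local demonstration, with plain lists standing in for partitions.
print(list(partition_is_sorted([1, 2, 2, 5])))  # [True]
print(list(partition_is_sorted([1, 3, 2, 5])))  # [False]
```

In a PySpark job this could presumably be passed to `mapPartitions` on the RDD of a DataFrame that was sorted with `sortWithinPartitions`, with `key` pulling out the sort column, and any `False` in the collected result would point to a partition that lost its order. Which operation actually triggers the problem is described in the linked post, so treat this only as a diagnostic sketch.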