NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] explore if it is worth setting minPartitionSize and/or parallelismFirst in AutoTuner #288

Open revans2 opened 1 year ago

revans2 commented 1 year ago

Is your feature request related to a problem? Please describe. In Spark 3.2.0 and above, AQE added the configs spark.sql.adaptive.coalescePartitions.minPartitionSize and spark.sql.adaptive.coalescePartitions.parallelismFirst. These control how AQE coalesces shuffle partitions. Prior to this, spark.sql.adaptive.coalescePartitions.minPartitionNum was used, and we set that config to 1 for older Spark versions as part of https://github.com/NVIDIA/spark-rapids-tools/pull/285. We should investigate whether we should also set these newer configs to non-default values.
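For reference, a minimal sketch (assuming Spark 3.2+) of where these configs sit on a SparkSession; the values shown are Spark's documented defaults, not AutoTuner recommendations:

```scala
// Sketch only: the AQE partition-coalescing configs discussed above,
// shown with Spark's documented defaults (not recommendations).
val spark = org.apache.spark.sql.SparkSession.builder()
  .appName("aqe-coalesce-config-sketch")
  .config("spark.sql.adaptive.enabled", "true")
  .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
  // Pre-3.2 knob that #285 sets to 1 for older Spark versions:
  .config("spark.sql.adaptive.coalescePartitions.minPartitionNum", "1")
  // 3.2+ knobs this issue asks about:
  .config("spark.sql.adaptive.coalescePartitions.minPartitionSize", "1m")   // default 1m
  .config("spark.sql.adaptive.coalescePartitions.parallelismFirst", "true") // default true
  .config("spark.sql.adaptive.advisoryPartitionSizeInBytes", "64m")         // default 64m
  .getOrCreate()
```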

Describe the solution you'd like Do some simple experiments to see whether we should set parallelismFirst to false. This has much the same effect as setting minPartitionNum to 1: it allows more efficient processing of the data (one large batch instead of many small batches).

We could also look at setting a minPartitionSize larger than the default of 1m. Again, this would be less about raw performance and more about efficiency in cases where we have a very small amount of data. If we do set it, we need to be careful that it does not exceed the advisory size / 5. Going larger isn't a big deal in itself; that cap is just what Spark has hard coded, and we probably don't want to set the config and then see no impact from it.
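A hypothetical sketch of that guard (illustrative names, not existing AutoTuner code): clamp any candidate minPartitionSize to the advisory-size/5 ceiling described above.

```scala
// Hypothetical helper (illustrative, not existing AutoTuner code):
// keep a candidate minPartitionSize at or below advisorySize / 5,
// the ceiling the comment above says Spark hard codes internally.
def recommendMinPartitionSize(candidateBytes: Long, advisoryBytes: Long): Long = {
  val sparkCeiling = advisoryBytes / 5
  math.min(candidateBytes, sparkCeiling)
}

// Example: a 25m candidate against a 64m advisory size is capped to ~12.8m.
val recommended = recommendMinPartitionSize(25L << 20, 64L << 20)
```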

revans2 commented 1 year ago

I started to do some tests with parallelismFirst set to false, and performance was noticeably worse than with it unset, on NDS 3k runs on Dataproc. With the default advisory size of 64m the difference was small (< 1%), but when moving up to a 128m advisory size performance was 3% to 7% slower.

This raises concerns about #285 and whether we should set minPartitionNum to the parallelism of the cluster instead of 1 as the smallest value. Will need some more testing there.
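A hypothetical sketch of that alternative (illustrative names, not existing AutoTuner code): derive minPartitionNum from the cluster shape rather than hard-coding 1, so AQE never coalesces below the cluster's available parallelism.

```scala
// Hypothetical sketch (illustrative only): recommend minPartitionNum
// from cluster shape instead of hard-coding 1.
def recommendMinPartitionNum(numExecutors: Int, coresPerExecutor: Int): Int =
  math.max(1, numExecutors * coresPerExecutor)

// e.g. 8 executors with 16 cores each would yield minPartitionNum = 128 instead of 1.
val minPartitionNum = recommendMinPartitionNum(numExecutors = 8, coresPerExecutor = 16)
```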

tgravescs commented 3 months ago

Alessandro was doing some experiments with NDS and was seeing a nice performance improvement with spark.sql.adaptive.coalescePartitions.minPartitionSize set to 25MB. We may want to look at this and see if we can base the recommendation on some metrics.
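One hypothetical way to make it metric-driven (the heuristic and all names below are illustrative, not a proposed implementation): derive minPartitionSize from the application's observed shuffle sizes, bounded below by Spark's 1m default and above by the advisory-size/5 ceiling mentioned earlier.

```scala
// Hypothetical heuristic (illustrative only): pick minPartitionSize from the
// average shuffle-read bytes per task, clamped between Spark's 1m default
// and advisorySize / 5.
def minPartitionSizeFromMetrics(
    totalShuffleReadBytes: Long,
    totalShuffleTasks: Long,
    advisoryBytes: Long): Long = {
  val avgPerTask =
    if (totalShuffleTasks > 0) totalShuffleReadBytes / totalShuffleTasks else 0L
  val floor = 1L << 20            // Spark's default minPartitionSize (1m)
  val ceiling = advisoryBytes / 5 // ceiling Spark applies internally
  math.min(ceiling, math.max(floor, avgPerTask))
}
```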