Use SCHED_BATCH scheduling policy for Driver thread pool
Description
Use SCHED_BATCH scheduling policy for Driver thread pool
Motivation and Context
With blocking IO used by connectors (such as the Hive connector), it is often necessary to set the number of threads higher than the number of available cores.
This is needed to avoid slowdowns for IO-heavy workloads.
However, for CPU-intensive queries it may create unnecessary thread contention and may cause stability problems when communication threads are delayed.
Threads scheduled with the SCHED_BATCH policy run with slightly lower priority, giving a green light to communication threads. Batch threads can also run for longer, resulting in fewer cache flushes, if only other batch threads are waiting for execution.
More information about different scheduling policies in Linux can be found here: https://man7.org/linux/man-pages/man7/sched.7.html
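As a rough illustration of what switching a worker thread to SCHED_BATCH involves on Linux, here is a minimal Python sketch. It is not the code in this change: the helper names `run_as_batch` and `current_policy` and the thread name `Driver-sketch` are made up for the example, and it assumes a Linux host where the process is allowed to call `sched_setscheduler`.

```python
import os
import threading


def run_as_batch(task):
    """Switch the calling thread to SCHED_BATCH, then run task on it."""
    # pid 0 targets the calling thread on Linux.
    # SCHED_BATCH requires a static priority of 0.
    os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
    return task()


def current_policy():
    """Return the scheduling policy of the calling thread."""
    return os.sched_getscheduler(0)


# Hypothetical stand-in for a driver thread: it switches itself to
# SCHED_BATCH and reports the policy it ends up with.
results = []
worker = threading.Thread(
    target=lambda: results.append(run_as_batch(current_policy)),
    name="Driver-sketch",
)
worker.start()
worker.join()

print("worker uses SCHED_BATCH:", results[0] == os.SCHED_BATCH)
```

The policy is set from inside the worker so that only that thread is affected; threads that never make the call, such as communication threads, keep their existing policy.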
Impact
Improves efficiency and stability for cluster configurations where the number of Driver threads is set higher than the number of available cores.
Test Plan
Set driver.threads-batch-scheduling-enabled=true
Find a Driver thread
chrt -p <driver thread id>
Result:
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.