prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Allow running Driver threads with SCHED_BATCH policy on Linux #23053

Closed arhimondr closed 4 days ago

arhimondr commented 1 week ago

Description

Use SCHED_BATCH scheduling policy for Driver thread pool

Motivation and Context

With blocking IO used by connectors (such as the Hive connector), it is often necessary to set the number of threads higher than the number of available cores.

This is needed to avoid slowdowns for IO-heavy workloads.

However, for CPU-intensive queries this may create unnecessary thread contention and may cause stability problems when communication threads are delayed.

Threads scheduled with the SCHED_BATCH policy run at a slightly lower priority, which gives communication threads precedence. Batch threads can also run for longer, resulting in fewer cache flushes, if only other batch threads are waiting for execution.

More information about different scheduling policies in Linux can be found here: https://man7.org/linux/man-pages/man7/sched.7.html
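
As a rough illustration of the mechanism (not the code from this PR), the sketch below moves a single worker thread to SCHED_BATCH using Python's standard `os.sched_setscheduler` wrapper on Linux. The helper names `run_as_batch` and `current_policy_name` are hypothetical and exist only for this example.

```python
import os
import threading

def run_as_batch(work):
    """Switch the calling thread to SCHED_BATCH, then run work()."""
    # On Linux, pid 0 means "the calling thread".
    # SCHED_BATCH requires a static priority of 0.
    os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
    return work()

def current_policy_name():
    """Return a readable name for the calling thread's scheduling policy."""
    policy = os.sched_getscheduler(0)
    names = {os.SCHED_OTHER: "SCHED_OTHER", os.SCHED_BATCH: "SCHED_BATCH"}
    return names.get(policy, str(policy))

if __name__ == "__main__":
    results = {}

    def worker():
        # Only this thread changes policy; the main thread is left alone.
        results["worker"] = run_as_batch(current_policy_name)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    results["main"] = current_policy_name()
    print(results)
```

The policy is applied per thread, so in this sketch only the worker is switched while the main thread keeps whatever policy it started with.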

Impact

Improves efficiency and stability for certain cluster configurations

Test Plan

Result:

chrt -p 3238430
pid 3238430's current scheduling policy: SCHED_BATCH
pid 3238430's current scheduling priority: 0
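
The check above queries one ID. A minimal sketch for listing the policy of every thread in a process follows, assuming a Linux `/proc` filesystem is mounted; `thread_policies` is a hypothetical helper, not part of Presto or of this PR.

```python
import os

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}

def thread_policies(pid):
    """Map each thread ID of the given process to its scheduling policy name."""
    policies = {}
    # Each entry under /proc/<pid>/task is the ID of one thread.
    for entry in os.listdir(f"/proc/{pid}/task"):
        tid = int(entry)
        try:
            policy = os.sched_getscheduler(tid)
        except ProcessLookupError:
            # The thread exited between listing and querying; skip it.
            continue
        policies[tid] = POLICY_NAMES.get(policy, str(policy))
    return policies

if __name__ == "__main__":
    for tid, name in sorted(thread_policies(os.getpid()).items()):
        print(tid, name)
```

Passing the PID of the server process instead of `os.getpid()` would list its threads, subject to the usual permission checks.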

Release Notes

== NO RELEASE NOTE ==

arhimondr commented 4 days ago

Thanks for the review @xiaoxmeng, updated.