I want to adjust the number of workers online. A real-world scenario is scaling the read pool of TiKV (controlled by `unified.max-thread-count`) without a restart. The goal is to reduce performance jitter as much as possible during the adjustment.
Technical proposal
Add a new member `core_thread_count` to `SchedConfig`; the "core" naming borrows from `corePoolSize` in `ThreadPoolTaskExecutor`.
In our scenario, `max_thread_count` represents the maximum size the thread pool can scale to,
while `core_thread_count` represents the number of workers actually participating in `Remote<T>` scheduling. If `core_thread_count` is not set during `build`, it defaults to the configured `max_thread_count`.
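A minimal sketch of this configuration shape (field names follow the proposal; the struct layout and the helper method are illustrative, not yatp's actual `SchedConfig`):

```rust
/// Illustrative sketch of the proposed config fields.
#[derive(Debug, Clone)]
struct SchedConfig {
    /// Maximum size the pool can ever scale to; all worker threads
    /// are created up front at build time.
    max_thread_count: usize,
    /// Number of workers participating in scheduling right now.
    /// `None` means "not set", which defaults to the maximum.
    core_thread_count: Option<usize>,
}

impl SchedConfig {
    /// Resolve the effective core count at build time.
    fn effective_core_threads(&self) -> usize {
        self.core_thread_count.unwrap_or(self.max_thread_count)
    }
}
```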
Add a new method `runnable` to `Local<T>`. It calls `need_to_park_worker` (added to `QueueCore<T>`) to determine whether the worker may keep running; if not, it calls `sleep` (added to `Local<T>`) to park itself on the address `QueueCore<T> | 1`, so it no longer participates in `Remote<T>` scheduling.
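A hedged sketch of the park decision, assuming a surplus worker is detected by comparing the active-worker count against the core target (the field and method names follow the proposal; the exact comparison is illustrative):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative subset of QueueCore<T>: only the two counters the
// runnable check needs.
struct QueueCore {
    /// Workers currently running (changes with working status).
    active_workers: AtomicUsize,
    /// Target number of workers that should participate in scheduling.
    core_workers: AtomicUsize,
}

impl QueueCore {
    /// True when more workers are active than the pool should be
    /// running, i.e. this worker is surplus and should park itself.
    fn need_to_park_worker(&self) -> bool {
        self.active_workers.load(Ordering::Acquire)
            > self.core_workers.load(Ordering::Acquire)
    }
}
```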
Add `scale_workers` to `Remote<T>`. When scaling up, it unparks the required number of workers from the queue keyed by the address `QueueCore<T> | 1`. When scaling down, it unparks the address `QueueCore<T>`, so that threads blocked in `pop_and_sleep` can also take part in recycling (by calling the `runnable` method mentioned above).
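The two wait queues can be told apart by address alone. A small sketch of the addressing scheme (the helper function is illustrative, not a yatp API): because `QueueCore` contains word-aligned fields, its base address is always even, so `base | 1` is a distinct key that never collides with another core's base address.

```rust
use std::sync::atomic::AtomicUsize;

// Illustrative stand-in: the real QueueCore<T> has more fields, but any
// word-aligned field already guarantees an even base address.
struct QueueCore {
    core_workers: AtomicUsize,
}

/// Derive the two park addresses from one QueueCore pointer:
///   base     -> idle workers sleeping in pop_and_sleep
///   base | 1 -> workers parked out of scheduling by a scale-down
fn park_addresses(core: &QueueCore) -> (usize, usize) {
    let base = core as *const QueueCore as usize;
    (base, base | 1)
}
```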
Add `scale_workers` to `ThreadPool<T>` as well; it simply delegates to `Remote<T>`'s `scale_workers`.
Add a new member `core_workers` to `QueueCore<T>`. Since the value of `active_workers` changes with the workers' running status, it cannot be reused for this purpose. `core_workers` is also an atomic variable, kept in sync using `compare_exchange_weak`. It is used by `scale_workers` and `runnable`.
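The `compare_exchange_weak` update can be sketched as a standard retry loop (the function name and return convention are illustrative): concurrent `scale_workers` calls race to install their target, and the weak variant may fail spuriously, hence the loop.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Sketch: atomically set `core_workers` to `target`, returning the
/// previous value so the caller knows whether it scaled up or down.
fn set_core_workers(core_workers: &AtomicUsize, target: usize) -> usize {
    let mut current = core_workers.load(Ordering::Relaxed);
    loop {
        match core_workers.compare_exchange_weak(
            current,
            target,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(prev) => return prev,
            // Lost the race (or failed spuriously): retry with the
            // freshly observed value.
            Err(actual) => current = actual,
        }
    }
}
```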
Pros
Adjusting the number of threads is very lightweight: there are no thread creation or destruction operations and no locking.
When no scaling is in progress, checking whether a worker is runnable is just a comparison of two atomic variables.
Cons
During `build`, `max_thread_count` thread objects are created at once, which incurs slightly more CPU and memory overhead.
Process
The feature has been implemented and is under stability testing.