jenkinsci / concurrent-step-plugin

Jenkins plugin to use utils in Java concurrent package.
https://plugins.jenkins.io/concurrent-step/
MIT License

Thread usage in AcquireStep #7

Closed: rschuetz closed 4 years ago

rschuetz commented 4 years ago

This is mainly about the Semaphore code of the project; I haven't tested the other tasks. The plug-in uses the common ForkJoinPool for background jobs, but the size of that pool is limited by default to approximately the number of CPUs. If more threads are waiting for semaphores than there are CPUs, this leads to deadlocks (all pool threads are waiting to acquire semaphores, so nothing can release one) or to low throughput (most threads are waiting and only a few are free to trigger the actual Pipeline steps; see my PR). The limitation also means that a job acquiring semaphore A can affect a totally different job acquiring semaphore B, simply because no threads are left to actually wait for that semaphore.

The limitation of the ForkJoinPool is fine for CPU-bound tasks, but the tasks executed by the plug-in are not CPU-bound; they are just waiting for semaphores or for the step to finish.
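To illustrate the starvation described above, here is a minimal sketch, not taken from the plug-in. It assumes a small private `ForkJoinPool` as a stand-in for the common pool so that the pool size can be controlled; the class and method names (`PoolStarvationDemo`, `successfulAcquires`) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class PoolStarvationDemo {

    /**
     * Runs `waiters` blocking acquires on a pool with the given parallelism, then
     * submits one task that releases a permit for each of them. Returns how many
     * waiters got a permit before timing out.
     */
    static int successfulAcquires(int parallelism, int waiters, long timeoutMs) {
        ForkJoinPool pool = new ForkJoinPool(parallelism); // stand-in for the common pool
        Semaphore semaphore = new Semaphore(0);
        // Wait until as many waiters as the pool can run at once have started.
        CountDownLatch started = new CountDownLatch(Math.min(parallelism, waiters));
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < waiters; i++) {
                results.add(pool.submit(() -> {
                    started.countDown();
                    return semaphore.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
                }));
            }
            started.await();
            // The releaser can only run if a pool thread is still free.
            Future<?> releaser = pool.submit(() -> semaphore.release(waiters));
            int acquired = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    acquired++;
                }
            }
            releaser.get();
            return acquired;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.commonPool().getParallelism());
        System.out.println("2 threads, 2 waiters: " + successfulAcquires(2, 2, 500) + " acquired");
        System.out.println("3 threads, 2 waiters: " + successfulAcquires(3, 2, 500) + " acquired");
    }
}
```

With two pool threads and two waiters, both threads are blocked when the releasing task is submitted, so the waiters should time out without a permit. With a third thread the releaser gets to run and both waiters should succeed.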

One option to solve this is to increase the size of the ForkJoinPool or use a dedicated pool per Jenkins job. Another option would be the following:
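As a rough sketch of the first option: the size of the common pool can be raised JVM-wide with the system property `java.util.concurrent.ForkJoinPool.common.parallelism`, while a dedicated pool can be passed explicitly to `CompletableFuture.supplyAsync`. The example below shows the latter. `DedicatedPoolAcquire`, `WAIT_POOL` and `acquireAsync` are hypothetical names, and using a single cached thread pool (rather than one pool per Jenkins job) is an assumption made to keep the example short.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class DedicatedPoolAcquire {

    // Hypothetical dedicated pool: it grows on demand, so a blocked waiter never
    // keeps another task (such as the one releasing the permit) from running.
    static final ExecutorService WAIT_POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "semaphore-waiter");
        t.setDaemon(true);
        return t;
    });

    /** Acquires asynchronously on WAIT_POOL instead of the common ForkJoinPool. */
    static CompletableFuture<Boolean> acquireAsync(Semaphore semaphore, int permits, long timeoutMs) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return semaphore.tryAcquire(permits, timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }, WAIT_POOL);
    }
}
```

A cached pool trades the starvation problem for an unbounded number of mostly idle threads, one per pending acquire, which is the cost the second option tries to avoid.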

Do not acquire semaphores in dedicated Futures. Instead, use a single thread that polls one or more semaphores every few hundred ms and, on success, launches dedicated executor threads (not bound to a pool, or at least bound to a sufficiently sized pool) that trigger the Pipeline steps. This limits the number of threads waiting for semaphores to one and makes sure acquire steps always execute.

The first part could be achieved in one of two ways. One is a custom thread that takes requests (each consisting of a semaphore, a count, an optional timeout and a Runnable to start on success) via a queue, iterates over the queue every few hundred ms, tries to acquire each semaphore with a zero or very short timeout, and launches the task. The other is scheduled CompletableFutures that do the same but reschedule themselves to run again in a few hundred ms if needed.

Concerning the second part, I'd avoid in any case running the body invokers in the common ForkJoinPool, so that they can always be executed and can release the semaphore again, whether or not the pool is exhausted. I'd either use dedicated threads, use a custom pool (if possible, resizable without the need to restart Jenkins), or check whether there is a better way to run the body invokers asynchronously without wasting a thread on waiting.
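The polling approach could look roughly like the sketch below. It is an illustration under assumptions, not the plug-in's implementation: `SemaphorePoller`, `Request`, `submit` and the `onTimeout` callback are hypothetical names, a negative timeout is taken to mean "wait indefinitely", and the worker `Executor` is supplied by the caller (dedicated threads or a sufficiently sized pool).

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a single thread polls all pending acquire requests. */
class SemaphorePoller implements AutoCloseable {

    /** A pending acquire: what to take, until when, and what to run afterwards. */
    private static final class Request {
        final Semaphore semaphore;
        final int permits;
        final boolean hasDeadline;
        final long deadlineNanos; // only meaningful if hasDeadline is true
        final Runnable onAcquired;
        final Runnable onTimeout;

        Request(Semaphore semaphore, int permits, long timeoutMs,
                Runnable onAcquired, Runnable onTimeout) {
            this.semaphore = semaphore;
            this.permits = permits;
            this.hasDeadline = timeoutMs >= 0; // assumption: negative means no timeout
            this.deadlineNanos = System.nanoTime()
                    + TimeUnit.MILLISECONDS.toNanos(Math.max(timeoutMs, 0));
            this.onAcquired = onAcquired;
            this.onTimeout = onTimeout;
        }
    }

    private final Queue<Request> pending = new ConcurrentLinkedQueue<>();
    private final Executor workers;
    private final ScheduledExecutorService poller =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "semaphore-poller");
                t.setDaemon(true);
                return t;
            });

    SemaphorePoller(Executor workers, long pollIntervalMs) {
        this.workers = workers;
        poller.scheduleWithFixedDelay(this::pollOnce, pollIntervalMs, pollIntervalMs,
                TimeUnit.MILLISECONDS);
    }

    /** Queues a request; the poller thread picks it up on its next pass. */
    void submit(Semaphore semaphore, int permits, long timeoutMs,
                Runnable onAcquired, Runnable onTimeout) {
        pending.add(new Request(semaphore, permits, timeoutMs, onAcquired, onTimeout));
    }

    /** One pass over the queue: non-blocking acquire, then hand off to the workers. */
    private void pollOnce() {
        for (Iterator<Request> it = pending.iterator(); it.hasNext(); ) {
            Request r = it.next();
            if (r.semaphore.tryAcquire(r.permits)) {
                it.remove();
                workers.execute(r.onAcquired);
            } else if (r.hasDeadline && System.nanoTime() - r.deadlineNanos >= 0) {
                it.remove();
                workers.execute(r.onTimeout);
            }
        }
    }

    @Override
    public void close() {
        poller.shutdownNow();
    }
}
```

Only the poller thread ever touches the semaphores, and it never blocks on them, so the number of waiting threads stays at one regardless of how many requests are pending. A production version would also need to catch exceptions inside `pollOnce`, because `scheduleWithFixedDelay` suppresses all further runs once a run throws.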

topikachu commented 4 years ago

fixed #6 and #9