jjethwa / rundeck

GNU General Public License v3.0

Rundeck increasing threadpool is not working and it always runs 8 jobs in parallel #212

Closed santhoshvly closed 6 days ago

santhoshvly commented 2 weeks ago

Hi,

We are running Rundeck 4.17.3 in a Kubernetes cluster. We recently increased the threadPool count to 100 to run more jobs in parallel, but it had no effect: Rundeck always runs 8 jobs in parallel and all other jobs wait in some queue. The pending jobs run only after the running jobs complete. The Rundeck pod runs on a large machine (r5.4xlarge, with 16 cores and 128 GB of memory). The System Report in the Rundeck UI shows the threadPool value as 100. We pass the following configuration to the container:

RDECK_JVM_SETTINGS=-Dserver.web.context=/rundeck -Drundeck.jetty.connector.forwarded=true -Xms2048m -Xmx4096m
RUNDECK_THREAD_COUNT=100

Screenshot 2024-07-03 at 1 57 28 PM

santhoshvly commented 2 weeks ago

We tried threadPool sizes of 20, 50 and 100, but Rundeck always runs 8 jobs in parallel, so it looks like some other config or limit in Rundeck is involved rather than a hardware limit. Could you please advise whether we are doing anything wrong here?

jjethwa commented 2 weeks ago

Hi @santhoshvly

The thread config was taken from: https://docs.rundeck.com/docs/administration/maintenance/tuning-rundeck.html#quartz-job-threadcount and from the latest Quartz docs: https://www.quartz-scheduler.org/documentation/2.3.1-SNAPSHOT/configuration.html#configuration-of-threadpool-tune-resources-for-job-execution

The specific config file setting is done here: https://github.com/jjethwa/rundeck/blob/master/content/opt/run#L253-L260

From your attached screenshot (thank you!) it looks like the config has been applied properly.

Do you have the thread count configured in your jobs? See: https://docs.rundeck.com/docs/learning/getting-started/jobs/workflow-strategies.html#parallel

santhoshvly commented 1 week ago

Hi @jjethwa

Thank you so much for sharing the details.

Yes, it shows the configured threadPool size, so the value appears to be applied correctly.

We are not configuring threadCount for the jobs. In our case there is only one node and the workflow strategy is node-first. We need to run all the steps of a job in sequence, locally on the same node. For example, suppose I trigger 10 jobs simultaneously using the API (steps 1, 2 and 3 of each job should run sequentially on the single available node):

Job1 - step1 --> step2 --> step3
Job2 - step1 --> step2 --> step3
Job3 - step1 --> step2 --> step3
Job4 - step1 --> step2 --> step3
Job5 - step1 --> step2 --> step3
Job6 - step1 --> step2 --> step3
Job7 - step1 --> step2 --> step3
Job8 - step1 --> step2 --> step3
Job9 - step1 --> step2 --> step3
Job10 - step1 --> step2 --> step3

Rundeck then runs the steps of 8 jobs simultaneously (step1 first, then step2, and so on), but the steps of the remaining 2 jobs do not start. They are stuck in some sort of queue and run only after 2 of the 8 running jobs complete. So at any given time Rundeck schedules and runs the steps of 8 jobs and no more. Our expectation is that all 10 jobs run simultaneously (a step of every job should be running if resources are available), but Rundeck schedules only 8 jobs regardless of the thread pool size configuration or an instance type with more CPU/memory.

Screenshot 2024-07-03 at 9 01 51 PM

Is there any limit on concurrent steps in a single-node Rundeck setup or in the open source version? We expect to be able to run any number of steps concurrently on a single node if there are enough resources. So, if I trigger 10 jobs simultaneously, step1 of all the jobs should start at the same time. Please let me know if you need more details of our Rundeck steps.
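For anyone wanting to reproduce this kind of test, here is a minimal sketch of triggering several jobs at once through the Rundeck API and then counting the running executions. It is not the script used in this thread. The base URL (with the /rundeck context from the settings above), the API version, the token, the project name and the job IDs are all placeholder assumptions to adjust for your server.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

AUTH_HEADER = "X-Rundeck-Auth-Token"


def run_request(base_url, api_version, job_id, token):
    """Build a POST request that starts one execution of a job."""
    url = f"{base_url}/api/{api_version}/job/{job_id}/run"
    headers = {
        AUTH_HEADER: token,
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=b"{}", method="POST", headers=headers)


def running_request(base_url, api_version, project, token):
    """Build a GET request that lists the running executions of a project."""
    url = f"{base_url}/api/{api_version}/project/{project}/executions/running"
    headers = {AUTH_HEADER: token, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def trigger_all(base_url, api_version, job_ids, token):
    """Start every job at the same time, one client thread per job,
    so the client itself does not serialize the submissions."""
    requests = [run_request(base_url, api_version, j, token) for j in job_ids]
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        return list(pool.map(send, requests))


def count_running(base_url, api_version, project, token):
    """Count the executions on the first page of the running list
    (the endpoint is paged, so large counts need paging parameters)."""
    body = send(running_request(base_url, api_version, project, token))
    return len(body.get("executions", []))
```

Calling trigger_all with ten job IDs and then count_running shortly afterwards shows how many executions Rundeck actually started. Using one client thread per job matters here, because a smaller client-side pool would cap the number of simultaneous submissions on its own.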

jjethwa commented 1 week ago

Hi @santhoshvly

Thanks for the detailed response. The job should be executing per your expectation and using more threads. There are no limits for the OSS version as far as I know. You might need to open an issue on the upstream project: https://github.com/rundeck/rundeck/issues

I'm not sure if they will still support 4.17.3, though. Is it possible for you to try upgrading a cloned instance? Here are the upgrade docs: https://docs.rundeck.com/docs/upgrading/

You could also export the projects and import them into a new instance to test.

santhoshvly commented 1 week ago

Sure. Thank you so much for sharing the details!!

jjethwa commented 1 week ago

No problem, @santhoshvly

Keep me updated

santhoshvly commented 6 days ago

@jjethwa Sure. The issue was caused by one of our services that calls Rundeck to submit jobs, and was not related to the Rundeck Quartz thread pool config. After we fixed that service, Rundeck executed more jobs in line with the threadPool config. So everything looks good with the Rundeck Docker container and the thread config. Thank you for the support.
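As a purely hypothetical illustration of how a calling service can cap concurrency (this is not a description of the actual service or its fix), the sketch below simulates a client that submits jobs through a fixed-size worker pool and blocks until each job finishes. The function name, the pool sizes and the sleep that stands in for a blocking submit call are all assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(pool_size, job_count, job_seconds=0.2):
    """Return the highest number of simulated jobs in flight at once
    when a client submits them through a pool of `pool_size` workers."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def submit_and_wait(_job_number):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        # Stands in for a blocking call that waits for the job to finish.
        time.sleep(job_seconds)
        with lock:
            running -= 1

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(submit_and_wait, range(job_count)))
    return peak


if __name__ == "__main__":
    print(peak_concurrency(pool_size=8, job_count=10))
    print(peak_concurrency(pool_size=10, job_count=10))
```

With a pool of 8 workers and 10 jobs, at most 8 simulated jobs are in flight at once, whatever the server could handle. This would look, from the server side, like a queue with a hard limit of 8.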

jjethwa commented 6 days ago

That's great news, @santhoshvly

I'll close this issue out then