OSC / ondemand-packaging

Used to build RPMs for OnDemand and OSC-specific BatchConnect applications
https://osc.github.io/Open-OnDemand/
2 stars 5 forks source link

Passenger creates an unnecessary amount of threads on systems with many CPUs #275

Open CSC-swesters opened 3 weeks ago

CSC-swesters commented 3 weeks ago

Passenger will spawn worker threads in each per-user-nginx (PUN) instance based on the amount of cores on the system where OOD runs. This behavior probably makes sense when running only a single web server instance. However, when OOD is launching one Nginx instance per user, and an OOD deployment has hundreds of users, there starts to be a lot of threads.

There are a few issues with having this many threads. Passenger will monitor the CPU and memory usage of all these threads by running ps(1) regularly, and this creates a high system load on the OOD host. We have seen load averages spike into the thousands on a regular basis, which we suspect Passenger is to blame for. Using the perf(1) tool at the tail end of such a load spike, we could see that PassengerAgent is the culprit behind a lot of context switches.

[root@host ~]# perf record -e context-switches -c 1 -a
^C

[root@host ~]# perf report
Samples: 409K of event 'context-switches', Event count (approx.): 409815
Overhead  Command          Shared Object      Symbol
  26,68%  PassengerAgent   [kernel.kallsyms]  [k] schedule
...

For context, one of our clusters runs OOD on a 12 core VM, with several hundreds of users. At the point of writing, there were 722 PUNs, 1461 PassengerAgent processes, and 26754 PassengerAgent threads.

[root@host ~]# ps aux | grep "nginx" | grep -c "master"
722
[root@host ~]# pstree -np > pstree.txt
[root@host ~]# grep -cF -e "-PassengerAgent(" pstree.txt
1461
[root@host ~]# grep -cF -e "-{PassengerAgent}(" pstree.txt
26754

We investigated the issue and found that Passenger will ultimately use the Boost thread::hardware_concurrency() function (link to source code) to decide what amount of threads to spawn for its agent.

It's possible that Passenger has some knobs for tuning its parallelism, but we suspect they are under the Passenger Enterprise offering. Luckily, the Passenger source code is MIT licensed, so we are able to patch it to better suit OOD's architecture.

We were able to patch this via the ondemand-passenger rpmbuild spec, to accept an environment variable which overrides the default behavior when defined and containing a valid, positive integer. If you are interested in upstreaming this patch, we will gladly open a pull request.

With this patch, and the override set to 2 (instead of 12), we're reducing the amount of threads per PUN by 20. The benchmark we used was to restart the instance, and then log into the dashboard. The pstree -u username outputs are attached below:

# Before (12 threads)
PassengerAgent─┬─PassengerAgent─┬─ruby───2*[{ruby}]
               │                └─33*[{PassengerAgent}]
               └─4*[{PassengerAgent}]
# After (override as 2 threads)
PassengerAgent─┬─PassengerAgent─┬─ruby───2*[{ruby}]
               │                └─13*[{PassengerAgent}]
               └─4*[{PassengerAgent}]

On systems with even more CPU threads available, the improvement is naturally even greater. Running OOD on a host with 256 hardware threads will result in a pstree that looks like this:

# 256 threads
PassengerAgent─┬─PassengerAgent─┬─ruby───2*[{ruby}]
               │                └─525*[{PassengerAgent}]
               └─4*[{PassengerAgent}]

I.e., the amount of reduced threads per PUN could be 508 with this patch and override to 2 threads hardware_concurrency :slightly_smiling_face:

Let us know what you think!

CSC-swesters commented 2 weeks ago

After even more digging around, we actually found that there's an option in Passenger for doing this, even though it's not particularly well documented. In Passenger's Config.h header it's documented as "read_only".

However, by setting the following in the nginx_stage.yml we could achieve the same effect without patching anything:

---
# nginx_stage

passenger_options:
  passenger_ctl: "controller_threads 1"

On our system with 12 CPUs, we get this result after logging in:

PassengerAgent─┬─PassengerAgent─┬─ruby───2*[{ruby}]
               │                └─10*[{PassengerAgent}]
               └─4*[{PassengerAgent}]

So this issue can be closed without packaging changes. We would still like to hear if you've encountered this issue, or if we are the first to voice this concern.

Since this configuration concerns the nginx_stage.yml template over in the main ondemand repository, we might propose some changes there. For example, we found that it's a bit cumbersome to add multiple passenger_ctl options via the current template. One would, e.g., need to do this type of trick to inject multiple lines of passenger_ctl:

---
# Need to use a multi-line string to inject more than one passenger_ctl option:
passenger_options:
  passenger_ctl: 'hook_after_watchdog_shutdown /opt/ood/scripts/cleanup.sh;
  passenger_ctl controller_threads 1'
johrstrom commented 2 weeks ago

Hey sorry for the delay.

So this issue can be closed without packaging changes. We would still like to hear if you've encountered this issue, or if we are the first to voice this concern.

Yes I think you're the first to bring this up.

I'm not against patching ondemand-passenger, but you seem to have found a work around. Even so, I'd like to keep the ticket open because it would seem that something needs to be done. Seems like passenger_ctl needs to be a real option that can accept an array? I'll have to dig into passenger_ctl more to see what all it can do.

CSC-swesters commented 2 weeks ago

Yes I think you're the first to bring this up.

Interesting. I guess others may not have put as many CPUs under OOD as we have, or they have just silently accepted this behavior, or fixed it on their own.

I'm not against patching ondemand-passenger, but you seem to have found a work around.

That's good to hear. Yes, I believe the workaround is more elegant than patching Passenger in the ondemand-packaging build step for this particular need. So let's not pursue that at this stage.

Seems like passenger_ctl needs to be a real option that can accept an array?

That might be a suitable solution, yes. We've verified that the multi-line string trick that I commented above results in a working configuration, but it would be more elegant to have something like:

# Example configuration:
---
passenger_ctl:
  - 'hook_after_watchdog_shutdown /opt/ood/scripts/cleanup.sh;'
  - 'controller_threads 1'

That would probably go into the pun.conf.erb template I assume? Similar to how certain Passenger options already are defined separately in the nginx_stage.yml configuration: https://github.com/OSC/ondemand/blob/602f798237a2f1cd2105ef815d9c903b5c2df5a4/nginx_stage/templates/pun.conf.erb#L61-L66

johrstrom commented 2 weeks ago

That would probably go into the pun.conf.erb template I assume?

I assume so, but I'm not sure if it'll accept multiple values for the same key. Again, having never seen passenger_ctl I'm not sure how to use it properly.

CSC-swesters commented 2 weeks ago

I'm not sure if it'll accept multiple values for the same key.

If you're referring to adding multiple entries of passenger_ctl: "some value" into the passenger_options hash, we've tested that, and it doesn't work. It will only pick the last one.

I think we might need to add a separate nginx_stage configuration option for it, so that it can be added separately. E.g., this might work the way we'd like it to:

<%- passenger_ctl_options.to_a.each do |value| -%>
  passenger_ctl "<%= value -%>";
<%- end -%>

with the inputs:

---
passenger_ctl_options:
  - 'hook_after_watchdog_shutdown /opt/ood/scripts/cleanup.sh;'
  - 'controller_threads 1'

passenger_options:
  # separate options here ...
CSC-swesters commented 2 weeks ago

In our opinion, it might be reasonable to consider overriding the controller_threads with a low value by default (1 seems okay, but different sites might want to test and report in) in Open OnDemand, due to the special architecture it has. It's probably not a common use case to have many PassengerAgents running on the same web server, so the Passenger default will not be suitable for running per-user-nginx instances on larger systems. It will of course be configurable, for those who wish to tune it.

At least we'd suggest documenting that OOD recommends setting this to something low, e.g., in your nginx_stage.yml documentation, if you do not want to alter the defaults.