phusion / passenger

A fast and robust web server and application server for Ruby, Python and Node.js
https://www.phusionpassenger.com/
MIT License

Phusion Passenger not serving requests after a restart #2457

Open herrbuerger opened 1 year ago

herrbuerger commented 1 year ago

Issue report

We regularly run into an issue with Phusion Passenger (6.0.15/nginx). The timeline looks like this:

A Ruby on Rails application is deployed with Capistrano. At the end of the process the restart takes place:

* executing "sudo passenger-config restart-app --ignore-app-not-running /apps/production"
    servers: ["example.com"]
    [example.com] executing command
** [out :: example.com] Restarting /apps/production/current (production)

The deploy finishes successfully and the Passenger instances are restarted. Before this, there were no messages in the error log. With the restart, a few messages like the following show up in the error.log:

[ N 2022-11-24 16:22:50.8264 9559/Tw age/Cor/CoreMain.cpp:1147 ]: Checking whether to disconnect long-running connections for process 13852, application /apps/production/current (production)

The passenger-status page shows the following information during that time:

[Screenshot: passenger-status output]

The requests continue to pile up in the queue. One queue quickly reaches the 500-request limit, and the error log starts showing entries like this:

[ W 2022-11-24 16:23:24.2389 9559/Tc age/Cor/Con/CheckoutSession.cpp:266 ]: [Client 3-9385] Returning HTTP 503 due to: Request queue full (configured max. size: 500)

A few seconds later the new processes have spawned. However, instead of serving new requests, their Processed count stays at 0. We have let this run for 1 minute and still don't see any requests being served by the new instances:

[Screenshot: passenger-status output]

At this point the application is unresponsive. Since this is a production environment, we cannot let this run any longer, so we restart nginx.

My questions

I tried to provide as much detail as possible. If something is missing (Passenger config, etc.) please let me know and I will extend this question with the necessary info.


Question 2: Passenger version and integration mode:

Question 3: OS or Linux distro, platform (including version):

Question 4: Passenger installation method:

Question 5: Your app's programming language (including any version managers) and framework (including versions):

Question 6: Are you using a PaaS and/or containerization? If so which one?


Disclaimer: I asked this before on Stack Overflow and had not received a response after 12 days.

CamJN commented 1 year ago

Assuming that you have "smart spawning" enabled (https://www.phusionpassenger.com/docs/advanced_guides/in_depth/ruby/spawn_methods.html), this looks expected. The first process is used to fork additional processes quickly, instead of paying the startup time for every additional process. And yes, when you restart an app in Passenger OSS, you get downtime until the new worker processes are up. Rolling restarts address this, as you suspected. If you want the first process that comes up to start serving requests, you should switch to direct spawning.
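
For reference, a minimal sketch of what switching to direct spawning might look like in the nginx integration. The `passenger_spawn_method` directive is part of Passenger's nginx module; the `listen` port and the `root` path are assumptions pieced together from the deploy output in the report, not the reporter's actual config.

```nginx
server {
    listen 80;                              # assumed; not stated in the report
    server_name example.com;                # host name as shown in the deploy output
    root /apps/production/current/public;   # assumed: public dir under the reported app path

    passenger_enabled on;

    # Start every app process on its own instead of forking it from a
    # preloader process (the default spawn method is "smart").
    passenger_spawn_method direct;
}
```

The trade-off is that direct spawning loads the application from scratch for each process, so individual spawns are slower and processes do not share preloaded memory.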