phusion / passenger

A fast and robust web server and application server for Ruby, Python and Node.js
https://www.phusionpassenger.com/
MIT License
5.01k stars 547 forks source link

IOError: stream closed in another thread #2499

Open bantonj opened 1 year ago

bantonj commented 1 year ago

Issue report

Question 1: What is the problem? After upgrading to version 6.0.18 from 6.0.15 we started to see this exception in our logs IOError: stream closed in another thread. The backtrace is:

/gems/passenger-6.0.18/src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:133 in accept /gems/passenger-6.0.18/src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:133 in accept_and_process_next_request /gems/passenger-6.0.18/src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:110 in main_loop /gems/passenger-6.0.18/src/ruby_supportlib/phusion_passenger/request_handler.rb:434 in block (2 levels) in start_threads /gems/passenger-6.0.18/src/ruby_supportlib/phusion_passenger/utils.rb:113 in block in create_thread_and_abort_on_exception

Question 2: Passenger version and integration mode: open source 6.0.18 apache

Question 3: OS or Linux distro, platform (including version): Ubuntu 20.04 on arm

Question 4: Passenger installation method:

Your answer: [x] RubyGems + Gemfile [ ] RubyGems, no Gemfile [ ] Phusion APT repo [ ] Phusion YUM repo [ ] OS X Homebrew [ ] source tarball [ ] Other, please specify:

Plus passenger-install-apache2-module --apxs2-path='/usr/bin/apxs' and passenger-config build-native-support

Question 5: Your app's programming language (including any version managers) and framework (including versions): Ruby 2.7.7

Question 6: Are you using a PaaS and/or containerization? If so which one? Docker based on ubuntu:20.04

CamJN commented 1 year ago

I wonder if this is related: https://github.com/phusion/passenger/commit/213c9a0669ada9defd045151feaf96fe6027de59

Would you mind editing src/ruby_supportlib/phusion_passenger/utils.rb and commenting out PhusionPassenger.call_event(:unhandled_exception_before_exit, e) and see if the error goes away?

toncid commented 8 months ago

@bantonj Any new findings that you might share?

@CamJN We are also getting the same error in cases when the application is under heavier load and then being restarted following a restart.txt touch.

As if there is a race condition between application unloading and web server accepting the next request.

Worker processes remain in this status indefinitely:

  * PID: 550179   Sessions: 1       Processed: 312     Uptime: 37m 42s
    CPU: 21%     Memory  : 206M    Last used: 32m 44s ag
    Shutting down...

Setup: Passenger Standalone v6.0.20, Amazon Linux 2023, Ruby 3.2.2, x86_64.

I have tried commenting the line in utils.rb (line 118), but it is after line 113 highlighted in the stack trace:

App 517567 output: [ 2024-03-14 10:13:08.5795 517567/0x00007ff93bc369a0(HTTP helper worker) utils.rb ]: *** Exception IOError (stream closed in another thread) (process 517567, thread 0x00007ff93bc369a0(HTTP helper worker)):
.../passenger-6.0.20/src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:133:in `accept'
.../passenger-6.0.20/src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:133:in `accept_and_process_next_request'
.../passenger-6.0.20/src/ruby_supportlib/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
.../passenger-6.0.20/src/ruby_supportlib/phusion_passenger/request_handler.rb:434:in `block (2 levels) in start_threads'
.../passenger-6.0.20/src/ruby_supportlib/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'  <<<<<<<<<
.../newrelic_rpm-9.7.1/lib/new_relic/agent/tracer.rb:434:in `block (2 levels) in thread_block_with_current_transaction'
.../newrelic_rpm-9.7.1/lib/new_relic/agent/tracer.rb:357:in `capture_segment_error'
.../newrelic_rpm-9.7.1/lib/new_relic/agent/tracer.rb:433:in `block in thread_block_with_current_transaction'
CamJN commented 6 months ago

I can reliably reproduce this with ruby 3.2.4 on macOS just by running rake test:ruby. @FooBarWidget any idea what this is?

FooBarWidget commented 6 months ago

When the app process starts, it creates a server socket and multiple threads. Each thread accepts connections from that single server socket. The main thread then waits until the Passenger Core tells us to shut down.

Upon receiving the shutdown signal, the main thread first terminate all threads by calling Thread#raise followed Thread#join. After that, it closes the server socket.

So I don't really know why one would get a "stream closed in another thread" error from one of the threads. By the time we close the server sockets, the threads should already have terminated.

I can't reproduce the issue. Maybe someone who can, can help by increasing log level to 7.