bgarlock / mod-spdy

Automatically exported from code.google.com/p/mod-spdy
0 stars 0 forks source link

Failed to start new worker thread #55

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What version/revision number of mod_spdy are you using?
mod-spdy-beta 0.9.3.2-r376

What version of Apache are you using, and on what operating system?  (Use
`apache2ctl -v` to check.)
Server version: Apache/2.2.16 (Debian)
Server built:   Apr  1 2012 06:40:08

What other Apache modules are you using?  (Use `apache2ctl -M` to check.)

Loaded Modules:
 core_module (static)
 log_config_module (static)
 logio_module (static)
 mpm_worker_module (static)
 http_module (static)
 so_module (static)
 alias_module (shared)
 auth_basic_module (shared)
 authn_file_module (shared)
 authz_host_module (shared)
 authz_user_module (shared)
 cgid_module (shared)
 deflate_module (shared)
 dir_module (shared)
 env_module (shared)
 headers_module (shared)
 include_module (shared)
 info_module (shared)
 mime_module (shared)
 negotiation_module (shared)
 proxy_module (shared)
 proxy_balancer_module (shared)
 proxy_http_module (shared)
 reqtimeout_module (shared)
 setenvif_module (shared)
 spdy_module (shared)
 ssl_module (shared)
 status_module (shared)
 unique_id_module (shared)

What browser version did you use to access the mod_spdy server?  On what
operating system? What flags was the browser invoked with?  (For
Chrome/Chromium, go to about:version to check.)

Chrome 21.0.1180.89 (Official Build 154005)
no special flags

Firefox/Iceweasel 15.0.1
confirmed with network.http.spdy.enabled.v3 enabled or not (eg: spdy v2 or v3)

What steps will reproduce the problem?
1. Load a "heavy" page (~50 uncached images) via spdy.
2. Wait.
3. Try it again.
4. Repeat.  Eventually, the heavish page will load in sequence rather than in 
parallel and the error message will be displayed.

What is the expected result? What do you see instead?

A nice parallel load of all the images on the page.

Please provide any additional information below.

I'm seeing something similar to issue 53, but not quite the same so I'll file 
them separately.

The architecture is as follows:
- 1 apache reverse proxy using mpm-worker and mod_spdy
- 1 (or more) apache backends using mpm-prefork and mod_php5 (debian squeeze, 
else I'd probably be using php-fpm).
- all connections to the backend go through the proxy (the backend apaches are 
running on unrouted ipv6 addresses)

We have a test page setup to load 50 images with some small sleep time for each 
of them (so we can see the difference).  This is basically our equivalent of 
the "flags page" test in the demo video.

Upon an initial proxy server restart, the page appears to load nice and speedy. 
 Images on the test page are loaded in a random order and in parallel.  
However, after some period of time, the images are no longer loaded in 
parallel.  Instead, one can basically see it loading one image after another.  
However, if you use a non-spdy enabled browser, the images are still loaded in 
parallel (due to multiple browser https connections).  There are plenty of 
resources (and config settings) available on the backends to handle more 
connections during this test, so they're not the problem.  While the spdy issue 
is going on the following messages are reported by the apache proxy:

[Tue Sep 25 16:16:07 2012] [info] [client fded:ffff::dead:beef] 
[mod_spdy/0.9.3.2-376] [20564:20603:INFO:mod_spdy.cc(510)] Starting SPDY/3 
session
[Tue Sep 25 16:16:08 2012] [error] [client fded:ffff::dead:beef] 
[mod_spdy/0.9.3.2-376] [20564:20603:ERROR:thread_pool.cc(375)] Failed to start 
new worker thread.
[Tue Sep 25 16:16:08 2012] [error] [client fded:ffff::dead:beef] 
[mod_spdy/0.9.3.2-376] [20564:20603:ERROR:thread_pool.cc(375)] Failed to start 
new worker thread.
(repeat)

If I crank up LogLevel to debug, I'll also see some of these, but not directly 
around the "Failed to start new worker thread" messages (hence kinda like Issue 
53):

[Tue Sep 25 15:55:23 2012] [info] [client fded:ffff::dead:beef] (9)Bad file 
descriptor: SSL input filter read failed.
[Tue Sep 25 15:55:23 2012] [error] [client fded:ffff::dead:beef] 
[mod_spdy/0.9.3.2-376] [32466:32482:ERROR:apache_spdy_session_io.cc(88)] 
ap_get_brigade failed with status 9: Bad file descriptor
[Tue Sep 25 15:55:23 2012] [info] [client fded:ffff::dead:beef] (9)Bad file 
descriptor: core_output_filter: writing data to the network
[Tue Sep 25 15:55:23 2012] [debug] mod_spdy/apache/log_message_handler.cc(106): 
[client fded:ffff::dead:beef] [mod_spdy/0.9.3.2-376] 
[32466:32482:VERBOSE2:apache_spdy_session_io.cc(177)] ap_pass_brigade returned 
ECONNABORTED
[Tue Sep 25 15:55:23 2012] [info] [client fded:ffff::dead:beef] 
[mod_spdy/0.9.3.2-376] [32466:32482:INFO:mod_spdy.cc(524)] Terminating SPDY/3 
session

Restarting the apache proxy restores the nice speedy load times under spdy for 
a bit, but then this happens again after some period.

My test requests are the only ones currently hitting the server.  There are 49 
idle worker threads according to /server-status on the proxy, though as noted 
in Issue 54, details on spdy seem to be missing from mod_status.

The backend has 22 idle workers.

I've seen this both with the 

SpdyMaxThreadsPerProcess 30

line in the conf commented (whatever the defaults currently are) or 
uncommented.  All other settings (aside from SpdyDebugLoggingVerbosity) are 
left as their defaults as distributed by the mod-spdy-beta deb (currently 
version 0.9.3.2-r376).

Let me know if you need any more details.

Thanks,
Brian

Original issue reported on code.google.com by bpkr...@gmail.com on 25 Sep 2012 at 9:44

GoogleCodeExporter commented 8 years ago
Eek.  I think I know what's going on here; fortunately, I think I know how to 
fix it.  Working on it now.

When you say "this happens again after some period", how long a period do you 
mean?  Minutes/hours/days?

Original comment by mdste...@google.com on 26 Sep 2012 at 12:56

GoogleCodeExporter commented 8 years ago
mod-spdy@googlecode.com <mod-spdy@googlecode.com> 2012-09-26 00:56:

Minutes to hours.  I haven't exactly narrowed down what triggers it, but 
just leaving a browser tab open and idle and then hitting refresh after 
some period, say 15 minutes, seems to do it fairly consistently.

Original comment by bpkr...@gmail.com on 26 Sep 2012 at 2:30

GoogleCodeExporter commented 8 years ago
Thanks again for the detailed report.  This is (hopefully) fixed in trunk as of 
r379, and this fix now appears in the latest 0.9.3.3 source release that was 
just made.  I'm hoping to get binaries out soon, but if you can't wait you can 
try building this new release from source using the instructions here: 
http://code.google.com/p/mod-spdy/wiki/GettingStarted

Of course, please do let me know if the problem persists with the new version!

Original comment by mdste...@google.com on 4 Oct 2012 at 5:11

GoogleCodeExporter commented 8 years ago
mod-spdy@googlecode.com <mod-spdy@googlecode.com> 2012-10-04 17:11:

Thanks.  I'm kinda swamped at the moment, so I'll probably just wait 
until the debs are released to test it.

Original comment by bpkr...@gmail.com on 5 Oct 2012 at 7:13