cherokee / webserver

Cherokee Web Server
GNU General Public License v2.0
504 Gateway error #502

Open danielniccoli opened 11 years ago

danielniccoli commented 11 years ago

Original author: plundis@areaindex.com (June 23, 2009 16:37:25)

What steps will reproduce the problem?

  1. Run Cherokee in reverse proxy mode
  2. Randomly see 504 Gateway error even though backend server seems fine and responsive.

What is the expected output? What do you see instead? Expect to see content from the backend server behind Cherokee.

What version of the product are you using? On what operating system? .99-19 Ubuntu 8.04

Please provide any additional information below. A few items to improve with the 504 Gateway error:

  1. Are we retrying the processing of the request more than once before throwing this error?
  2. Can we provide a user setting to provide a custom HTML screen should such error occur?

Original issue: http://code.google.com/p/cherokee/issues/detail?id=504

danielniccoli commented 11 years ago

From sciyoshi on July 17, 2009 03:40:50 I can confirm this behavior with Cherokee 0.99.20-1 in reverse proxy mode for a Django backend.

danielniccoli commented 11 years ago

From alobbs on September 14, 2009 13:29:39 I've been investigating this issue for a while.

So far, it's been quite tough to reproduce. Do you guys know some way to get the server to fail and return those 50x errors?

Change-set 3661 could help, although I haven't been able to confirm it yet:

http://svn.cherokee-project.com/changeset/3661

Any feedback on this issue is more than welcome.

danielniccoli commented 11 years ago

From jja...@gmail.com on September 22, 2009 00:07:03 I can confirm the bug with cherokee 0.99.22 without modifications, running under Opensolaris 2009.06 and with several backends: drupal, dokuwiki, cgi and static. The website is under test and it has no load at all.

I will test latest changes and report here.

danielniccoli commented 11 years ago

From windspi...@gmail.com on October 11, 2009 16:05:17 Another confirmation: I have 50x errors in

cherokee-0.99.22 + gentoo + fcgi(via 127.0.0.1:9000) + php + apc

and I don't know how to reproduce it.

danielniccoli commented 11 years ago

From ste...@konink.de on October 11, 2009 16:07:18 So why don't we all start to upgrade to at least the latest stable version and try again?

danielniccoli commented 11 years ago

From windspi...@gmail.com on October 14, 2009 03:41:53 Unfortunately, Gentoo's portage was a little bit outdated and only had 0.99.22.

But after I created a new ebuild and installed cherokee 0.99.24, it got much better!

I noticed only 1 gateway-timeout error during a whole day of heavy usage =) and I believe it was not Cherokee's fault that time.

danielniccoli commented 11 years ago

From no.and...@gmail.com on October 27, 2009 17:51:14 Guys, the 504 error really annoys me. And it's not a backend problem. Debian 5, cherokee 0.99.22.

danielniccoli commented 11 years ago

From ste...@konink.de on October 27, 2009 17:54:48 I guess Comment 6 still applies ;)

danielniccoli commented 11 years ago

From alobbs on October 28, 2009 00:49:48 It seems there are people who have hit the issue with Cherokee 0.99.25. Looks like we haven't caught the bug yet.

danielniccoli commented 11 years ago

From skar...@gmail.com on October 28, 2009 11:40:16 Mmmmm... I'm getting 504 errors after the 0.99.25 update... But I'm not using a proxy, only one Information Source for PHP.

I don't remember what trunk revision I was using before... argh! :-(

danielniccoli commented 11 years ago

From alobbs on October 28, 2009 13:07:22 Antonio, have you found a consistent way to reproduce the issue? Yesterday I spent a whole lot of time trying to reproduce the issue with no luck.

danielniccoli commented 11 years ago

From pubcrawl...@gmail.com on October 28, 2009 13:16:19 Has anyone considered building an entirely new configuration file from scratch to repeat your existing config?

It's worth trying. I think it might have something to do with the random errors somehow- just unsure how.

Anyone want to test this theory on a Cherokee install that is failing?

danielniccoli commented 11 years ago

From skar...@gmail.com on October 28, 2009 13:20:26 Álvaro: no, I updated Cherokee on my production server from an unknown trunk revision to HEAD, and after that I'm getting random 504 errors.

danielniccoli commented 11 years ago

From tah...@gmail.com on October 28, 2009 13:27:30 My issues with this matter are like Skarcha's. No proxy involved.

100% information source related, PHP in this case. Manually killing the idle php-cgi instances immediately restores responsiveness.

danielniccoli commented 11 years ago

From lnu...@gmail.com on October 28, 2009 15:07:07 php-cgi + 0.99.25-1~hardy~ppa1: I have 1 php information source and no problems.

danielniccoli commented 11 years ago

From alobbs on October 28, 2009 16:59:34 Let's see whether this patch helps us to locate the issue. I haven't managed to reproduce it more than 2 or 3 random times..

danielniccoli commented 11 years ago

From skar...@gmail.com on October 29, 2009 08:12:24 Patch applied...

danielniccoli commented 11 years ago

From skar...@gmail.com on October 29, 2009 23:30:23 I just committed a patch (r3758: http://svn.cherokee-project.com/changeset/3758) to fix (yes, again... :( ) POST issues.

My previous patches (r3655 and r3657) were bad... sorry guys! I don't know how my tests passed before I committed them.

Maybe it doesn't help with this bug, but some PHP software (like rgtui) works without 504 errors now.

Could anyone give it a try?

danielniccoli commented 11 years ago

From alobbs on October 30, 2009 16:16:39 Just uploaded a related QA test:

http://svn.cherokee-project.com/browser/cherokee/trunk/qa/231-POST-4extra.py

danielniccoli commented 11 years ago

From mict...@gmail.com on November 18, 2009 23:03:12 I don't know if this is the same issue, but I have laaaaarge php scripts, I have this:

Linux arq-mpacheco 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64 GNU/Linux

Running:

Cherokee Web Server 0.99.27

I configured the php.ini of the cgi (e.g. max_execution_time=180), which is enough under Apache.

I tried two scripts:

<?php

    sleep(15);

seems to work

<?php

    sleep(20);

gets a 504 Gateway Timeout

what should I do?
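The sleep(15) versus sleep(20) comparison above suggests a cutoff somewhere between the two delays. A minimal Python sketch for narrowing that down without PHP: a stand-in backend that sleeps for a requested number of seconds, plus a helper that times a request and reports the status code. Everything here is hypothetical scaffolding (the `SlowHandler`, `start_slow_backend` and `timed_get` names and the `delay` query parameter are made up for illustration); it is not part of Cherokee or its QA suite.

```python
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical test backend: answers GET /?delay=N after sleeping N seconds."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        delay = float(query.get("delay", ["0"])[0])
        time.sleep(delay)  # simulate a slow script, like sleep(20) in PHP
        body = f"slept {delay} seconds\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def start_slow_backend(port=0):
    """Start the backend on 127.0.0.1; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def timed_get(url, timeout=120):
    """Return (HTTP status, elapsed seconds) for a GET request to url."""
    # An empty ProxyHandler ignores any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        with opener.open(url, timeout=timeout) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # e.g. 504 when the front-end gives up first
    return status, time.monotonic() - start
```

With the backend placed behind a reverse proxy rule, calling `timed_get` on the front-end URL for delays such as 15, 20 and 35 seconds would show whether the failure tracks the configured timeout. For a FastCGI setup, `timed_get` alone can be pointed at the existing PHP scripts.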

danielniccoli commented 11 years ago

From mict...@gmail.com on November 18, 2009 23:09:01 Sorry, my issue is actually 499 (http://code.google.com/p/cherokee/issues/detail?id=499&can=1&q=gateway%20error) (Github: #499)

danielniccoli commented 11 years ago

From alobbs on November 21, 2009 14:25:25 Bug 499 is most likely tightly related to this one:

http://code.google.com/p/cherokee/issues/detail?id=499 (Github: #499)

Could you guys please try setting a higher timeout in the FastCGI and SCGI rules? Odds are that will solve the issue, although I'm not certain. Some testing is required.

danielniccoli commented 11 years ago

From jja...@gmail.com on November 25, 2009 03:45:06 I just discovered that my reverse proxy (cherokee 0.99.29) usually sends a 5xx gateway timeout or bad gateway when trying to POST a big text (when editing a wiki, for example).

danielniccoli commented 11 years ago

From ste...@konink.de on November 25, 2009 08:00:09 Could this possibly be related to the maximum header length?

danielniccoli commented 11 years ago

From alobbs on November 25, 2009 08:16:33 Something funny must be happening in the post manager. I'm checking it out right away..

danielniccoli commented 11 years ago

From alobbs on November 25, 2009 09:14:33 @jjamor: A few days ago, I committed a few patches adding custom timeout support to rule entries. That means a specific rule can define a timeout value, just as it can define an encoding method or a handler plug-in.

The goal was to allow PHP connections to last longer than the general timeout limit, so the server wouldn't reply with a 504 error if php-cgi was still processing the request (or had hung) when the timeout limit was reached.

I've just realized that you might be hitting that issue in the back-end servers. Depending on the wiki software's processing speed, the back-end web server might be returning 504 errors whenever that processing takes longer than the general timeout limit. In that case, the front-end proxy server would just be relaying it to the outside world.

I have not tested that theory yet, but it makes good sense to me.

The best way to test it would be to add a custom timeout to the Extension PHP rule of the back-end servers. A value of 35 seconds would be enough (since php.ini sets a processing limit of 30 seconds by default). Hopefully that will fix the issue.

Cherokee 0.99.30 will ship an improved version of the PHP wizard. It checks the php.ini file in order to figure out the right timeout value and sets it on the rule, so the server does not return 504 errors when PHP scripts take a "long" time to execute (longer than the server-wide timeout limit).

BTW, I'm talking about PHP, but the underlying idea applies to any other scripting language as well.
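The rule of thumb above (a rule timeout slightly longer than the script's processing limit) can be sketched in Python. This is only an illustration of the idea, not the PHP wizard's actual code: the function names and the 5-second margin are assumptions, with the margin chosen to match the 35-versus-30 example above.

```python
import re

PHP_DEFAULT_LIMIT = 30  # seconds; the php.ini default mentioned above


def php_max_execution_time(php_ini_text):
    """Return the last max_execution_time set in php.ini text, else the default."""
    limit = PHP_DEFAULT_LIMIT
    for raw_line in php_ini_text.splitlines():
        line = raw_line.split(";", 1)[0].strip()  # drop ';' comments
        match = re.fullmatch(r"max_execution_time\s*=\s*(\d+)", line)
        if match:
            limit = int(match.group(1))  # later settings override earlier ones
    return limit


def suggested_rule_timeout(php_ini_text, margin=5):
    """Suggest a per-rule timeout in seconds, or None if PHP has no limit.

    The margin (an assumed value) keeps the web server waiting slightly
    longer than PHP itself, so PHP's own limit fires first.
    """
    limit = php_max_execution_time(php_ini_text)
    if limit == 0:  # 0 means "no limit" in PHP, so no finite timeout fits
        return None
    return limit + margin
```

For a php.ini containing `max_execution_time = 180`, `suggested_rule_timeout` would propose 185 seconds for the rule. The same calculation applies to any other scripting back-end that has its own processing limit.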

danielniccoli commented 11 years ago

From jja...@gmail.com on November 25, 2009 20:21:12 @alobbs, does this apply to my backend, which is using Moin (python w/fcgi, not php)? In any case, I have tested submitting the big POSTs directly to the backend, and the errors do not occur.

danielniccoli commented 11 years ago

From skar...@gmail.com on November 25, 2009 23:22:32 @jjamor, I think you must apply it both to your backends and to the reverse proxy rule on the proxy server.

danielniccoli commented 11 years ago

From jja...@gmail.com on November 26, 2009 00:10:49 @alobbs, @skarcha: in the end the backend was moved to apache, because it was impossible to get it to work with the spawn-fcgi binary (see issue 591 (Github: #572)). However, the 5xx errors are still here, but only when using the reverse proxy (with Cherokee).

I'm going to try compilation of the SVN version and test it ...

danielniccoli commented 11 years ago

From jja...@gmail.com on November 26, 2009 03:09:40 Ok, compiled and set up a timeout of 60 seconds.

The problem still exists: sometimes the proxy server responds with a Bad gateway error. The response is immediate. Judging from the logs, the backend server always responds correctly.

I'm going to switch the reverse proxy to apache until the problem is really fixed.

danielniccoli commented 11 years ago

From skar...@gmail.com on February 12, 2010 00:32:14 @jjamor: Do you still have this problem?

danielniccoli commented 11 years ago

From jja...@gmail.com on February 12, 2010 08:39:30 I have been running cherokee 0.99.41 since 2010-01-26 and have not needed to restart it since then. However, previous versions sometimes worked normally for a month or so and then started to serve 5xx errors until they were restarted. I'd wait a few more weeks before closing this issue.

danielniccoli commented 11 years ago

From skar...@gmail.com on February 12, 2010 08:47:26 Great! Thanks for the follow up ;)

danielniccoli commented 11 years ago

From tah...@gmail.com on June 07, 2010 14:42:59 It seems this was finally fixed in February. Can anybody confirm it so we can close the bug? Thanks guys.

danielniccoli commented 11 years ago

From hendriks.luuk@gmail.com on July 12, 2010 10:16:55 I think I'm suffering this issue now. With a fresh Drupal install, going through the installation steps ends up in a 504 when adding the 'Administrator account' (last step). After that, the Drupal install is available though. But when I try to register a new account, I end up with a 504 after the submit. The error log doesn't show anything. I will provide a trace in a few minutes.

Software and versions:
Cherokee 1.0.5 (from source)
Drupal 6.17 (used the Cherokee-admin wizard to configure it)
Ubuntu Linux (10.04, aptitude updated just today)
php-cgi -v: PHP 5.3.2-1ubuntu4.2 with Suhosin-Patch (cgi-fcgi)

danielniccoli commented 11 years ago

From hendriks.luuk@gmail.com on July 12, 2010 10:35:41 Here is the trace. (used CHEROKEE_TRACE="all")

danielniccoli commented 11 years ago

From alobbs on July 12, 2010 10:59:15 @hendriks.luuk: so, it took ~30 seconds until the error message showed up, didn't it?

danielniccoli commented 11 years ago

From hendriks.luuk@gmail.com on July 12, 2010 12:12:40 Indeed, I checked it with another user registration attempt, it took 30 seconds before the 504 showed up.

danielniccoli commented 11 years ago

From hendriks.luuk@gmail.com on July 22, 2010 07:29:53 I tried the same thing on CentOS with Cherokee 1.0.0 (from the EPEL repo) and PHP 5.2 (from Utter-ramblings). No 504 with that configuration!

danielniccoli commented 11 years ago

From henrik.e...@gmail.com on August 25, 2010 12:08:22 I'm having the same issue when performing a long operation (downloading and adding user avatars to the database). The task itself (PHP) finishes, but the webserver gives up after almost exactly 30 seconds (with the infamous 504 Gateway timeout error).

danielniccoli commented 11 years ago

From henrik.e...@gmail.com on August 25, 2010 12:09:11 Oh, using Cherokee 1.08 on OS X 10.6.

danielniccoli commented 11 years ago

From ste...@konink.de on August 25, 2010 12:11:15 So what is your php execution timeout?

danielniccoli commented 11 years ago

From ggd...@gmail.com on September 18, 2010 07:23:29 Cherokee-1.0.8 and experiencing similar issue when using CMS Made Simple PHP app and working within its admin while downloading/installing add-on modules.

Timeout in Cherokee is set to 30secs, and in php.ini:

default_socket_timeout = 60
max_execution_time = 60

danielniccoli commented 11 years ago

From henrik.e...@gmail.com on September 18, 2010 08:46:41 Execution timeout is set to 3600 seconds. Socket timeout, the same.

danielniccoli commented 11 years ago

From ggd...@gmail.com on October 04, 2010 05:44:02 @alobbs. "So far, it's been quite tough to reproduce. Do you guys know some way to get the server to fail and return those 50x errors?"

I'd say: "Install the latest CMS Made Simple (1.8.2) - http://www.cmsmadesimple.org/downloads/ - it's really a 5-minute install, and then try to install some modules from its admin manager. I regularly get those 504 errors."

Sincerely, Gour

danielniccoli commented 11 years ago

From karion.s...@gmail.com on November 11, 2010 10:36:31 It's strange, because on my VPS I have Cherokee 0.99.39-4.1 installed on Ubuntu 10.04 LTS and it never returns a 504 Gateway timeout.

But on other servers with 1.0.8 it usually returns one, on all sites, when a request takes a long time to run. It happens especially with Joomla-based sites.

danielniccoli commented 11 years ago

From dait...@gmail.com on November 28, 2010 23:10:40 I'm trying to serve two different Django applications on a single physical server through 2 different hostnames (both dyndns).

I'm using uwsgi to run the Django apps themselves. I have two separate configurations for this, with the only differences being that they use different ports and, obviously, that they point to different Django apps.

I have defined both these uwsgi configs as information sources for Cherokee. Again, no difference except for the name of the config file used.

I have defined two vhosts in Cherokee, again with no differences except their domains and which information source they use.

One of them works fine, the other one has never given me anything but 504 errors! Truly astounding...

Both Django apps run fine using their own development server.

danielniccoli commented 11 years ago

From alobbs on November 29, 2010 08:07:54 @daitake (on comment 49): Looks like the second information source is wrong and Cherokee cannot access the uWSGI back-end. Does it use a different port? Are you certain that the uWSGI server is listening on that second port?

danielniccoli commented 11 years ago

From dait...@gmail.com on November 29, 2010 16:58:55 netstat seems to tell me that yes, uwsgi is listening on both ports:

tcp 0 0 localhost.localdo:48337 *:* LISTEN 2416/uwsgi
tcp 0 0 localhost.localdo:48338 *:* LISTEN 2444/uwsgi

48337 works, 48338 doesn't, but I don't know enough about uwsgi to test that without Cherokee.

I assumed I would have to put them both on different ports, because otherwise, how would uwsgi know which Django installation to serve? Was I wrong?
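One way to check the second port without Cherokee is a plain TCP connect from the same machine. A minimal Python sketch, assuming the host name passed in matches whatever the Information Source is configured with; the function name is made up for illustration. It only shows whether something accepts connections on the port. It does not speak the uwsgi protocol, so it cannot tell whether the Django app behind the port is healthy.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and tries each address in turn
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unresolvable host, ...
        return False
```

Running it for both 48337 and 48338, with the exact host string used in each Information Source, would show whether the two ports behave differently at the TCP level. If both accept connections, the difference is more likely in the uwsgi or Cherokee configuration than in the listening sockets.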

danielniccoli commented 11 years ago

From alobbs on November 29, 2010 17:16:58 @daitake Could you please double-check that Cherokee has two different Information Sources, that each one of them points to one of those TCP ports, and that the uWSGI rules exporting them use the right source in each case?