orthecreedence / wookie

Asynchronous HTTP server in common lisp
http://wookie.lyonbros.com/
MIT License
189 stars 19 forks source link

closed streamish when running behind nginx? #71

Closed iconofsin closed 9 years ago

iconofsin commented 9 years ago

I first came across this when trying to replicate a production prototype (running on Amazon Linux) on my local workstation (running CentOS 6.6 x64). When accessing a Wookie app directly, everything works as expected. However, when accessing through nginx (which is the production scenario), every request produces an error like this one:

(event) Closed streamish being operated on: #<TCP-SOCKET {100342D653}>. (NIL)
(wookie-doc) UNcaught error: Closed streamish being operated on: #<TCP-SOCKET {100342D653}>

This specific message is from trying to run wookie-doc's head revision without any modifications and using the sample nginx config from Wookie's Best practices.

I'm pretty sure this is an OS configuration issue (like the file limit or something, but probably not file limit, shouldn't 130k be enough for this case?) because I have an app running on EC2 (behind nginx), and I have never seen anything even remotely resembling this error there.

Any ideas?

orthecreedence commented 9 years ago

That error says to me that Wookie is trying to send a response on a socket it believes to be open, but has been closed (either by the client, and then by nginx, or just by nginx itself).

I'm wondering if this is a keepalive issue. Do you have a keepalive set in nginx.conf/yoursite.conf? Eg.

    keepalive_timeout  10;

EDIT: The above directive would be in the http { ... } section

orthecreedence commented 9 years ago

Reading about this more, it seems keepalive_timeout is kind of a stab in the dark, and if it actually works I'm not sure why. It controls client <--> nginx connection timeouts, not nginx <--> backend (which is where the problem lies). It's possible that if the client keeps the connection to nginx, nginx will keep that connection open the the upstream server.

That's all assuming you don't have a keepalive_timeout set though. If you do, I'd want to see the headers the client is sending to nginx, and the headers nginx is sending to the upstream server.

iconofsin commented 9 years ago

OK, I managed to pull this from Wireshark:

GET / HTTP/1.1
Host: localhost
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.152 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: _pk_id.7.1fff=712a93e17748b9e4.1436960530.2.1437038497.1436960542.; _pk_ses.7.1fff=*

GET / HTTP/1.0
Host: localhost
X-Forwarded-For: 127.0.0.1
Connection: close
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.152 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: _pk_id.7.1fff=712a93e17748b9e4.1436960530.2.1437038497.1436960542.; _pk_ses.7.1fff=*

If I'm reading this correctly, nginx seems to send Connection: close when forwarding the request. keepalive_timout is set to 65; setting it to 0 and removing it altogether has no effect. Also, this is a fresh install of CentOS 6.6 (I figured the old system might have been broken).

UPD: Sending "Connection: close" is a normal behavior according to the RFC. So... unsure where to dig in next.

orthecreedence commented 9 years ago

Sorry for the delay, had some very busy weeks. You'll be happy to know I'm now getting this exact same problem in one of my apps, so hopefully a solution will follow.

orthecreedence commented 9 years ago

Can you try the master branch? I think this is fixed. My guess is that nginx closes the backend connection, and when the app itself (wookie) decides to close it, it's already closed. The solution is to check in the close function if it has already been closed before throwing an error. See https://github.com/orthecreedence/cl-async/commit/91789b0cd433b23df7dd8b686b18f1a7fecad4a4

orthecreedence commented 9 years ago

Closing for now, let me know if you experience this again.