Closed: viktordick closed this issue 4 years ago.
So there are three places to care about.
In general (arbitrary WSGI apps) there's always been an exception raised by webob if the socket is closed while reading the body. In waitress, this will never happen in your application thread because it buffers the request prior to running your app code.
In waitress 1.3 we added a new feature in which an exception will be raised if a write occurs to a closed socket which propagates up to the application. Unfortunately it requires an attempted write (normally at the end of processing). Attempted writes cannot occur at least until headers have been set via start_response.
I think there is a missing feature in which you could look in the environ and see a flag about whether the client socket has closed. This would need to be a custom flag; there's no standard WSGI flag to poll/check related to this.
Yes, the end of processing is usually too late to actually lessen the strain on the server.
I was able to implement this feature, but it requires marking an HTTPChannel as readable even while a request is being processed. If I understand correctly, a channel is designed to be able to process multiple requests over the same connection, but since the service method uses a task_lock, only one task will be active at any time. So I hope this will not cause any problems.
Are you interested in a pull request for this?
A channel is open for keep-alive/pipelining but as you noted each request/task is executed serially from a channel. I'm not sure why it needs to be marked readable, maybe just to detect a disconnect? I want @bertjwregeer to give his 2 cents but I think the goal is worthwhile. I think the main sticking point is on how it may affect performance and how it bubbles up to the WSGI app. Probably via a function call available from the environ like environ['waitress.is_client_alive']().
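For illustration, a minimal sketch of how an application might consume such a flag, assuming the hypothetical environ key mentioned above were added; nothing here is existing waitress API:

def app(environ, start_response):
    # Fall back to "always alive" if the (hypothetical) key is absent.
    is_client_alive = environ.get('waitress.is_client_alive', lambda: True)
    start_response('200 OK', [('Content-Type', 'text/plain')])

    def body():
        for chunk in expensive_work():
            if not is_client_alive():
                return              # client is gone, stop wasting CPU early
            yield chunk

    return body()

def expensive_work():
    # Stand-in for the long-running processing described in this thread.
    for i in range(1000):
        yield b'...'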
If we mark the channel as "readable" it means we get notified when there is new data to read. If we then don't read that data, we end up basically looping at 100% CPU time because select()/poll() will continue to notify us. We also don't want to allow a client to keep pipelining HTTP requests that we have to buffer, potentially causing a DDoS against the server, which has nowhere to store all that information.
The best way to tell the remote client to slow down is to simply stop reading the packets from the wire, until the previous request is finished processing.
The reason it needs to be marked as either readable or writeable is so that a disconnect is bubbled up from select()/poll(); it isn't until you attempt some sort of action on a socket that the OS will send a notification that the socket has entered an error state.
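A toy event loop (not waitress code) illustrating both points: unread data makes select() return immediately on every pass, while a peer disconnect only surfaces once the socket is watched and read:

import select

def toy_loop(socks):
    while socks:
        readable, _, _ = select.select(socks, [], [], 1.0)
        for sock in readable:
            # Skipping this recv() while data is pending would make select()
            # fire again immediately -> a 100% CPU busy loop.
            data = sock.recv(4096)
            if not data:
                # Zero bytes: the peer closed the connection.  This is the
                # only place the disconnect ever becomes visible.
                sock.close()
                socks.remove(sock)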
Feel free to make a PR, but it is very unlikely that I will accept it as is.
What about writing out 0 bytes or a newline or something? What did zope do? I assume read since OP suggested it.
Thanks I did not think about that.
However, if all an attacker wants to do is flood the server's memory, they could also send one huge request instead of pipelining many requests, I guess? Not sure if there are any limits implemented in waitress, even though I can see that it would be easier to implement limits for a single request (checking the Content-Length and rejecting immediately if it is too large - not sure how chunking would be handled here).
I read up a little bit on HTTP and in particular it seems that no browser actually uses pipelining by default with HTTP/1.1. Therefore, waitress could handle almost all use cases by keeping the socket readable even if a request is already queued, but only for at most one read operation. The condition "there is no request being processed" in readable() would become "there is no request being processed or there has been at least one read beyond the current request".
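A rough sketch of that proposed condition (my reading of it: allow at most one read-ahead), using made-up attribute names rather than waitress's actual channel internals:

class Channel:                      # hypothetical, not waitress's HTTPChannel
    def readable(self):
        if not self.request_in_progress:
            return True             # current behaviour: idle channels read
        # Proposed addition: while a request is running, allow at most one
        # further read so a disconnect can still be observed, without
        # buffering an unbounded amount of pipelined data.
        return self.reads_beyond_current_request < 1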
That way, a disconnect after the current request could still be detected. If a client uses pipelining and sends N requests through the same connection without waiting for the response after each one, and disconnects afterwards, waitress would not be able to detect this and would be forced to process the first N-1 requests, with only the last one discarded (although since the response to the first request could not be written, it would probably only process the first one). So this case would be unchanged from the current behavior.
However, according to Wikipedia, this is not a real-world scenario as long as HTTP/2 is not supported, and supporting that would require changing the whole structure anyway, I guess?
What about writing out 0 bytes or a newline or something? What did zope do? I assume read since OP suggested it.
Zope 2 did not handle this situation itself, we always patched our installation. The patched place was somewhere around here. I am not the author of the original patch and have not taken the time to completely understand how it works. I am not actually sure if it was susceptible to DoS attacks. I am now trying to get a solution for Zope 4 with waitress, and if possible not simply by patching our installations but by contributing to the community.
I'd have to see the patch to get an idea of how that changed the Zope behaviour. Waitress already rejects requests that are too large.
Yes, real-life clients may not use HTTP request pipelining; however, it is explicitly allowed by the standard and waitress implements it. I am not going to change its behaviour in a way that may lead to potential vulnerabilities just because mainstream clients don't implement it.
Most if not all of the reverse proxies on the market do support HTTP pipelining and in fact use it fairly extensively with keep-alive requests to allow them to keep a connection open (I don't know if they send requests rapid fire though... so maybe not as much of a concern, but still).
The condition "there is no request being processed" in readable() would become "there is no request being processed or there has been at least one read beyond the current request".
But even if there is one read beyond the current request, that doesn't mean the remote client is still connected; let's say this happens:
This uses the new logic you just created in step 6; it just checks that there is more to read beyond the current request. To actually make the client having gone away be something you can see in your work thread, we'd have to actually KEEP reading from the socket so that we can continue to have the socket be in the select() call, so that when it goes away we get an error from read() to process.
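In other words, a simplified sketch (not waitress code) of what "keep reading" means here: the disconnect only becomes observable by continuing to read until the peer closes.

def drain_until_closed(sock, on_disconnect):
    while True:
        try:
            data = sock.recv(4096)
        except OSError:
            on_disconnect()         # socket entered an error state
            return
        if not data:
            on_disconnect()         # orderly close from the peer
            return
        # Anything read here is pipelined request data that has to be
        # buffered somewhere -- which is exactly the concern raised above.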
Let me think about this a little bit and see if there is a good way to make this happen with the Unix socket options and one that is portable across operating systems, but your simplistic change unfortunately is not tenable.
Unfortunately the socket APIs are really bad around this. @mmerickel's suggestion to send whitespace works, but it is crude and means you end up doing all of your processing in a loop in __iter__ or by looping over the WSGI write.
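A crude sketch of that whitespace workaround, assuming the response format tolerates the padding and that each yielded chunk actually reaches the socket:

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])

    def body():
        # expensive_work(): hypothetical generator doing one step of real
        # work per iteration.
        for _ in expensive_work():
            # Each yield lets the server attempt a write; in waitress 1.3+
            # a write to a closed socket raises, the generator is closed,
            # and the remaining steps never run.
            yield b' '
        yield b'real result\n'

    return body()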
I am not going to change its behaviour in a way that may lead to potential vulnerabilities just because mainstream clients don't implement it.
Yes, but unless I overlooked something, the proposed change would not change anything if the client uses pipelining. However, it would also not help in that case: a client disconnect would only be detected for the currently running request if no requests beyond it have already been pipelined. Under the assumption that pipelining does not occur in real life, this would be a satisfactory solution, but I was not aware that reverse proxies use it.
If there is a better solution to detect a socket close without reading everything from it and without polling at 100% CPU, this of course would be better, but I am not aware of such a possibility if the basic loop uses select.
I will create a draft PR with the outlined idea once I get to it - which will probably not be today, but it should be this week.
The ZServer / ZRendezvous patch I made is a bit more complex than described above, but only because it also handles ugly details like database backend termination, which are of no concern here (yet).
I was lucky that the Medusa ZServer reliably calls "close()" on its http_channel when the client disconnects (i.e., it keeps the socket readable at all times). Therefore, my ZServer patch only made additions to the http_channel.close() method, and added some bookkeeping on the current request. This amounts to the approach already suggested by @viktordick.
To protect against a DoS by way of many small requests as described by @bertjwregeer , I suggest introducing a maximum queue length variable, which prevents the server from reading the socket further.
Reaching the queue length would blind the server not only to more requests, but also to disconnect signals, but this is acceptable in my view. As long as we are below the maximum queue length, this would provide the same reliable disconnect detection that Medusa/ZServer had.
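A sketch of that guard, again with made-up names and an illustrative limit rather than a real waitress setting:

MAX_QUEUED_REQUESTS = 4             # illustrative value, not a waitress option

class Channel:                      # hypothetical channel, as above
    def readable(self):
        # Keep watching the socket (and thus detecting disconnects) until
        # too many pipelined requests are already buffered; beyond that,
        # stop reading so a client cannot flood the server's memory.
        return len(self.queued_requests) < MAX_QUEUED_REQUESTS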
Just to drive the point home, and at the risk of preaching to the choir: What we saw frequently in our Zope installations is that when the client disconnects, our server continues churning on and producing results for no-one, to very detrimental effects. If we have an F5 fan out there, he can give our server a very hard time indeed. This is a way of producing a DoS at no cost at all for the attacking client. It happens frequently in real life, and the attacker is often not even aware that he is attacking… As such, I think of this scenario as a huge threat which we absolutely need to fix.
How often are you checkpointing that the client has disconnected? Waitress may be able to add a flag to the WSGI environment, but it cannot stop any processing on its own; you will have to manually check at certain points in your logic that the flag is set (or call the function, if we go the route @mmerickel suggested of adding a function into the WSGI environment).
Where is all that CPU time being spent? All of my applications return results in less than ~300 ms. Any longer than that and users will stop navigating the website.
Are you really taking seconds to render websites (thereby causing users to hard refresh the page because they think it is stuck)? Only then does this become something I could understand needing to protect against.
You can protect against this with a reverse proxy that does dog-pile caching: the first request fires and its result is cached, and when the client comes back before it has finished, it simply waits for that previous response instead of triggering the work again.
In ZServer/Medusa there also was no possibility for the server to stop any processing. We currently have only one crucial checkpoint in the application, before we send a query to the database.
The CPU time is being spent in database queries, not all of which we can optimize, because they are mostly dynamically created. The ~300 ms you talk about are common for the simpler pages we create, but we have many database search and modification queries which take significantly longer. Our average page creation time varies widely from installation to installation, with less than 1s in some, and more like 5s in others. But there also are outliers, which can easily take tens of seconds.
Some sites also use complex booking jobs involving interfaces to other systems, whose performance we cannot control. Still, the requirement is to perform the bookings live within our pages. In some cases we switch to polling / websockets notification and run the jobs in the background, but some of our code has spent some time in the field and does not use this kind of tech.
The users get a "please wait" notification, and know about the stuff going on behind the scenes, but still things can happen.
Dog pile caching sounds like a good idea to prevent the simplest of such situations, thanks for the hint! Still, I'm afraid there are some scenarios which we still need to protect against.
I created a draft PR implementing the idea outlined in this discussion, see #310.
I tested it in Zope with a handler that stores a flag in the REQUEST if it notices the client disconnecting, and a page that runs in a loop that aborts once it finds the flag or 10 seconds have passed.
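Roughly, that test view looks like the following sketch (the flag name is made up for illustration; the real handler just marks the request when it notices the disconnect):

import time

def loop_view(REQUEST):
    # Spin for up to 10 seconds, aborting as soon as the (hypothetical)
    # disconnect flag shows up in the request.
    deadline = time.time() + 10
    while time.time() < deadline:
        if REQUEST.get('_client_disconnected'):
            return 'aborted'
        time.sleep(0.1)             # stand-in for a slice of real work
    return 'finished'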
Then I pipelined a few requests, using
#!/usr/bin/python3
import socket
import time

# One pipelined request; the trailing blank line terminates the headers.
req = '''\
GET /loop HTTP/1.1
Host: localhost:8080
Connection: keep-alive

'''.replace('\n', '\r\n').encode('ascii')

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", 8080))
for i in range(4):
    s.sendall(req)
# Disconnect after five seconds, while the first request is still running.
time.sleep(5)
s.close()
Since up to 5 requests are read beyond the currently running one, this correctly aborts the running request after 5 seconds, with the CPU load sinking again. If I increase the number of requests that are sent, the first request is finished (so the CPU load goes up for 10 seconds) and afterwards it dies down.
We recently switched from Zope 2 to Zope 4, now using waitress as the web server, and have tried to look into how to re-implement a patch that we had made in Zope 2. The idea is that end users who are impatient and reload a page that takes too long to load, or close the page, often make the problem worse by putting more strain on the server: the first request is still being processed even though no one is still expecting the result. Our solution was to put a signal into the request object; some strategic places inside the application server checked for the signal and aborted if it was found (after all, it is not possible to kill threads).

Is there a way to accomplish something similar with waitress? I am not sure if this goes against the whole spirit of the WSGI approach, but maybe it is possible to change the environment after the fact and check for it inside the application? Or maybe even allow an application to insert a callable into the environment and call it when the client disconnects?

I am not even sure how to detect this situation in the approach that waitress is using. In principle, if a client disconnects, the socket becomes readable but returns a zero-length byte string when recv is called. However, once a request is completely transferred and the task is created, the corresponding socket seems to no longer be in the list of sockets that are checked for readability in the select loop. I thought I'd ask here because others might face the same problem and maybe the community is interested in creating a solution for it.
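The detection primitive described here, as a minimal sketch (it cannot distinguish a disconnect from pipelined data that happens to be pending, which is exactly the limitation discussed above):

import select
import socket

def client_has_disconnected(sock):
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False                # nothing pending, peer still connected
    # Peek one byte without consuming it: an empty result means the peer
    # performed an orderly close.
    return sock.recv(1, socket.MSG_PEEK) == b''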