tornadoweb / tornado

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
http://www.tornadoweb.org/
Apache License 2.0
21.78k stars 5.51k forks source link

CurlAsyncHTTPClient should warn when max_clients exceeded #2127

Open ssb22 opened 7 years ago

ssb22 commented 7 years ago

max_clients in CurlAsyncHTTPClient defaults to 10 (which seems low for high-load applications), and there is no option to warn when its _process_queue can't fit all self._requests into self._free_list. Surplus requests are queued—but deadlock can occur if we're fetching from a back-end server which will not respond to the requests in progress until the queued requests are sent.

In my application, I needed to do some processing both downstream and upstream of a legacy proxy which I must treat as a "black box". So when a request R0 comes in from the client, I do things to it and then send a request R1 to the black box. Then the black box makes a request R2 back to me on a different port, and I do things to that and send a request R3 elsewhere. When I get the response from R3, I can send the reply for the black box's request R2, and that will cause the black box to reply to my R1 request and finally I can reply to R0. Notice that I won't get the response from R1 until I've finished servicing R2, which I can't do until I've sent and handled R3. So R1 depends on R3, so if R3 is put into a queue waiting for R1 to finish, I'm in trouble. Of course there are many ways I can work around this problem: I can set a larger value of max_clients to decrease the chances of that queue having to come into play, or I can run a completely different Tornado process for the other port, or something. But the issue was I had a deadlock and (for a few hours) no idea why. If there were some way of turning off the queue and raising an exception if I overload _free_list, or at least logging a warning, that would have saved some debugging. Thanks.

bdarnell commented 7 years ago

The default is small because "high-load applications" are expected to be running in many processes, each of which gets their own copy of the limit, and because running with a limit that's too high makes it easy to run into automated abuse-detection systems (or just generally be a bad citizen). This should perhaps be made more visible, but the fact that you have to opt in to sending a lot of concurrent requests is by design. I think it would be too noisy to log every time a request is queued, but maybe a log the first time it happens in a process would be OK.

Note that increasing the limit makes the problem less likely to occur, but cannot eliminate this deadlock if you have a dependency cycle like this (and in CurlAsyncHTTPClient you can't just set the limit to a billion to have an effectively unlimited budget, since it preallocates stuff for each "client"). Instead, the best practice would be to test your application with max_clients=1 and make sure it can work in this mode. In this case, I think that would mean creating multiple AsyncHTTPClients in each process (using force_instance=True) and using different instances for different purposes.