ross / requests-futures

Asynchronous Python HTTP Requests for Humans using Futures

Flow control can, in some rare cases, not return from a child process to parent #35

Closed asfaltboy closed 2 years ago

asfaltboy commented 8 years ago

We've been using ProcessPoolExecutor (from #31) in production for a while now. It seems that in some rare cases we "lose control" of a child process in the pool, which makes the whole batch of requests unusable from that moment on, i.e. flow control never returns to the parent.

This may be caused by exceptions not being raised properly in the child (possibly requests.exceptions.Timeout), or by something else that makes the child hang. It seems a good workaround would be to call Future.result(timeout=SANE_TIMEOUT_VALUE) so that even in such cases control would still be returned to the parent. I'm not sure there is much requests-futures can offer here. I'll try the workaround and, if it works, close the issue for now, so it is at least logged for reference.
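A minimal sketch of that workaround, using only the standard library. It assumes a ThreadPoolExecutor as a stand-in for the process pool (the Future API is the same), and slow_task, result_or_none and the timeout value are hypothetical names chosen for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

SANE_TIMEOUT_VALUE = 0.2  # seconds; hypothetical value, tune for real workloads


def slow_task(delay):
    """Stand-in for a request that takes `delay` seconds to complete."""
    time.sleep(delay)
    return delay


def result_or_none(future, timeout):
    """Return the future's result, or None if it is not ready within `timeout`."""
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # The caller gets control back, but the worker is NOT stopped:
        # result(timeout=...) only bounds how long the caller waits.
        return None


with ThreadPoolExecutor(max_workers=2) as pool:
    fast = pool.submit(slow_task, 0.0)
    stuck = pool.submit(slow_task, 1.0)
    fast_result = result_or_none(fast, SANE_TIMEOUT_VALUE)
    stuck_result = result_or_none(stuck, SANE_TIMEOUT_VALUE)
    # Leaving the `with` block still waits for the slow worker to finish.
```

Note that the timeout only returns control to the caller. A worker that is truly hung stays hung, and the pool cannot shut down cleanly until it finishes.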

ross commented 8 years ago

Interesting. Happy to take a :eye: at a PR that sets a timeout, but it might get a little complicated to make sure that plays well with requests timeouts. Let me know what you find.

asfaltboy commented 8 years ago

@ross thanks for the feedback. I used requests.get(timeout=X) and Future.result(timeout=X + GRACE) in case requests' exception in child is thrown after a delay.
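A rough sketch of how the two timeouts might be related, assuming the documented requests behaviour that a single timeout number applies to both the connect and the read phase, and that a (connect, read) tuple sets them separately. The helper name result_timeout and the values are hypothetical:

```python
GRACE = 2.0  # seconds; hypothetical slack for the exception to reach the parent


def result_timeout(request_timeout, grace=GRACE):
    """Estimate a lower bound for Future.result(timeout=...).

    `request_timeout` follows the requests convention: either one number
    (used for both connect and read) or a (connect, read) tuple.
    """
    if isinstance(request_timeout, (int, float)):
        connect = read = float(request_timeout)
    else:
        connect, read = request_timeout
    # The read timeout limits the wait between bytes, not the whole
    # download, so a slow but steady response can still take longer.
    return connect + read + grace


single = result_timeout(5)            # one number covers both phases
split = result_timeout((3.5, 10.0))   # separate connect and read limits
```

Under these assumptions, a single requests timeout of X can already take up to 2X before raising, so X + GRACE may be too short for the parent-side wait.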

Unfortunately, my plan failed, due to my lack of understanding of the problem. Even with timeout limits set on the child processes, a child still "is suspended in time", and my logs show odd behaviour: the parent continues, while the pool is still not cleared 10 minutes after the due finish time. ps shows the child processes in the D state - "Uninterruptible sleep (usually IO)". There must be something I'm missing.

I plan to simplify our usage by moving most callback/result handling to after all requests are done. This should give a good indication of where things are broken.

One other interesting thing I saw, is when I stop a test by raising KeyboardInterrupt, the exception shows this trace:

  File "my_code", line 594, in some_method
    response = res.result(timeout=CHILD_PROCESS_TIMEOUT)
  File "/home/pavel/.pyenv/versions/3.5.1/lib/python3.5/concurrent/futures/_base.py", line 400, in result
    self._condition.wait(timeout)
  File "/home/pavel/.pyenv/versions/3.5.1/lib/python3.5/threading.py", line 297, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt

I wonder if concurrent.futures runs a separate thread to manage the process pool 😕

waypointsoftware commented 8 years ago

I have this same issue. Some of my requests are generating gateway timeouts. These exceptions are caught somewhere inside futures, and control is not returned to the calling application correctly. That means that in this pseudo code:

  for a in assignments:
      response = a.send_request(...)
      responses.append(response)
  wait(responses)

The "wait(responses)" never gets hit.

Is there some other way to set up exception handlers for requests?
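For reference, a minimal sketch of where exceptions go with plain concurrent.futures: an exception raised in a worker is stored on its Future and only surfaces through result() or exception(). The GatewayTimeout class and send_request function are hypothetical stand-ins, not part of requests or requests-futures, and a thread pool is assumed for simplicity:

```python
from concurrent.futures import ThreadPoolExecutor, wait


class GatewayTimeout(Exception):
    """Hypothetical stand-in for a gateway timeout failure."""


def send_request(assignment):
    """Hypothetical request function; fails for one specific input."""
    if assignment == "bad":
        raise GatewayTimeout(assignment)
    return "ok:" + assignment


assignments = ["a", "bad", "b"]
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(send_request, a) for a in assignments]
    # Bounded wait: returns after 5 seconds even if some futures are pending.
    done, not_done = wait(futures, timeout=5)

outcomes = []
for assignment, future in zip(assignments, futures):
    if future in not_done:
        outcomes.append((assignment, "pending"))
    elif future.exception() is not None:
        # The worker's exception was captured here rather than propagated.
        outcomes.append((assignment, type(future.exception()).__name__))
    else:
        outcomes.append((assignment, future.result()))
```

In this sketch wait() returns normally even though one task failed, which suggests that a wait() that is never reached points at a blocking call earlier in the loop rather than at a swallowed exception.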

ross commented 8 years ago

Sorry @waypointsoftware whatever is happening isn't likely within requests-futures's domain to handle. It just provides a framework that uses futures. The process management etc is external to it. If you can provide a simplified failing test case I can try and take a look and help figure out what's up, but otherwise I won't be able to provide much help.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.