ross / requests-futures

Asynchronous Python HTTP Requests for Humans using Futures

Attribute set via response hooks does not work with ProcessPoolExecutor #137

Closed nickryand closed 1 year ago

nickryand commented 1 year ago

I have not been able to debug this deeply yet, but I wanted to drop it here in case other people run into the same issue.

I took your example from #105 and set the executor to ProcessPoolExecutor:

#!/usr/bin/env python

from requests_futures.sessions import FuturesSession
from concurrent.futures import ProcessPoolExecutor
from pprint import pprint

def get_translation(resp, *args, **kwargs):
    resp.translation = 42

session = FuturesSession(executor=ProcessPoolExecutor(max_workers=2))

session.hooks["response"].append(get_translation)

segmented = {
    0: "zero",
    1: "one",
    2: "two",
}
requests = {}
for idx, word in segmented.items():
    future = session.get(f"https://nghttp2.org/httpbin/?idx={idx}")
    requests[idx] = future
pprint(requests)

for idx, response in requests.items():
    requests[idx] = response.result()
pprint(requests)

for idx, response in requests.items():
    pprint([idx, response.translation])

The result is:

python tests/example.py
{0: <Future at 0x7fffe8d615d0 state=running>,
 1: <Future at 0x7fffe8d62950 state=pending>,
 2: <Future at 0x7fffe8d62bd0 state=running>}
{0: <Response [200]>, 1: <Response [200]>, 2: <Response [200]>}
Traceback (most recent call last):
  File "/home/tests/example.py", line 32, in <module>
    pprint([idx, response.translation])
                 ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Response' object has no attribute 'translation'

ThreadPoolExecutor works without issue.

ross commented 1 year ago

For the most part, ProcessPoolExecutor is caveat emptor: passing Python objects between processes can be tricky, and a lot of the problems people run into can't be addressed by requests_futures because they are actually problems in the underlying libraries.

My only thought would be to add some debugging prints in get_translation to see if it's even getting called. It's possible requests session hooks don't make it over into the other process correctly.
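A minimal sketch of that check, for anyone who wants to try it: print the process ID from inside the hook and compare it with the parent's. The PID comparison and `flush=True` are my additions, not part of the original example.

```python
import os


def get_translation(resp, *args, **kwargs):
    # Debug print: shows whether the hook runs at all, and in which process.
    # With ProcessPoolExecutor the PID is expected to differ from the parent's.
    url = getattr(resp, "url", None)
    print(f"hook called in pid={os.getpid()} url={url}", flush=True)
    resp.translation = 42


print(f"parent pid={os.getpid()}")
```

If the hook does print, and from a different PID than the parent, then the hook is being called and the attribute is getting lost somewhere on the way back to the parent process.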

nickryand commented 1 year ago

I think I have tracked this down to pickle behavior. Pickle calls a pair of methods that get and set object state during the marshal and unmarshal process. My guess is that the requests Response object does not handle that case for attributes added by a hook.

I'll dig more. I don't know if there is a good fix for this if you are trying to keep support for older versions of Python.
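To illustrate the pickle behavior described above, here is a stdlib-only sketch. `FakeResponse` is a hypothetical stand-in, not the real requests class. It assumes the real Response works the same way, that is, it defines `__getstate__` over a fixed list of attribute names; I believe it does, but I have not verified this against every requests version.

```python
import pickle


class FakeResponse:
    # Hypothetical stand-in for requests.Response.
    # Only the attributes named here survive pickling.
    __attrs__ = ["status_code", "url"]

    def __init__(self, status_code, url):
        self.status_code = status_code
        self.url = url

    def __getstate__(self):
        # Called by pickle when serializing: returns only the listed attributes.
        return {attr: getattr(self, attr, None) for attr in self.__attrs__}

    def __setstate__(self, state):
        # Called by pickle when deserializing: restores only what was saved.
        for name, value in state.items():
            setattr(self, name, value)


resp = FakeResponse(200, "https://nghttp2.org/httpbin/")
resp.translation = 42  # what the response hook does inside the worker

# Roughly what happens when the result crosses the process boundary.
restored = pickle.loads(pickle.dumps(resp))

print(restored.status_code)              # 200
print(hasattr(restored, "translation"))  # False
```

With ThreadPoolExecutor there is no pickling step, since the caller gets the very same object the hook modified. That would explain why the attribute survives there.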

ross commented 1 year ago

Ah, yeah, that's exactly the sort of problem that's been run into a number of times over the years. It would have to be fixed in requests, as I don't really have the ability to make its objects pickle-able without some really nasty monkey patching that I wouldn't even consider doing.
