ross / requests-futures

Asynchronous Python HTTP Requests for Humans using Futures

Problem with data field in header request #108

Closed David-DE-001 closed 3 years ago

David-DE-001 commented 3 years ago

I have a problem with the code below.

The sample code worked well, but with data in the request's body the number jumps around randomly:

# !pip install requests_futures
from concurrent.futures import as_completed
from pprint import pprint
from requests_futures.sessions import FuturesSession

session = FuturesSession()
data = {
    'code' : 'abc123'
}
futures=[]
for i in range(10):
    code = '{0:05}'.format(i)
    data['code'] = 'abc{}'.format(code)
    future = session.get(f'http://httpbin.org/delay/{i}',data=data)
    future.i = i
    futures.append(future)

for future in as_completed(futures):
    resp = future.result()
    pprint({
        'i': future.i,
        'code': resp.request.body,
        # 'content': resp.json(),
        'url': resp.url
    })

The result we expect is abc00000, abc00001, ... abc00009, matching the URL, but the result is different. I know it is caused by the program being concurrent, but I don't know how to debug it. Can anyone help me?

{'code': 'code=abc00000', 'i': 0, 'url': 'http://httpbin.org/delay/0'}
{'code': 'code=abc00002', 'i': 1, 'url': 'http://httpbin.org/delay/1'}
{'code': 'code=abc00004', 'i': 2, 'url': 'http://httpbin.org/delay/2'}
{'code': 'code=abc00003', 'i': 3, 'url': 'http://httpbin.org/delay/3'}
{'code': 'code=abc00004', 'i': 4, 'url': 'http://httpbin.org/delay/4'}
{'code': 'code=abc00009', 'i': 5, 'url': 'http://httpbin.org/delay/5'}
{'code': 'code=abc00007', 'i': 6, 'url': 'http://httpbin.org/delay/6'}
{'code': 'code=abc00007', 'i': 7, 'url': 'http://httpbin.org/delay/7'}
{'code': 'code=abc00009', 'i': 8, 'url': 'http://httpbin.org/delay/8'}
{'code': 'code=abc00009', 'i': 9, 'url': 'http://httpbin.org/delay/9'}

If I update the code to: futures = [session.get(f'http://httpbin.org/delay/{i}', data={'code': 'abc{0:05}'.format(i)}) for i in range(10)] it works fine. I posted it on Stack Overflow as well: https://stackoverflow.com/questions/68195535/issue-with-adding-headers-and-data-for-requests-futures-get-in-python

ross commented 3 years ago

Thanks for the runnable snippet! That always makes tracking down what's up way easier.

The bit that's tripping things up is that a single data dict, defined before the for loop, is being reused across all the calls to get. So as the loop changes the value of code, the async requests happening in the background see the changes before they manage to get things sent off. In my test run of the original code, 2 requests were sent while the code was at 7 and the rest were sent while it was at 9.
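To make the sharing visible without any networking, here is a minimal sketch using only plain Python. The handed_over list is a made-up stand-in for the arguments queued up for the background requests; it is not how requests-futures stores them internally.

```python
data = {'code': 'abc123'}
handed_over = []  # hypothetical stand-in for arguments queued for background requests

for i in range(3):
    data['code'] = 'abc{0:05}'.format(i)
    handed_over.append(data)  # the same dict object every time, not a copy

same_object = all(item is data for item in handed_over)
codes = [item['code'] for item in handed_over]
print(same_object)  # True
print(codes)        # ['abc00002', 'abc00002', 'abc00002']
```

Every entry is the one dict, so anything that reads it after the loop has moved on sees the latest value rather than the value at the time of the call.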

(env) coho:tmp ross$ python issue.py
{'code': 'code=abc00007', 'i': 0, 'url': 'http://httpbin.org/delay/0'}
{'code': 'code=abc00007', 'i': 1, 'url': 'http://httpbin.org/delay/1'}
{'code': 'code=abc00009', 'i': 2, 'url': 'http://httpbin.org/delay/2'}
{'code': 'code=abc00009', 'i': 3, 'url': 'http://httpbin.org/delay/3'}
{'code': 'code=abc00009', 'i': 4, 'url': 'http://httpbin.org/delay/4'}
{'code': 'code=abc00009', 'i': 5, 'url': 'http://httpbin.org/delay/5'}
{'code': 'code=abc00009', 'i': 6, 'url': 'http://httpbin.org/delay/6'}
{'code': 'code=abc00009', 'i': 7, 'url': 'http://httpbin.org/delay/7'}
{'code': 'code=abc00009', 'i': 8, 'url': 'http://httpbin.org/delay/8'}
{'code': 'code=abc00009', 'i': 9, 'url': 'http://httpbin.org/delay/9'}

If, instead of sharing/updating data, I pass a new data dict to each get, the expected behavior happens. You can think of the data dict as being passed by reference, so any change made to it updates the one object that all the references point at. The change below still works the same, ref-wise, but it's a new object each time, so the values (code) don't get changed/updated.

...
    data = {
        'code': 'abc{}'.format(code),
    }
    future = session.get(f'http://httpbin.org/delay/{i}',data=data)
...

(env) coho:tmp ross$ python issue-tweak.py
{'code': 'code=abc00000', 'i': 0, 'url': 'http://httpbin.org/delay/0'}
{'code': 'code=abc00001', 'i': 1, 'url': 'http://httpbin.org/delay/1'}
{'code': 'code=abc00002', 'i': 2, 'url': 'http://httpbin.org/delay/2'}
{'code': 'code=abc00003', 'i': 3, 'url': 'http://httpbin.org/delay/3'}
{'code': 'code=abc00004', 'i': 4, 'url': 'http://httpbin.org/delay/4'}
{'code': 'code=abc00005', 'i': 5, 'url': 'http://httpbin.org/delay/5'}
{'code': 'code=abc00006', 'i': 6, 'url': 'http://httpbin.org/delay/6'}
{'code': 'code=abc00007', 'i': 7, 'url': 'http://httpbin.org/delay/7'}
{'code': 'code=abc00008', 'i': 8, 'url': 'http://httpbin.org/delay/8'}
{'code': 'code=abc00009', 'i': 9, 'url': 'http://httpbin.org/delay/9'}
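If some fields really are common to every request, one option is to keep a base dict and build a fresh copy per call. This is only a sketch: the 'source' field is invented for illustration, and each resulting payload would be what gets passed as data= to session.get.

```python
base = {'code': 'abc123', 'source': 'example'}  # 'source' is a made-up extra field

payloads = []
for i in range(3):
    # dict(base, code=...) builds a new dict; base itself is left untouched.
    payloads.append(dict(base, code='abc{0:05}'.format(i)))

codes = [p['code'] for p in payloads]
print(codes)         # ['abc00000', 'abc00001', 'abc00002']
print(base['code'])  # abc123
```

Note that this is a shallow copy, so nested mutable values (lists, dicts) inside base would still be shared; copy.deepcopy is the usual choice if those get modified too.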

Fwiw, this isn't specific to requests-futures. The same thing could happen if there were background workers just printing out their parameters in the background.
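A rough sketch of that, using only the standard library. The worker function and the gate event are made up for the demo; the gate just forces the workers to read their parameters after the loop has finished, which makes the effect reproducible instead of timing-dependent.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def worker(params, gate):
    # Hypothetical background worker: it only reads its parameters,
    # but it does so after the submitting loop has moved on.
    gate.wait()
    return params['code']

def run(share_dict):
    gate = threading.Event()
    data = {'code': 'abc123'}
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = []
        for i in range(3):
            code = 'abc{0:05}'.format(i)
            if share_dict:
                data['code'] = code      # mutate the one shared dict
                params = data
            else:
                params = {'code': code}  # new dict per submission
            futures.append(pool.submit(worker, params, gate))
        gate.set()  # let the workers read only after the loop is done
        return [f.result() for f in futures]

shared_result = run(share_dict=True)
fresh_result = run(share_dict=False)
print(shared_result)  # ['abc00002', 'abc00002', 'abc00002']
print(fresh_result)   # ['abc00000', 'abc00001', 'abc00002']
```

In real code there is no gate, so how many workers see a stale or newer value depends on timing, which is why the output differs from run to run.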