djyotta closed this issue 7 years ago.
Can you try:
resp = yield from session.get(url, timeout=None)
As @hellysmile says, this is simply a case of changing the timeout (the default is 5 minutes).
For reference, here's an example which happily downloads a 10 GB file:
import asyncio
from time import time

import aiohttp


async def main():
    total_size = 0
    start = time()
    async with aiohttp.ClientSession() as session:
        async with session.get('http://speedtest.tele2.net/10GB.zip', timeout=None) as r:
            while True:
                chunk = await r.content.read(16144)
                if not chunk:
                    break
                total_size += len(chunk)
    print(f'{time() - start:0.2f}s, downloaded: {total_size / (1024 * 1024):0.0f}MB')


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Thanks. Looks like I didn't RTFM hard enough.
Timeout is not mentioned on this page: http://aiohttp.readthedocs.io/en/v3.0.1/client_advanced.html#client-session . read_timeout is mentioned here, but it's pretty hard to spot: http://aiohttp.readthedocs.io/en/v3.0.1/client_reference.html#aiohttp.ClientSession
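A minimal sketch of disabling the default session-wide, assuming the read_timeout keyword documented for ClientSession in aiohttp 3.0 (later releases replaced it with a ClientTimeout object):

import asyncio

import aiohttp


async def main():
    # read_timeout=None disables the 5-minute default for every request
    # made through this session, rather than per request via timeout=None.
    async with aiohttp.ClientSession(read_timeout=None) as session:
        async with session.get('http://speedtest.tele2.net/10GB.zip') as r:
            # Drain the body in chunks; read() returns b'' at EOF.
            while await r.content.read(16144):
                pass

asyncio.get_event_loop().run_until_complete(main())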
I don't see timeout documented at all. This timeout default needs to be covered in the introductory ClientSession docs, as it's annoying to have to Google and dig into the source code to learn that things are dying because of a built-in 5-minute default.
And yes, I spent the better part of a day trying to figure out why my aiohttp-based spider was dying on a handful of large files. Not fun.
I would actually recommend turning off transfer timeouts entirely by default: HTTP is used for so many things that it is not a great idea to assume a 5-minute default timeout is actually helpful.
Would you create a Pull Request for documentation improvement?
aiohttp should have a reasonable default timeout precisely because HTTP is used for so many things. Hanging for hours without any response to the user is even worse than an explicit timeout error.
Yes, I can do a pull request. I disagree with your perspective. Consider that a download of a 1 GB file can easily take more than 5 minutes. Having aiohttp abort these HTTP transfers in its default configuration is problematic. Clients request files from servers, and servers generally do a great job of ending connections. A client should continue to communicate with the server as long as the connection is active.
If there is going to be a default timeout, it should not be based on total connection time but on the time since the last packet was transmitted. Using total connection time is ineffective. Aborting after 30 seconds with no data transferred in either direction is a reasonable default; the current approach is not.
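To make the proposal concrete, here is a rough sketch of how an inactivity-based timeout can be approximated in user code by bounding each chunk read with asyncio.wait_for (the download helper and idle_timeout are illustrative names, not aiohttp API):

import asyncio

import aiohttp


async def download(url, idle_timeout=30):
    total = 0
    async with aiohttp.ClientSession() as session:
        # Disable the total-duration default; inactivity is policed below.
        async with session.get(url, timeout=None) as r:
            while True:
                # wait_for bounds each individual read, so the clock resets
                # whenever a chunk arrives: a timeout on inactivity rather
                # than on total transfer duration.
                chunk = await asyncio.wait_for(r.content.read(16144), idle_timeout)
                if not chunk:
                    break
                total += len(chunk)
    return total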
Completely disagree: 99.9% of requests which exceed 30s have gone wrong.
Better documentation and a descriptive error would be good, but the default shouldn't change.
Samuel, this isn't even a matter of opinion. Killing a completely valid and RFC-compliant HTTP connection by default just because you think that it's been running too long so it's "probably wrong" is idiotic. Who are you to make this determination about someone else's HTTP request? I mean no offense to you personally but the opinion you express is frankly worthy of ridicule.
@danielrobbins aiohttp has enough flexibility regarding timeouts. Did you have a chance to check the documentation, or are you just complaining? In any case, your tone is not very appropriate.
@fafhrd91 My intent is to be respectful while expressing very strong disagreement with the decision, which I believe to be foolish; I can mock the decision without mocking the person. There is no way for you to determine "tone" through electronic communication. Regarding the technical decision, shall we have two versions of aiohttp, an "idiot's edition" and a "professional edition"? In the idiot's edition, we assume the user is a moron and kill their HTTP requests after 5 minutes as a service to them, because they are probably just being stupid, and because most people are idiots. In the professional edition, we assume that the user actually wants the HTTP request to complete successfully. This may sound like a moronic idea, but it's essentially what upstream is doing. The requests module allows large transfers to complete successfully because it makes the smart decision to be RFC-compliant and not try to "protect" the user from imaginary error conditions.
I don't have any intention of continuing this communication. This is open source software; fork it and change it however you want.
Sure, I'm happy to fork and create patches for Gentoo and Funtoo. And again, no disrespect intended. If you are insulted by having someone think that your idea is bad, sorry, but you'll need to get used to it. People expect things to work in an RFC-compliant manner by default, particularly in Python modules that claim to implement a standard protocol.
Consider this critical but very real customer feedback. I am using aiohttp for non-trivial projects and do not like this default behavior at all. It caused me much frustration.
I get your point of view, but it's just a default. Easy enough to change.
Library decisions that constrain you and can't be overridden are very annoying, but it's hard to see why this default upsets you so much.
@samuelcolvin because as I mentioned, it caused me to waste most of a day trying to track it down. That is frustrating. My frustration is expressed in my very strong negative reaction to this decision. I think it is a poor default, as I am sure others will continue to run into this issue. While improved documentation can be considered a workaround, I just don't see the point of artificially terminating HTTP connections that you deem to have existed for too long. I don't buy the argument that most HTTP connections that have a duration over 5 minutes are invalid. It just seems extremely silly to me. Anyway, I have conveyed my point and now it's time to end this conversation.
Please note that this bug report may also be related to this unfortunate default: https://github.com/aio-libs/aiohttp/issues/2849
Other bug reports could potentially be related as well. While not confirmed, I would expect a steady stream of new bug reports ultimately traceable to this particular problem, since it's neither RFC-compliant nor expected behavior.
@danielrobbins please keep calm.
Your messages are very emotional and not respectful at all. Words like "smart", "foolish", and "idiotic" are not appropriate terminology.
If you want to vent your negative emotions, please do it outside of the tracker.
This place is not a free substitute for a psychoanalyst's office.
I don't want to continue this useless and emotionally heated conversation. If you want to discuss a technical question, please create a new issue.
@aio-libs/aiohttp-committers please don't hesitate to press the "Lock conversation" button if needed. A frozen thread can be unlocked easily, but there is no way to take back offensive words once they have been written and read.
Long story short
Long downloads (>5 GB) time out with asyncio.TimeoutError (concurrent.futures._base.TimeoutError).
Expected behaviour
Downloading arbitrarily large files should work.
Actual behaviour
Fails roughly 5 GB in with asyncio.TimeoutError.
Steps to reproduce
Try downloading a large file (>5 GB).
Your environment
Workaround
Use the requests library instead
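For example, a minimal streaming download with requests (the download helper name and chunk size are illustrative; requests applies no read timeout unless one is passed explicitly):

import requests


def download(url, path):
    # stream=True avoids holding the whole body in memory; iterating
    # pulls one chunk at a time until the server closes the transfer.
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=16144):
                f.write(chunk)

download('http://speedtest.tele2.net/10GB.zip', '10GB.zip')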