OneDrive / onedrive-sdk-python

OneDrive SDK for Python! https://dev.onedrive.com
MIT License

download_async, "yield from wasn't used with future", and errors when downloading large files #161

Open arty-hlr opened 5 years ago

arty-hlr commented 5 years ago

Hi,

I'm trying to download files from my OneDrive, and at times I get this:

  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 397, in _error_catcher
    yield
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 479, in read
    data = self._fp.read(amt)
  File "/usr/lib/python3.6/http/client.py", line 449, in read
    n = self.readinto(b)
  File "/usr/lib/python3.6/http/client.py", line 493, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.6/ssl.py", line 1012, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.6/ssl.py", line 874, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib/python3.6/ssl.py", line 631, in read
    v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 750, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 531, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 496, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 415, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "download.py", line 113, in <module>
    main()
  File "download.py", line 108, in main
    download_all(client)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  [Previous line repeated 1 more time]
  File "download.py", line 91, in download_all
    client.item(drive='me', id=item.id).download(filename)
  File "/usr/local/lib/python3.6/dist-packages/onedrivesdk/request/item_request_builder.py", line 138, in download
    return self.content.request().download(local_path)
  File "/usr/local/lib/python3.6/dist-packages/onedrivesdk/request/item_content_request.py", line 88, in download
    self.download_item(content_local_path)
  File "/usr/local/lib/python3.6/dist-packages/onedrivesdk/request_base.py", line 188, in download_item
    path)
  File "/usr/local/lib/python3.6/dist-packages/onedrivesdk/http_provider.py", line 94, in download
    for chunk in response.iter_content(chunk_size=1024):
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 753, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

I saw the same error in other issues about uploads, where the advice was to use upload_async instead of upload.

I tried the same with download_async. Simply changing client.item(drive='me', id=item.id).download(filename) to client.item(drive='me', id=item.id).download_async(filename) runs through without downloading anything. Changing it to returned = list(client.item(drive='me', id=item.id).download_async(filename)) (using to_dict() as mentioned in #51 doesn't work) gives me this:

Traceback (most recent call last):
  File "download.py", line 113, in <module>
    main()
  File "download.py", line 108, in main
    download_all(client)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 91, in download_all
    returned = list(client.item(drive='me', id=item.id).download_async(filename))
  File "/usr/local/lib/python3.6/dist-packages/onedrivesdk/request/item_request_builder.py", line 148, in download_async
    entity = yield from self.content.request().download_async(local_path)
  File "/usr/local/lib/python3.6/dist-packages/onedrivesdk/request/item_content_request.py", line 102, in download_async
    yield from future
AssertionError: yield from wasn't used with future

which I can't really make sense of.

Are there any other methods to download large files, or am I using download_async wrong? Weirdly, I saw a lot of issues about upload_async, but none about download_async.

Thanks in advance for the help!

arty-hlr commented 5 years ago

Even weirder, running the same list(client.item(drive='me', id=item.id).download_async(filename)) on another laptop, I now get:

  File "download.py", line 113, in <module>
    main()
  File "download.py", line 108, in main
    download_all(client)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 77, in download_all
    download_all(client, item.id)
  File "download.py", line 91, in download_all
    returned = list(client.item(drive='me', id=item.id).download_async(filename))
  File "/usr/lib/python3.7/site-packages/onedrivesdk/request/item_request_builder.py", line 148, in download_async
    entity = yield from self.content.request().download_async(local_path)
  File "/usr/lib/python3.7/site-packages/onedrivesdk/request/item_content_request.py", line 102, in download_async
    yield from future
RuntimeError: await wasn't used with future

which is the same kind of error, but now it does download for a few seconds while giving that error and then crashes!

KTibow commented 4 years ago

Can you change your title to: "await wasn't used with future" and "yield from wasn't used with future" errors when downloading large files with download_async? It's a lot clearer and I think you might get a bit more help. Also, on the second one: basically it's saying that the future was stepped through directly (here by list()) instead of being awaited from inside a running event loop. The "yield from wasn't used with future" assertion on Python 3.6 means the same thing.
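To illustrate what that error message means, here is a minimal sketch that uses only the standard library, not the SDK. The names wait_for, drive_without_loop and drive_with_loop are made up for this example. It steps a coroutine by hand, which is roughly what list() does to the generator returned by download_async, and then runs the same coroutine through an event loop.

```python
import asyncio


async def wait_for(future):
    # Plays the role of "yield from future" in a generator-based coroutine.
    return await future


def drive_without_loop():
    """Step the coroutine by hand, the way list() steps a generator."""
    loop = asyncio.new_event_loop()
    future = loop.create_future()
    coro = wait_for(future)
    try:
        coro.send(None)  # first step: the pending future hands itself back
        coro.send(None)  # second step: nobody waited for the future to finish
    except (RuntimeError, AssertionError) as exc:
        # RuntimeError on Python 3.7+, AssertionError on Python 3.6
        return str(exc)
    finally:
        coro.close()
        loop.close()
    return None


def drive_with_loop():
    """Let the event loop drive the coroutine until the future is resolved."""
    loop = asyncio.new_event_loop()
    try:
        future = loop.create_future()
        loop.call_soon(future.set_result, "done")
        return loop.run_until_complete(wait_for(future))
    finally:
        loop.close()


if __name__ == "__main__":
    print(drive_without_loop())
    print(drive_with_loop())
```

If this reading is right, the fix is to hand the object returned by download_async to loop.run_until_complete (or await it from another coroutine) rather than wrapping it in list().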

KTibow commented 4 years ago

Looking at the examples, it is downloading, but I think it's doing it in the background. Have you tried something like the async example?

import asyncio

@asyncio.coroutine
def run_gets(client):
    coroutines = [client.drive('me').request().get_async() for i in range(3)]
    for future in asyncio.as_completed(coroutines):
        drive = yield from future
        print(drive.id)

loop = asyncio.get_event_loop()
loop.run_until_complete(run_gets(client))   

When you called download() I think it automatically sensed it was a large file and switched to download_async(). This code example may or may not work:

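import asyncio

@asyncio.coroutine
def run_download(client, item_id, filename):
    # download_async returns a coroutine; as_completed wraps it in a future
    coroutines = [client.item(drive='me', id=item_id).download_async(filename)]
    for future in asyncio.as_completed(coroutines):
        yield from future  # actually wait for the download to finish
        print("Completed.")

# item and filename come from the loop in your own download_all()
loop = asyncio.get_event_loop()
loop.run_until_complete(run_download(client, item.id, filename))

KTibow commented 4 years ago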

Okay, this works:

import asyncio
# Make sure you've authenticated
def download(client, drivename, localname):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(client.item(drive='me', path=drivename).download_async(localname))

download(client, "largefile.whatever", "downloadedfile.whatever")

I haven't tested it on a large file yet but it worked just fine on my test file.
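For the original "Connection reset by peer" failure on long downloads, one thing not tried in this thread is simply retrying the call. The sketch below is an assumption, not part of the SDK: download_with_retry, attempts, base_delay and sleep are hypothetical names, and the helper only wraps whatever callable you give it. It catches OSError, which covers ConnectionResetError and also requests' ChunkedEncodingError (requests' exceptions derive from IOError). A retry starts the download from scratch; it does not resume a partial file.

```python
import time


def download_with_retry(download, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call download() until it succeeds or the attempts are used up.

    download: zero-argument callable that performs one download attempt.
    sleep: injectable so the waiting can be skipped or recorded in tests.
    """
    for attempt in range(1, attempts + 1):
        try:
            return download()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With the synchronous call from the first post, usage would look something like download_with_retry(lambda: client.item(drive='me', id=item.id).download(filename)). I haven't run this against OneDrive, so treat it as a starting point.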