amnong / easywebdav

A WebDAV Client in Python
http://pypi.python.org/pypi/easywebdav/
ISC License

Call to download() results in zero sized files #9

Open jjohanson opened 11 years ago

jjohanson commented 11 years ago

Hello!

I have a problem downloading files with the latest version of easywebdav. My setup:

easywebdav 1.0.7
requests 1.1.0
webdav server: mod_dav, apache2
ubuntu 12.10

When I do a download (by calling the download() method), the file is created on the local machine with the correct name, but it is empty (zero size).

When looking at the code I can see response.raw is used as an input to shutil.copyfileobj(). However, I think that the documentation for requests states that 'stream=True' must be used in the call to session.request for .raw to be valid (http://docs.python-requests.org/en/latest/api/):

"raw = None File-like object representation of response (for advanced usage). Requires that 'stream=True' on the request."

I have tried setting stream=True (line 77 in client.py), and the received file then contains data! However, looking at the response headers I can see that my WebDAV server is gzipping the files, so the downloaded files have to be unzipped.

*** client-new.py   2013-02-20 08:27:29.941967542 +0100
--- client-org.py   2013-02-18 14:17:59.114443000 +0100
***************
*** 76,78 ****
          url = self._get_url(path)
!         response = self.session.request(method, url, allow_redirects=False, stream=True, **kwargs)
          if isinstance(expected_code, Number) and response.status_code != expected_code \
--- 76,78 ----
          url = self._get_url(path)
!         response = self.session.request(method, url, allow_redirects=False, **kwargs)
          if isinstance(expected_code, Number) and response.status_code != expected_code \

To get around this problem I have simply uncommented 'f.write(response.content)' on line 126 in client.py and commented out 'shutil.copyfileobj(response.raw, f)' on line 127.

As far as I can see, however, using response.raw with something like shutil.copyfileobj() is necessary to be able to download really large files, because response.content loads the whole body into memory first.
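One way to keep memory use flat without touching response.raw directly might be iter_content(), which in current requests versions yields the body in chunks and undoes gzip/deflate along the way. This is only a sketch, not the easywebdav code: the helper name `save_streamed_response` and the `chunk_size` default are made up here, and it assumes the response was requested with stream=True.

```python
def save_streamed_response(response, local_path, chunk_size=64 * 1024):
    """Write a streamed response body to local_path, one chunk at a time.

    `response` is assumed to be a requests.Response obtained with
    stream=True (hypothetical helper, not part of easywebdav).
    """
    bytes_written = 0
    with open(local_path, 'wb') as f:
        # iter_content() yields at most chunk_size bytes per iteration,
        # so the whole file is never held in memory at once.
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                f.write(chunk)
                bytes_written += len(chunk)
    return bytes_written
```

In download() this would replace the copyfileobj line, e.g. `save_streamed_response(response, local_path)`, provided _send() passes stream=True.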

jjohanson commented 11 years ago

Hello again,

I have looked some more at using 'response.raw'. As mentioned above, you may end up with gzipped files when setting 'stream=True' and using 'response.raw'. However, it turns out that the 'read()' method has an optional parameter 'decode_content' that can be set to True to have the response decoded (unzipped in this case):

response.raw.read(decode_content=True)

So the following code uses 'stream=True' and response.raw (with response.raw.read(decode_content=True) called in a loop to copy all data).

The documentation on .read() states that you cannot specify an amount to read and set 'decode_content=True' at the same time. So this still leaves the question of how to read (very) large files. I have tried reading files up to about 600MB and I can see that the file is transferred in a single call to .read() (the while loop body is only executed once). So, how to transfer really large files?
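One possible way around the single huge read, separate from the patch below, is to read the still-compressed bytes in fixed-size pieces and decompress them incrementally with the standard zlib module. This is a rough sketch under assumptions: the function name `copy_gzip_stream` is made up, `raw` is any file-like object whose read(n) returns undecoded bytes, and the body really is gzip-encoded (Content-Encoding: gzip).

```python
import zlib


def copy_gzip_stream(raw, out_file, chunk_size=64 * 1024):
    """Copy a gzip-compressed file-like object to out_file, decompressing on the fly.

    Hypothetical helper; only valid when the body is gzip-encoded.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        chunk = raw.read(chunk_size)  # compressed bytes, at most chunk_size
        if not chunk:
            break
        out_file.write(decompressor.decompress(chunk))
    # Emit anything still buffered inside the decompressor.
    out_file.write(decompressor.flush())
```

With this, the loop in download() could become `copy_gzip_stream(response.raw, f)`, but only after checking the Content-Encoding header, since an uncompressed body would make zlib raise an error.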

Jørgen

--- /home/jojo/Downloads/easywebdav-1.0.7/easywebdav/client.py  2012-11-13 23:31:47.000000000 +0100
+++ /usr/local/lib/python2.7/dist-packages/easywebdav-1.0.7-py2.7.egg/easywebdav/client.py  2013-02-24 09:51:50.638723255 +0100
@@ -74,7 +74,7 @@
             self.session.auth = (username, password)
     def _send(self, method, path, expected_code, **kwargs):
         url = self._get_url(path)
-        response = self.session.request(method, url, allow_redirects=False, **kwargs)
+        response = self.session.request(method, url, allow_redirects=False, stream=True, **kwargs)
         if isinstance(expected_code, Number) and response.status_code != expected_code \
             or not isinstance(expected_code, Number) and response.status_code not in expected_code:
             raise OperationFailed(method, path, expected_code, response.status_code)
@@ -124,7 +124,12 @@
         response = self._send('GET', remote_path, 200)
         with open(local_path, 'wb') as f:
             #f.write(response.content)
-            shutil.copyfileobj(response.raw, f)
+            #shutil.copyfileobj(response.raw, f)
+            line = response.raw.read(decode_content=True)
+            while line:
+                f.write(line)
+                line = response.raw.read(decode_content=True)
+
     def ls(self, remote_path='.'):
         headers = {'Depth': '1'}
         response = self._send('PROPFIND', remote_path, (207, 301), headers=headers)

read(): http://urllib3.readthedocs.org/en/latest/helpers.html#module-urllib3.response

http://docs.python-requests.org/en/latest/user/advanced/ http://docs.python-requests.org/en/latest/api/

amnong commented 10 years ago

I'm going to release v1.0.8 after a really long while - please let me know if the problem persists with the new version.

blootsvoets commented 9 years ago

Works for me with requests==2.7.0 easywebdav==1.2.0 python==2.7.10

Jay54520 commented 7 years ago

This worked for me:

            response.raw.decode_content = True
            shutil.copyfileobj(response.raw, out_file)

because

 the response.raw file-like object will not, by default, decode compressed responses (with GZIP or deflate). You can force it to decompress for you anyway by setting the decode_content attribute to True

ref: https://www.codementor.io/tips/3443978201/how-to-download-image-using-requests-in-python
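Put together as a complete download helper, the approach could look roughly like the sketch below. It is not the easywebdav implementation: the function name `download` and its parameters are illustrative, and `session` is assumed to be a requests.Session (or anything with a compatible get()).

```python
import shutil


def download(session, url, local_path):
    """Stream url to local_path using a requests-style session (illustrative only)."""
    # stream=True defers reading the body so that response.raw is usable.
    response = session.get(url, stream=True)
    response.raise_for_status()
    # Have the raw stream undo gzip/deflate while it is read.
    response.raw.decode_content = True
    with open(local_path, 'wb') as out_file:
        # copyfileobj reads in fixed-size blocks, so memory use stays small.
        shutil.copyfileobj(response.raw, out_file)
```

Because copyfileobj calls read() with a block size, this also covers the large-file concern raised earlier in the thread.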