tomgross / pcloud

A Python implementation of the pCloud API
MIT License

copy_file doesn't work on large files #16

Closed blasterspike closed 4 years ago

blasterspike commented 5 years ago

I want to copy a ~12 MB file to pCloud. To do so, I'm using the following code, which relies on pyfilesystem2:

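import fs
import fs.copy  # import the submodule explicitly so fs.copy.copy_file is available
import urllib.parse  # a bare "import urllib" does not guarantee urllib.parse is loaded
from pcloud import PyCloud
import requests
import logging
import http.client as http_client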

http_client.HTTPConnection.debuglevel = 1

# You must initialize logging, otherwise you'll not see debug output.
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True

with fs.open_fs('pcloud://{username}:{password}@/'.format(
        username=urllib.parse.quote_plus('***'),
        password=urllib.parse.quote_plus('***'))) as pcloud_fs:
    with fs.opener.open_fs('/') as linux_fs:
        fs.copy.copy_file(src_fs=linux_fs,
                          src_path='/var/lib/fail2ban/fail2ban.sqlite3',
                          dst_fs=pcloud_fs,
                          dst_path='/fail2ban.sqlite3')
        print('Completed')

This is what I'm getting back

DEBUG:pycloud:Doing request to https://api.pcloud.com/getdigest
DEBUG:pycloud:Params: {}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.pcloud.com:443
send: b'GET /getdigest HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:33:19 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 138
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /getdigest HTTP/1.1" 200 138
DEBUG:pycloud:Doing request to https://api.pcloud.com/userinfo
DEBUG:pycloud:Params: {'getauth': 1, 'logout': 1, 'username': '***', 'digest': '***', 'passworddigest': '***'}
send: b'GET /userinfo?getauth=1&logout=1&username=***&digest=***&passworddigest=*** HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:33:19 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 744
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /userinfo?getauth=1&logout=1&username=***&digest=***&passworddigest=*** HTTP/1.1" 200 744
DEBUG:pycloud:Doing request to https://api.pcloud.com/file_open
DEBUG:pycloud:Params: {'auth': '***', 'path': '/fail2ban.sqlite3', 'flags': 64}
send: b'GET /file_open?auth=***&path=%2Ffail2ban.sqlite3&flags=64 HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:33:20 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 50
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /file_open?auth=***&path=%2Ffail2ban.sqlite3&flags=64 HTTP/1.1" 200 50
send: b'POST /file_write HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 2097594\r\nContent-Type: multipart/form-data; boundary=***\r\n\r\n'
send: b'--***\r\nContent-Disposition: form-data; name="fd"\r\n\r\n1\r\n--***\r\nContent-Disposition: form-data; name="data"\r\n\r\nSQLite format 3\x00\x10\x00\x01\x01\x00@  *****'
DEBUG:pycloud:Doing request to https://api.pcloud.com/file_close
DEBUG:pycloud:Params: {'auth': '***', 'fd': 1}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): api.pcloud.com:443
send: b'GET /file_close?auth=***&fd=1 HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:33:21 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 67
header: X-Error: 1007
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /file_close?auth=***&fd=1 HTTP/1.1" 200 67
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1055, in _send_output
    self.send(chunk)
  File "/usr/local/lib/python3.7/http/client.py", line 977, in send
    self.sock.sendall(data)
  File "/usr/local/lib/python3.7/ssl.py", line 1015, in sendall
    v = self.send(byte_view[count:])
  File "/usr/local/lib/python3.7/ssl.py", line 984, in send
    return self._sslobj.write(data)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 367, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1055, in _send_output
    self.send(chunk)
  File "/usr/local/lib/python3.7/http/client.py", line 977, in send
    self.sock.sendall(data)
  File "/usr/local/lib/python3.7/ssl.py", line 1015, in sendall
    v = self.send(byte_view[count:])
  File "/usr/local/lib/python3.7/ssl.py", line 984, in send
    return self._sslobj.write(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 76, in <module>
    dst_path='/fail2ban.sqlite3')
  File "/usr/local/lib/python3.7/site-packages/fs/copy.py", line 144, in copy_file
    _dst_fs.upload(dst_path, read_file)
  File "/usr/local/lib/python3.7/site-packages/fs/base.py", line 1313, in upload
    tools.copy_file_data(file, dst_file, chunk_size=chunk_size)
  File "/usr/local/lib/python3.7/site-packages/fs/tools.py", line 57, in copy_file_data
    write(chunk)
  File "/usr/local/lib/python3.7/site-packages/pcloud/pcloudfs.py", line 51, in write
    self.pcloud.file_write(fd=self.fd, data=b)
  File "/usr/local/lib/python3.7/site-packages/pcloud/validate.py", line 20, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pcloud/api.py", line 221, in file_write
    return self._upload('file_write', files, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pcloud/api.py", line 112, in _upload
    data=kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 581, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

From what I understand from the above, the relevant part of the response is header: X-Error: 1007, which means 1007 Invalid or closed file descriptor. https://docs.pcloud.com/methods/fileops/file_write.html
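For context, pCloud API replies are JSON objects that, according to the API documentation, carry a numeric `result` field (0 on success) plus an `error` message when something went wrong; the X-Error header in the log appears to mirror that code. A minimal sketch of surfacing such an error in client code follows. `PCloudAPIError` and `raise_for_result` are hypothetical names used for illustration and are not part of the pcloud package:

```python
class PCloudAPIError(Exception):
    """Hypothetical exception type; not defined by the pcloud package."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def raise_for_result(payload):
    """Return the decoded JSON payload unchanged, or raise if it reports an error.

    Assumes the reply has already been decoded into a dict and that a
    'result' value of 0 means success.
    """
    result = payload.get("result")
    if result != 0:
        raise PCloudAPIError(result, payload.get("error", "unknown error"))
    return payload


# Example with hand-written payloads (not captured from a real session).
ok = raise_for_result({"result": 0, "fd": 1})
try:
    raise_for_result({"result": 1007, "error": "Invalid or closed file descriptor."})
except PCloudAPIError as err:
    failure = err
```

With a check like this after every call, a failing /file_close would raise a readable error instead of only showing up as a header in the debug output.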

I think the problem is that pycloud doesn't wait for the HTTP call to /file_write to complete (https://github.com/tomgross/pycloud/blob/6d7429ff91e021183e2a5131d09ef5735e0f1085/src/pcloud/pcloudfs.py#L51) and instead tries to close the file descriptor immediately with /file_close.
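One alternative reading of the log is worth checking before concluding that nothing waits for /file_write. Calls made through requests are synchronous, and the /file_close request shows up on a new connection (2) right before the traceback. That ordering is what you would see if the write failed first and the file was then closed while the exception propagated out of a `with` block. The sketch below reproduces only that ordering with a stand-in class; `FakeRemoteFile` and `try_upload` are made up for illustration and do not reflect how pcloudfs is actually implemented:

```python
class FakeRemoteFile:
    """Stand-in for a remote file handle; records the order of events."""

    def __init__(self):
        self.events = []

    def write(self, data):
        # Simulate the server dropping the connection in the middle of the upload.
        self.events.append("write started")
        raise ConnectionResetError(104, "Connection reset by peer")

    def close(self):
        # With a real client this is where a /file_close request would be sent.
        self.events.append("close called")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow the exception


def try_upload(remote, chunk):
    """Write one chunk inside a with block and record when the error surfaces."""
    try:
        with remote as handle:
            handle.write(chunk)
    except ConnectionResetError as err:
        remote.events.append(f"error surfaced: {err.strerror}")
    return remote.events


events = try_upload(FakeRemoteFile(), b"x" * 1024)
print(events)
```

If this is what happens here, the close is a consequence of the failed write rather than its cause, and the question becomes why the server resets the connection during the large POST.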

In contrast, this is what I get when I upload a text file that contains only "HelloWorld". Here I can clearly see the urllib3.connectionpool log line for /file_write:

DEBUG:pycloud:Doing request to https://api.pcloud.com/getdigest
DEBUG:pycloud:Params: {}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.pcloud.com:443
send: b'GET /getdigest HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:45:17 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 138
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /getdigest HTTP/1.1" 200 138
DEBUG:pycloud:Doing request to https://api.pcloud.com/userinfo
DEBUG:pycloud:Params: {'getauth': 1, 'logout': 1, 'username': '***', 'digest': '***', 'passworddigest': '***'}
send: b'GET /userinfo?getauth=1&logout=1&username=***&digest=***&passworddigest=*** HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:45:17 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 744
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /userinfo?getauth=1&logout=1&username=***&digest=***&passworddigest=*** HTTP/1.1" 200 744
DEBUG:pycloud:Doing request to https://api.pcloud.com/file_open
DEBUG:pycloud:Params: {'auth': '***', 'path': '/fail2ban.sqlite3', 'flags': 64}
send: b'GET /file_open?auth=***&path=%2Ffail2ban.sqlite3&flags=64 HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:45:17 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 50
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /file_open?auth=***&path=%2Ffail2ban.sqlite3&flags=64 HTTP/1.1" 200 50
send: b'POST /file_write HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 464\r\nContent-Type: multipart/form-data; boundary=***\r\n\r\n'
send: b'--***\r\nContent-Disposition: form-data; name="fd"\r\n\r\n1\r\n--***\r\nContent-Disposition: form-data; name="data"\r\n\r\nHelloWorld\n\r\n--***\r\nContent-Disposition: form-data; name="auth"\r\n\r\n***\r\n--***\r\nContent-Disposition: form-data; name="filename"; filename="filename"\r\n\r\nHelloWorld\n\r\n--***--\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:45:17 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 30
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "POST /file_write HTTP/1.1" 200 30
DEBUG:pycloud:Doing request to https://api.pcloud.com/file_close
DEBUG:pycloud:Params: {'auth': '***', 'fd': 1}
send: b'GET /file_close?auth=***&fd=1 HTTP/1.1\r\nHost: api.pcloud.com\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: CloudHTTPd-API v1.1
header: Date: Wed, 27 Mar 2019 20:45:17 GMT
header: Content-Type: application/json; charset=utf-8
header: Content-Length: 16
header: ETag: "***"
header: Cache-Control: private, max-age=0
header: Vary: Accept-Encoding
header: Connection: keep-alive
header: Keep-Alive: timeout=1800
DEBUG:urllib3.connectionpool:https://api.pcloud.com:443 "GET /file_close?auth=***&fd=1 HTTP/1.1" 200 16
Completed

As you can see, in this case the file is uploaded successfully. Have I understood the problem correctly? If so, could you give me a hint on how to make file_close wait for file_write?
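Comparing the Content-Length values of the two /file_write requests gives a hint about how much data each write sends. In the small upload the body is shown in full and contains "HelloWorld\n" twice, once as the `data` field and once as the `filename` field. Assuming the multipart overhead (boundary, field headers, auth token) has the same size in both requests, which the masked logs cannot confirm, the large request would carry 2 MiB of payload, i.e. one 1 MiB chunk sent twice:

```python
# Values copied from the two "POST /file_write" request headers in the logs.
SMALL_CONTENT_LENGTH = 464
LARGE_CONTENT_LENGTH = 2097594

# The small body contains this payload twice (fields "data" and "filename").
small_payload = b"HelloWorld\n"
copies_per_request = 2

# Everything that is not payload: boundaries, field headers, auth token.
overhead = SMALL_CONTENT_LENGTH - copies_per_request * len(small_payload)

# Assumption: the overhead is identical for the large request.
large_payload_total = LARGE_CONTENT_LENGTH - overhead
bytes_per_copy = large_payload_total // copies_per_request

print("multipart overhead:", overhead, "bytes")
print("payload in large request:", large_payload_total, "bytes")
print("bytes per copy:", bytes_per_copy, "=", bytes_per_copy / (1024 * 1024), "MiB")
```

Under that assumption, each write call transmits every chunk twice, so a 12 MB file would cause roughly 24 MB of upload traffic in about a dozen requests.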

System info: python:3.7 Docker container with Python 3.7.2

appdirs==1.4.3
certifi==2019.3.9
chardet==3.0.4
fs==2.4.4
idna==2.8
pcloud==1.0a6
pytz==2018.9
PyYAML==5.1
requests==2.21.0
six==1.12.0
urllib3==1.24.1

Thanks

Massimo

tomgross commented 5 years ago

Setting the timeout in Session.get to a large value, or even None, will probably do the trick.
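As a rough illustration of that suggestion: requests accepts a `timeout` keyword on `Session.get` and `Session.post`, and `None` means "wait indefinitely" (which is also what requests does when no timeout is passed at all). The sketch below applies a default timeout to every call without editing each call site. `DefaultTimeoutSession` and `RecordingSession` are hypothetical helpers written for this example, not part of requests or pcloud:

```python
class DefaultTimeoutSession:
    """Wraps any session-like object and fills in a default timeout."""

    def __init__(self, session, timeout=None):
        self._session = session
        self._timeout = timeout

    def get(self, url, **kwargs):
        # Only set the timeout if the caller did not pass one explicitly.
        kwargs.setdefault("timeout", self._timeout)
        return self._session.get(url, **kwargs)

    def post(self, url, **kwargs):
        kwargs.setdefault("timeout", self._timeout)
        return self._session.post(url, **kwargs)


class RecordingSession:
    """Test double that records calls instead of touching the network."""

    def __init__(self):
        self.calls = []

    def get(self, url, **kwargs):
        self.calls.append(("GET", url, kwargs))

    def post(self, url, **kwargs):
        self.calls.append(("POST", url, kwargs))


recorder = RecordingSession()
session = DefaultTimeoutSession(recorder, timeout=300)
session.get("https://api.pcloud.com/getdigest")
session.post("https://api.pcloud.com/file_write", timeout=None)
print(recorder.calls)
```

With the real library, the wrapper would be built as `DefaultTimeoutSession(requests.Session(), timeout=...)`. Note that a timeout only controls how long the client waits; it would not prevent the server from resetting the connection.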

blasterspike commented 5 years ago

Do you mean this session.get? https://github.com/tomgross/pycloud/blob/1b0d9de6521904dcdc3ca3295f544008065e9e4e/src/pcloud/api.py#L58

or this session.post? https://github.com/tomgross/pycloud/blob/1b0d9de6521904dcdc3ca3295f544008065e9e4e/src/pcloud/api.py#L109

Either way, I have tried adding timeout=None to both and I get the same problem: I still don't see the urllib3.connectionpool log line for /file_write.

tomgross commented 4 years ago

Should be fixed with https://github.com/tomgross/pycloud/commit/ccea6a79e9e9b6a15a8282fbdf2f2c66b613b392

@blasterspike Please comment or reopen, if you disagree.