Backblaze / B2_Command_Line_Tool

The command-line tool that gives easy access to all of the capabilities of B2 Cloud Storage

Read timed out during download sync #720

Open jondoe1337 opened 3 years ago

jondoe1337 commented 3 years ago

I'm trying to download around 7 TB split across 13 files over a roughly 1 Gbps link. I've already had to restart it 3 times, because after a while it crashes with the following exception and ALL downloads are lost:

Traceback (most recent call last):
  File "urllib3/response.py", line 438, in _error_catcher
  File "urllib3/response.py", line 519, in read
  File "http/client.py", line 455, in read
  File "http/client.py", line 499, in readinto
  File "socket.py", line 704, in readinto
  File "ssl.py", line 1241, in recv_into
  File "ssl.py", line 1099, in read
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "requests/models.py", line 753, in generate
  File "urllib3/response.py", line 576, in stream
  File "urllib3/response.py", line 541, in read
  File "contextlib.py", line 135, in __exit__
  File "urllib3/response.py", line 443, in _error_catcher
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='f003.backblazeb2.com', port=443): Read timed out.

I just ran b2 sync --threads 4 b://mybucket/.../ ./

Is it possible to increase the timeout, or just retry? ;-)

ppolewicz commented 3 years ago

The current version of b2sdk has http timeouts set to 15 minutes (since b2_copy_file can sometimes take a long time). I think you should try again :)

jondoe1337 commented 3 years ago

What does "current version" mean? The head of the main branch? We're using 2.5.0 right now.

ppolewicz commented 3 years ago

This setting is actually in b2-sdk-python. Which version of b2-sdk-python do you have? Even before that, the timeout was, I think, 10 minutes. Is this issue persistent, or did it happen once? Sometimes the server may stall.

By the way, is this the full stack trace? This exception should be caught by the downloader, and the download should continue from the place where it broke off. Maybe we are not catching it properly?
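For readers following along, here is a minimal sketch of what "continue from the place where it broke off" can look like in general. It is not the b2sdk downloader: `download_with_resume` and `fetch_from` are hypothetical names, and `fetch_from(offset)` stands in for whatever issues a ranged request starting at a byte offset. The idea is only that a read timeout is caught, the bytes already written are kept, and the next attempt starts at the saved offset.

```python
def download_with_resume(fetch_from, total_size, sink, max_retries=5,
                         retry_on=(TimeoutError, ConnectionError)):
    """Write total_size bytes to sink, resuming at the saved offset after a failure.

    fetch_from(offset) is a hypothetical callable that yields byte chunks
    starting at `offset` (over HTTP this would be a Range request).
    """
    offset = 0
    failures = 0
    last_error = None
    while offset < total_size:
        started_at = offset
        try:
            for chunk in fetch_from(offset):
                sink.write(chunk)
                offset += len(chunk)
        except retry_on as exc:
            # Keep what was already written; the next attempt resumes at `offset`.
            last_error = exc
        if offset > started_at:
            failures = 0  # progress was made, so reset the failure budget
        else:
            failures += 1
            if failures > max_retries:
                raise IOError(f"no progress at offset {offset}") from last_error
    return offset
```

Counting only attempts that make no progress means a long transfer with occasional stalls can still finish, while a permanently stalled one fails after `max_retries` attempts instead of looping forever.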

jondoe1337 commented 3 years ago

We downloaded the pre-built Linux release: wget https://github.com/Backblaze/B2_Command_Line_Tool/releases/latest/download/b2-linux

It happens when syncing with more than one thread.

The affected host is at Hetzner (an EX62 with hardware RAID 10).

Sorry, I guess this is the rest of the stack trace (it looked similar, that's why I truncated it):

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "threading.py", line 954, in _bootstrap_inner
  File "b2sdk/transfer/inbound/downloader/parallel.py", line 324, in run
  File "requests/models.py", line 760, in generate
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='f003.backblazeb2.com', port=443): Read timed out.
Exception in thread Thread-54:d: 0/14 files   0.14 / 6.97 TB   117 MB/s
Traceback (most recent call last):
  File "urllib3/response.py", line 438, in _error_catcher
  File "urllib3/response.py", line 519, in read
  File "http/client.py", line 455, in read
  File "http/client.py", line 499, in readinto
  File "socket.py", line 704, in readinto
  File "ssl.py", line 1241, in recv_into
  File "ssl.py", line 1099, in read
socket.timeout: The read operation timed out

ppolewicz commented 3 years ago

I think I know what is causing it, but I need to investigate. Please use one thread for now - in your case it will not actually be a single thread because you have large files which create their own threads.
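To illustrate why a single sync thread can still mean several download threads, here is a rough sketch of splitting one large file into byte ranges that are each fetched on their own worker. It is not how b2sdk's parallel downloader is written: `split_ranges`, `download_parallel`, and `read_range` are hypothetical names, and joining the pieces in memory is only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_ranges(size, parts):
    """Split [0, size) into up to `parts` contiguous (start, end) ranges, end exclusive."""
    if size <= 0:
        return []
    parts = max(1, min(parts, size))
    step = -(-size // parts)  # ceiling division
    return [(start, min(start + step, size)) for start in range(0, size, step)]

def download_parallel(read_range, size, workers):
    """Fetch each range on its own worker thread and join the pieces in order.

    read_range(start, end) is a hypothetical callable returning bytes [start, end).
    """
    ranges = split_ranges(size, workers)
    if not ranges:
        return b""
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # map() yields results in the order of `ranges`, not completion order
        pieces = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(pieces)
```

With a layout like this, the number of open connections is roughly the number of files in flight multiplied by the per-file workers, so raising the file-level thread count multiplies the load on the link.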

jondoe1337 commented 3 years ago

I already did that, and I can confirm that 1 thread works for my scenario. 👍

ppolewicz commented 3 years ago

The way threads are configured for sync from cloud to local is not ideal. We will change it in the future, but for now you should use 1 or 2 for --threads in this type of environment.