GoogleCloudPlatform / gsutil

A command line tool for interacting with cloud storage services.
Apache License 2.0

Connection error when emptying bucket (using -m) #1053

Open nilslice opened 4 years ago

nilslice commented 4 years ago

Although I cannot consistently reproduce this issue, running gsutil -m rm -r gs://${BUCKET}/ resulted in:

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 811, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/local/bin/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2348, in run
    cls = copy.copy(class_map[caller_id])
  File "<string>", line 2, in __getitem__
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 815, in _callmethod
    self._connect()
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 802, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 492, in Client
    c = SocketClient(address)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 619, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 61] Connection refused

The bucket had approximately 300 objects, and eventually all but one were removed; a single object was left dangling. The error did not block the removal process from running, but I figured I would report it either way.

System:

sempi commented 4 years ago

The same happens regularly when we upload with gsutil -m cp ...

jgrobbel commented 4 years ago

I see this very frequently; it would be great to get a fix or some retry mechanism. 99% of the copy/delete works, but then it hangs waiting on the busted thread.

gsutil --version
gsutil version: 4.53

I have just seen this happen without the -m flag:

| [133 files][  5.7 GiB/  5.7 GiB]   37.3 KiB/s
==> NOTE: You are performing a sequence of gsutil operations that may
run significantly faster if you instead use gsutil -m cp ... Please
see the -m section under "gsutil help options" for further information
about when gsutil -m can be advantageous.

Operation completed over 133 objects/5.7 GiB.
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 811, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2348, in run
    cls = copy.copy(class_map[caller_id])
  File "<string>", line 2, in __getitem__
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 815, in _callmethod
    self._connect()
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 802, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 492, in Client
    c = SocketClient(address)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 619, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 61] Connection refused

rrauber commented 3 years ago

Thanks for reporting this! #1100 and this issue look similar. I was able to get around this by disabling multiprocessing, which can be done by setting parallel_process_count=1 in the GSUtil section of your boto config file, or by adding the following flag to your command: -o "GSUtil:parallel_process_count=1". Though this disables multiprocessing, multithreading should still be enabled, so you'll still be able to parallelize your transfers.

If you're still having this issue after disabling multiprocessing please let us know!
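
For reference, a minimal sketch of the boto config change described above. It assumes the config file is in its usual location (commonly ~/.boto) and that it has no [GSUtil] section yet; if one already exists, add the key to that section rather than creating a second one.

```ini
[GSUtil]
# Use a single process; threads inside that process can still parallelize transfers.
parallel_process_count = 1
```

The -o "GSUtil:parallel_process_count=1" flag applies the same override to a single command without editing the file.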

kevinrosa commented 2 years ago

Thanks @rrauber, disabling multiprocessing did the trick for me.

Background:
I had been running gsutil rsync fine on one Mac but then switched to a different machine and it stopped working.

New command I ran:
gsutil -m -o "GSUtil:parallel_process_count=1" rsync -r -x ".DS_Store" {src_url} {dst_url}

System info: macOS 10.15.5, gsutil version 5.4

stfines-clgx commented 2 years ago

I can confirm that using "GSUtil:parallel_process_count=1" seems to work for gsutil rm as well.

gadotroee commented 2 years ago

I can confirm that overriding those settings made it work for me as well. After lowering parallel_process_count and parallel_thread_count, the gsutil command works pretty well.

I used gsutil -o 'GSUtil:parallel_process_count=5' -o 'GSUtil:parallel_thread_count=5' (both values can also be set in the boto config file).
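
As a rough sketch, the boto-file equivalent of the two -o overrides above would look like the fragment below. The value 5 is simply the one used in the command; a suitable value for a given machine is an assumption to verify by trial.

```ini
[GSUtil]
# Same values as the -o flags in the command above.
parallel_process_count = 5
parallel_thread_count = 5
```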