Closed by joeyparrish 5 years ago
The cloud upload node is much slower than gsutil, since gsutil -m rsync can parallelize much of the work.
Using threading in Python is also hairy, since the global interpreter lock (GIL) may limit how much work actually runs in parallel. (It's not yet clear whether the GIL is part of the problem.)
Should we find a way to improve performance using the Python libraries? Or rely on gsutil, which is already optimized?
We've decided to move to using gsutil. This has several advantages: not only better performance, but also simpler authentication to cloud storage.
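For illustration, here is a minimal sketch of how a Python upload step might hand the work to gsutil -m rsync through a subprocess instead of uploading files itself. It is not the project's actual implementation. The function names, the local directory, and the bucket URL are all hypothetical. It assumes gsutil is installed, on the PATH, and already configured with credentials.

```python
import subprocess
from typing import List


def build_rsync_command(local_dir: str, destination_url: str) -> List[str]:
    """Build a gsutil command line for a parallel, recursive sync.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    return [
        'gsutil',
        '-m',     # top-level gsutil flag: perform operations in parallel
        'rsync',
        '-r',     # rsync flag: recurse into subdirectories
        local_dir,
        destination_url,
    ]


def sync_to_cloud(local_dir: str, destination_url: str) -> None:
    """Run gsutil; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_rsync_command(local_dir, destination_url), check=True)
```

A call might look like sync_to_cloud('output', 'gs://example-bucket/live'), where both arguments are placeholders. Because the parallel work happens inside the gsutil process, the calling Python code does not need its own threads.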