Sage-Bionetworks / synapsePythonClient

Programmatic interface to Synapse services for Python
https://www.synapse.org
Apache License 2.0

DRAFT: [SYNPY-1420] Re-write uploads to mix async, multi-threading, and multi-processing #1077

Closed BryanFauble closed 6 months ago

BryanFauble commented 6 months ago

Problem:

  1. The current upload process is slow because steps that could run concurrently are executed sequentially.
  2. Work like MD5 calculation is CPU-bound; run inline, it blocks the calling thread (and can hold the Python GIL), so nothing else makes progress.

Solution:

  1. Part uploads are pushed to other threads.
  2. MD5 checksum calculations are pushed to other processes (to prevent blocking the main asyncio thread).
  3. All HTTP calls use asyncio async/await syntax so that nothing blocks while waiting on network I/O.
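
As a rough illustration of the split described above (not the code in this PR), the sketch below hands a CPU-bound MD5 calculation to one executor and blocking part uploads to another, while an asyncio coroutine coordinates both. The names `md5_of_file`, `upload_part`, and `upload_file` are hypothetical, and `upload_part` only reads bytes from disk as a stand-in for a real HTTP part upload.

```python
import asyncio
import hashlib
import os
from concurrent.futures import Executor


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """CPU-bound work: hash a file chunk by chunk.

    Defined at module level so a process pool can pickle it.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_part(path: str, offset: int, length: int) -> int:
    """Hypothetical stand-in for a blocking part upload.

    It only reads the part from disk and returns the number of bytes
    that a real implementation would have sent.
    """
    with open(path, "rb") as handle:
        handle.seek(offset)
        return len(handle.read(length))


async def upload_file(
    path: str,
    part_size: int,
    thread_pool: Executor,
    process_pool: Executor,
) -> tuple[str, int]:
    """Run hashing and part uploads concurrently without blocking the event loop."""
    loop = asyncio.get_running_loop()

    # Start the MD5 calculation first; it runs in the pool while parts upload.
    md5_future = loop.run_in_executor(process_pool, md5_of_file, path)

    size = os.path.getsize(path)
    part_futures = [
        loop.run_in_executor(
            thread_pool, upload_part, path, offset, min(part_size, size - offset)
        )
        for offset in range(0, size, part_size)
    ]

    # Awaiting here yields control, so other coroutines keep running.
    bytes_sent = await asyncio.gather(*part_futures)
    md5_hex = await md5_future
    return md5_hex, sum(bytes_sent)
```

To try it, create a `ThreadPoolExecutor` for `thread_pool` and a `ProcessPoolExecutor` for `process_pool` (under an `if __name__ == "__main__":` guard on platforms that spawn worker processes), then call `asyncio.run(upload_file(path, part_size, thread_pool, process_pool))`. Passing the executors in keeps pool sizes under the caller's control, which is a design choice of this sketch rather than something the PR states.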

Testing: Some initial testing figures:

  1. 10 files, 100MB/file, 1GB total. Before: 33s. After: 16s.
  2. 1 file, 10GB. Before: 221s (average ~73.5 MB/s). After: 164s (average ~124 MB/s).

pep8speaks commented 6 months ago

Hello @BryanFauble! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 272:89: E501 line too long (98 > 88 characters)
Line 273:89: E501 line too long (100 > 88 characters)

Line 2:89: E501 line too long (128 > 88 characters)
Line 25:89: E501 line too long (99 > 88 characters)
Line 27:89: E501 line too long (107 > 88 characters)
Line 41:89: E501 line too long (118 > 88 characters)
Line 50:89: E501 line too long (95 > 88 characters)
Line 311:89: E501 line too long (105 > 88 characters)
Line 336:89: E501 line too long (92 > 88 characters)
Line 400:89: E501 line too long (110 > 88 characters)
Line 615:89: E501 line too long (90 > 88 characters)
Line 616:89: E501 line too long (96 > 88 characters)
Line 618:89: E501 line too long (102 > 88 characters)
Line 746:89: E501 line too long (140 > 88 characters)
Line 747:89: E501 line too long (98 > 88 characters)
Line 751:89: E501 line too long (101 > 88 characters)
Line 754:89: E501 line too long (104 > 88 characters)
Line 757:89: E501 line too long (106 > 88 characters)

Line 38:89: E501 line too long (116 > 88 characters)
Line 39:89: E501 line too long (116 > 88 characters)
Line 86:89: E501 line too long (118 > 88 characters)
Line 312:89: E501 line too long (89 > 88 characters)
Line 314:89: E501 line too long (102 > 88 characters)

Line 88:89: E501 line too long (110 > 88 characters)
Line 111:89: E501 line too long (110 > 88 characters)

Line 482:89: E501 line too long (91 > 88 characters)
Line 1085:89: E501 line too long (95 > 88 characters)

Line 69:89: E501 line too long (101 > 88 characters)

Line 514:89: E501 line too long (120 > 88 characters)
Line 593:89: E501 line too long (94 > 88 characters)

Line 488:89: E501 line too long (120 > 88 characters)
Line 562:89: E501 line too long (94 > 88 characters)

Comment last updated at 2024-03-14 22:03:05 UTC