irods / irods_capability_automated_ingest


Investigate `parallel_upload_from_S3` chunking logic #273

Open alanking opened 1 month ago

https://github.com/irods/irods_capability_automated_ingest/blob/52f98357cf4c3cdaaea56756629efabe638cf321/irods_capability_automated_ingest/scanner.py#L385-L393

The use of `range` in the `chunk_ranges` list is a little confusing.

The ranges determine both how many threads to use and which byte range each thread handles, so that the threads can be dispatched in a loop: https://github.com/irods/irods_capability_automated_ingest/blob/52f98357cf4c3cdaaea56756629efabe638cf321/irods_capability_automated_ingest/scanner.py#L405-L407

Please investigate to see whether this can be improved for clarity.
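
As a starting point for discussion, here is a minimal sketch of one possible shape for clearer chunking. It is not the code at the linked lines: `ByteChunk`, `compute_chunks`, `dispatch`, `max_threads`, and `min_chunk_size` are all hypothetical names, and the splitting policy (even chunks, remainder added to the last one) is an assumption. The idea is to store an explicit offset and length per chunk instead of a `range` object, so the chunk count doubles as the thread count.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, NamedTuple


class ByteChunk(NamedTuple):
    """Hypothetical replacement for a range entry in chunk_ranges."""
    offset: int  # first byte of the chunk (inclusive)
    length: int  # number of bytes in the chunk


def compute_chunks(total_size: int, max_threads: int, min_chunk_size: int) -> List[ByteChunk]:
    """Split total_size bytes into at most max_threads contiguous chunks.

    Assumed policy: never create chunks smaller than min_chunk_size unless
    the whole object is smaller than that; the last chunk takes the remainder.
    """
    if max_threads < 1 or min_chunk_size < 1:
        raise ValueError("max_threads and min_chunk_size must be >= 1")
    if total_size <= 0:
        return []

    # One chunk per thread; the list length is the thread count.
    count = max(1, min(max_threads, total_size // min_chunk_size))
    base = total_size // count
    remainder = total_size % count

    chunks = [ByteChunk(i * base, base) for i in range(count)]
    last = chunks[-1]
    chunks[-1] = ByteChunk(last.offset, last.length + remainder)
    return chunks


def dispatch(chunks: List[ByteChunk], worker: Callable[[int, int], bytes]) -> List[bytes]:
    """Run worker(offset, length) for each chunk on its own thread, in order."""
    if not chunks:
        return []
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return list(pool.map(lambda c: worker(c.offset, c.length), chunks))


if __name__ == "__main__":
    # Stand-in for an S3 object; a real worker would issue a ranged read.
    data = bytes(range(10))
    chunks = compute_chunks(len(data), max_threads=4, min_chunk_size=3)
    parts = dispatch(chunks, lambda offset, length: data[offset:offset + length])
    print(chunks, b"".join(parts) == data)
```

With named fields, the dispatch loop reads as "one thread per chunk, each given an offset and a length", and no one has to work out whether a `range` endpoint is inclusive.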

Created because of this review comment: https://github.com/irods/irods_capability_automated_ingest/pull/267#discussion_r1760964437