educational-technology-collective / morf

The MOOC Replication Framework (MORF)
MIT License

Parallelize over buckets, not within buckets, in API functions #38

Closed jpgard closed 6 years ago

jpgard commented 6 years ago

For example, calls to multiprocessing.Pool() should sit outside all for loops, so that a single pool serves every bucket. This still needs to be tested, but it should ensure that the jobs in a given bucket do not "hang" while waiting for one or a few slow jobs to complete.
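A minimal sketch of the idea, not MORF's actual code: the names process_course, run_all_buckets, and courses_by_bucket are hypothetical, and the job body is a placeholder. It assumes each job can be described by a (bucket, course) pair and builds the full task list before creating a single pool.

```python
import multiprocessing


def process_course(bucket, course):
    """Hypothetical stand-in for a per-course job (not a MORF function)."""
    return f"{bucket}/{course}"


def run_all_buckets(courses_by_bucket, processes=2):
    """Run every (bucket, course) job through one shared pool."""
    # The loops only build the task list; no pool is created inside them.
    tasks = [
        (bucket, course)
        for bucket, courses in courses_by_bucket.items()
        for course in courses
    ]
    # One pool, created once, outside all loops. A free worker takes the
    # next task from any bucket instead of idling until the current
    # bucket's slowest job finishes.
    with multiprocessing.Pool(processes=processes) as pool:
        # starmap unpacks each (bucket, course) tuple into the job's
        # arguments and returns results in task order.
        return pool.starmap(process_course, tasks)


if __name__ == "__main__":
    demo = {"bucket-a": ["course-1", "course-2"], "bucket-b": ["course-3"]}
    print(run_all_buckets(demo))
```

The processes argument caps how many jobs run at once, regardless of how many buckets there are.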

jpgard commented 6 years ago

Note that the previous rationale for parallelizing within each bucket was that a different S3 connection was needed for each bucket; this is no longer the case.

jpgard commented 6 years ago

To guard against OSError and boto3 errors caused by too many open connections, this is marked as wontfix.