Yelp / mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services
http://packages.python.org/mrjob/

add pool_jitter_seconds option #2200

Closed coyotemarin closed 3 years ago

coyotemarin commented 4 years ago

If this is set along with max_clusters_in_pool, we should wait a random amount of time between 0 and this many seconds, then double-check that no other clusters were created before launching a cluster of our own. This way, if a bunch of jobs launch simultaneously, it is unlikely that they will all create a cluster at the same time.
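
A minimal sketch of the idea, not mrjob's actual implementation: only max_clusters_in_pool and pool_jitter_seconds come from this issue. The function name and the count_clusters / launch_cluster callables are hypothetical stand-ins for whatever the runner uses to list pooled clusters and start a new one.

```python
import random
import time


def launch_with_jitter(count_clusters, launch_cluster,
                       max_clusters_in_pool, pool_jitter_seconds,
                       sleep=time.sleep):
    """Return a new cluster ID, or None if the pool is (or becomes) full.

    count_clusters and launch_cluster are hypothetical callables,
    used here only for illustration.
    """
    # First check: is there room in the pool at all?
    if count_clusters() >= max_clusters_in_pool:
        return None

    if pool_jitter_seconds:
        # Wait a random amount of time between 0 and pool_jitter_seconds
        # so that jobs started simultaneously spread out.
        sleep(random.uniform(0, pool_jitter_seconds))

        # Double-check: another job may have created a cluster
        # while we were sleeping.
        if count_clusters() >= max_clusters_in_pool:
            return None

    return launch_cluster()
```

The sleep function is passed in as a parameter so the behavior can be exercised without actually waiting. Note that this only makes simultaneous launches unlikely; two jobs can still pass the second check at nearly the same moment.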

coyotemarin commented 4 years ago

This applies to bailing out early from pool_wait_minutes as well; if we think there are no clusters to wait for, we should double-check before launching our own.
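
One possible reading of this, sketched under assumptions: only pool_wait_minutes and pool_jitter_seconds come from this issue. The function name, find_cluster, clusters_exist, and poll_seconds are hypothetical, and this is not how mrjob's wait loop is actually written.

```python
import random
import time


def wait_for_pooled_cluster(find_cluster, clusters_exist,
                            pool_wait_minutes, pool_jitter_seconds,
                            poll_seconds=30,
                            sleep=time.sleep, now=time.monotonic):
    """Return a cluster ID to join, or None if we should launch our own.

    find_cluster returns an available cluster ID or None;
    clusters_exist reports whether any pooled cluster is running.
    Both are hypothetical callables used for illustration.
    """
    deadline = now() + pool_wait_minutes * 60

    while True:
        cluster_id = find_cluster()
        if cluster_id is not None:
            return cluster_id

        if now() >= deadline:
            return None  # waited the full pool_wait_minutes

        if not clusters_exist():
            # Nothing to wait for, so we would bail out early.
            # Jitter first, then double-check before giving up.
            sleep(random.uniform(0, pool_jitter_seconds))
            if not clusters_exist():
                return None
            continue  # a cluster appeared; go back and try to join it

        sleep(poll_seconds)
```

A return value of None means the caller would go on to launch its own cluster.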