Yelp / mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services
http://packages.python.org/mrjob/

pooling should use any usable cluster before looking for more #2164

Closed: coyotemarin closed this issue 4 years ago

coyotemarin commented 4 years ago

For whatever reason, _find_clusters() finds a list of eligible clusters, but if the "best" cluster turns out to be unavailable because another job locked it, it doesn't try any of the others. Instead, it calls _usable_clusters() repeatedly, and each call fires off more DescribeCluster and ListInstance* API calls.

This is extremely wasteful of API calls. We should try submitting our job to one of the other clusters in the list before trying again.
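
A rough sketch of the fallback I have in mind. This is not mrjob's actual code: `join_first_available()` and the `try_lock` callable are hypothetical names, and the cluster IDs are made up for illustration.

```python
def join_first_available(candidates, try_lock):
    """Return the first cluster ID we manage to lock, or None.

    candidates: cluster IDs, best first (hypothetical input; in practice
        this would be the list we already paid API calls to build).
    try_lock: callable taking a cluster ID and returning True if we got
        the lock (hypothetical stand-in for the real locking code).
    """
    for cluster_id in candidates:
        if try_lock(cluster_id):
            return cluster_id
    # Every candidate was taken; only now is it worth listing clusters again.
    return None


# Example: the "best" cluster was locked by another job in the meantime.
locked_by_others = {'j-BEST'}
chosen = join_first_available(
    ['j-BEST', 'j-SECOND', 'j-THIRD'],
    lambda cluster_id: cluster_id not in locked_by_others,
)
print(chosen)
```

The point is that the expensive listing step runs once per pass, and we only go back to it after exhausting the whole list.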

coyotemarin commented 4 years ago

We should still double-check that these clusters are in the WAITING state before joining them.

Since we now have to describe clusters as part of locking them, we can make the state check part of the locking semantics: a lock attempt only succeeds if the cluster is still WAITING.
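
Something like the following, as a sketch only. `try_lock()`, `describe_cluster`, and `acquire_lock` are hypothetical names, not existing mrjob functions. I'm assuming `describe_cluster` returns the cluster description as a dict with the state under `['Status']['State']`, which is how the EMR DescribeCluster response is shaped inside its `Cluster` key.

```python
def try_lock(cluster_id, describe_cluster, acquire_lock):
    """Lock a cluster only if it is in the WAITING state.

    describe_cluster: callable returning a dict describing the cluster
        (hypothetical; assumed to contain ['Status']['State']).
    acquire_lock: callable that performs the actual locking and returns
        True on success (hypothetical stand-in for the real mechanism).
    """
    cluster = describe_cluster(cluster_id)
    if cluster['Status']['State'] != 'WAITING':
        # The cluster is busy or shutting down; don't bother locking it.
        return False
    # Reuse the description we just fetched rather than describing again.
    return acquire_lock(cluster_id, cluster)


# Example with canned descriptions instead of real API calls.
fake_clusters = {
    'j-BUSY': {'Status': {'State': 'RUNNING'}},
    'j-IDLE': {'Status': {'State': 'WAITING'}},
}
results = {
    cluster_id: try_lock(
        cluster_id,
        fake_clusters.__getitem__,
        lambda cid, cluster: True,  # pretend the lock always succeeds
    )
    for cluster_id in fake_clusters
}
print(results)
```

That way the one describe call we already need for locking also covers the double-check, with no extra API traffic.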