For whatever reason, _find_clusters() builds a list of eligible clusters, but if the "best" cluster turns out to be unavailable because another job has locked it, it doesn't try any of the others. Instead, _usable_clusters() gets called again and again, and each call fires off more DescribeCluster and ListInstance* API calls.
This is extremely wasteful of API calls. We should try submitting our job to one of the other clusters in the list before trying again.
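
A rough sketch of the proposed behavior, not the actual implementation: acquire_cluster(), list_usable_clusters, and try_lock are hypothetical names standing in for the real lookup and locking code, and the retry count and wait time are made-up defaults. The idea is to do one lookup, walk the whole candidate list, and only look clusters up again once every candidate has turned out to be locked.

```python
import time


def acquire_cluster(list_usable_clusters, try_lock,
                    max_rounds=3, wait_seconds=0.0):
    """Return the ID of the first cluster we manage to lock, or None.

    list_usable_clusters: hypothetical callable returning cluster IDs,
        best first (stands in for the expensive, API-backed lookup).
    try_lock: hypothetical callable that takes a cluster ID and returns
        True if we acquired the lock on that cluster.
    """
    for round_num in range(max_rounds):
        # One expensive lookup per round ...
        candidates = list_usable_clusters()

        # ... then try every candidate before looking anything up again.
        for cluster_id in candidates:
            if try_lock(cluster_id):
                return cluster_id

        # Every candidate was locked by another job; pause, then re-query.
        if round_num < max_rounds - 1:
            time.sleep(wait_seconds)

    return None
```

With this shape, losing the race for the "best" cluster costs one extra lock attempt rather than another round of describe/list calls.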