Checklist
manage.py test passes
Affected app(s)
Description of change
Catch exceptions raised while getting agent resources from Mesos (caused by masters restarting or any other failure) and shut down Scale if the exceptions continue to occur for 30 minutes. We may want to pause the scheduler and/or raise an error instead of shutting down.
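
For reviewers, here is a minimal sketch of the retry-window logic described above. It is not the code in this change: `AgentResourceMonitor`, `poll`, and the injected `fetch` callable are hypothetical names used only for illustration, and the sketch assumes the 30 minutes are measured from the first failure in an unbroken run of failures.

```python
import time

# 30 minutes, taken from the description above.
FAILURE_WINDOW_SECS = 30 * 60


class AgentResourceMonitor:
    """Hypothetical helper that tracks how long resource fetches have been failing."""

    def __init__(self, fetch, clock=time.monotonic, window=FAILURE_WINDOW_SECS):
        self._fetch = fetch          # callable that queries Mesos (assumed, injected)
        self._clock = clock          # injectable so the window can be tested
        self._window = window
        self._first_failure = None   # time of the first failure in the current run

    def poll(self):
        """Return (resources, should_shut_down) for one fetch attempt."""
        try:
            resources = self._fetch()
        except Exception:
            now = self._clock()
            if self._first_failure is None:
                self._first_failure = now
            # Only signal shutdown once failures have persisted for the whole window.
            return None, (now - self._first_failure) >= self._window
        # Any successful fetch resets the failure window.
        self._first_failure = None
        return resources, False
```

Returning a flag rather than exiting inside the helper leaves the caller free to shut down, pause the scheduler, or raise an error, which is the open question noted above.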