DistributedScience / Distributed-Something

Run encapsulated docker containers that do... something in the Amazon Web Services infrastructure.
https://distributedscience.github.io/Distributed-Something

Print current fleet size in monitor? #18

Closed bethac07 closed 1 year ago

bethac07 commented 1 year ago

Right now, monitor prints only info about the SQS queue: the number of messages in progress and the number available.

Especially for long-running and/or large-machine-requiring tasks, though, it might be nice to also print the number of instances you currently have and the number of instances requested in the monitor. Such tasks are more likely to have a machine killed mid-run, but it is harder to notice when that happens, since one would not expect the number of jobs in progress to change much (sometimes for hours). This won't bring your machine back, obviously, but it might alert the user that they should try something (use a different machine config, use a different subnet, reboot the Dockers and/or resubmit the jobs) to expedite matters.
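A rough sketch of what this could look like, assuming the machines are launched through an EC2 spot fleet and that `ec2` is a boto3 EC2 client (e.g. `boto3.client("ec2")`). The helper names `fleet_status` and `format_status` are hypothetical and not part of the existing monitor. Only the boto3 calls `describe_spot_fleet_requests` and `describe_spot_fleet_instances` are real. Note that `TargetCapacity` is in capacity units, so it only equals an instance count if the fleet does not use instance weighting.

```python
def fleet_status(ec2, spot_fleet_request_id):
    """Return (running, requested) for one spot fleet request.

    `ec2` is assumed to be a boto3 EC2 client; this helper is hypothetical.
    """
    # Requested size: TargetCapacity from the fleet request's config.
    resp = ec2.describe_spot_fleet_requests(
        SpotFleetRequestIds=[spot_fleet_request_id]
    )
    config = resp["SpotFleetRequestConfigs"][0]["SpotFleetRequestConfig"]
    requested = config["TargetCapacity"]

    # Current size: count active instances, following NextToken pagination.
    running = 0
    kwargs = {"SpotFleetRequestId": spot_fleet_request_id}
    while True:
        page = ec2.describe_spot_fleet_instances(**kwargs)
        running += len(page.get("ActiveInstances", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return running, requested


def format_status(in_progress, available, running, requested):
    """Build one monitor line combining queue counts and fleet size."""
    line = (
        f"In progress: {in_progress} | Available: {available} | "
        f"Instances: {running} running / {requested} requested"
    )
    # Flag the case this issue is about: a machine was lost mid-run.
    if running < requested:
        line += " (below target)"
    return line
```

The monitor could print the result of `format_status(...)` on each polling pass, next to the queue counts it already reports, so a drop in running instances is visible even while the in-progress count stays flat.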