It's possible that this commit caused a regression. We should test that the server properly cycles its workers after being suspended, to see whether there is a bug there.
It may also be a good idea to cycle out workers more regularly, e.g. every 15 minutes. It would be good if we had a way to cycle workers while ensuring that there is at least one good worker in the queue at all times, so there is no noticeable "downtime" while workers are cycling.
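As a rough illustration of the "at least one good worker at all times" idea, here is a minimal sketch of a rolling cycle: each replacement worker is started before the old one is retired, so the number of live workers never drops below the pool size. This is not Alda's actual code. `Worker`, `WorkerPool`, and `cycle` are hypothetical names, and a real worker would be a separate process that needs time to start up and report itself ready.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)  # hypothetical source of unique worker ids


@dataclass
class Worker:
    """Hypothetical stand-in for a worker process."""
    id: int = field(default_factory=lambda: next(_ids))
    alive: bool = True

    def stop(self):
        self.alive = False


class WorkerPool:
    def __init__(self, size):
        self.workers = [Worker() for _ in range(size)]
        # Lowest number of live workers observed during any cycle.
        self.min_available = size

    def available(self):
        return sum(1 for w in self.workers if w.alive)

    def cycle(self):
        """Replace every worker, one at a time, without shrinking the pool."""
        for old in list(self.workers):
            self.workers.append(Worker())  # bring up the replacement first
            old.stop()                     # only then retire the old worker
            self.min_available = min(self.min_available, self.available())
            self.workers.remove(old)


pool = WorkerPool(size=2)
before = {w.id for w in pool.workers}
pool.cycle()
after = {w.id for w in pool.workers}
```

After `pool.cycle()`, every worker has been replaced and `pool.min_available` is still 2. Cycling every 15 minutes would then just mean calling `cycle()` on a timer. In a real server, the replacement should also be confirmed healthy before the old worker is stopped.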
This was fixed previously in https://github.com/alda-lang/alda/issues/160, but the issue seems to have come back, or perhaps was never completely fixed to begin with.