bartongroup / slivka

http://bartongroup.github.io/slivka/
Apache License 2.0

When the local queue is unavailable, the requests pile up in the zmq queue. #75

Closed warownia1 closed 4 years ago

warownia1 commented 4 years ago

Whenever a job submission is retried, a new message is pushed to the zmq socket. When the local queue server comes back, it fetches the messages from all failed attempts and tries to run them. As a result, the local queue schedules jobs that will never be retrieved and that fail on start because the working directory is not set up.

Suggestions:

warownia1 commented 4 years ago

The queue may also fill up on the client-side. Consider using the combination of NOBLOCK with send_json and RCVTIMEO with recv_json so that they'll fail with Again: Resource temporarily unavailable instead of blocking forever.
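A minimal pyzmq sketch of that combination, not the actual slivka code: the helper name `request_json`, the REQ socket type, the per-request socket, and the `LINGER = 0` setting are assumptions made for illustration. Closing the socket with zero linger after a failed attempt is one possible way to discard the unsent message, so that retries do not accumulate for the local queue to pick up later.

```python
import zmq


def request_json(ctx, address, payload, timeout_ms=1000):
    """Send one JSON request and wait for the reply.

    Returns the decoded reply, or None if the server did not
    respond within timeout_ms (hypothetical helper).
    """
    sock = ctx.socket(zmq.REQ)
    # Drop any unsent message when the socket is closed, so a failed
    # attempt does not linger in the outgoing queue.
    sock.setsockopt(zmq.LINGER, 0)
    # Make recv_json raise zmq.Again instead of blocking forever.
    sock.setsockopt(zmq.RCVTIMEO, timeout_ms)
    try:
        sock.connect(address)
        # NOBLOCK makes send_json raise zmq.Again if it cannot queue.
        sock.send_json(payload, flags=zmq.NOBLOCK)
        return sock.recv_json()
    except zmq.Again:
        # "Resource temporarily unavailable": treat as a failed attempt.
        return None
    finally:
        sock.close()
```

The caller can then decide whether to retry when `None` is returned; each retry starts from a fresh socket, so at most one message per attempt is ever in flight.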