nabicht / SimpleTaskQueue

A simple task queue used for coordinating distributed, parallel work.

Runner should reconnect (and re-try last attempted message) #66

Closed by nabicht 5 years ago

nabicht commented 6 years ago

If the runner can't connect to the server, it currently just quits with an error, and whatever it was trying to do never gets done. Instead it should:

  1. Wait a period of time and attempt to reconnect. This might eventually be configurable, but to start I can hard-code both the wait period and the number of attempts. I wouldn't mind the wait getting longer the more consecutive failures there are.
  2. Resend the request it was trying to send when it initially failed.

nabicht commented 5 years ago

The runner now waits between 1 and 60 seconds before each retry. The wait starts at 1 second and doubles after every failure, capped at 60 (once doubling would exceed 60, it takes the min of the doubled value and 60).

It tries to reconnect/execute 50 times before quitting with a clear error message.
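
For anyone reading later, here is a minimal sketch of the behavior described above. It is not the actual runner code. The names `backoff_delays`, `send_with_retry`, and the `send` callable are hypothetical, and it assumes that a failed connection surfaces as `ConnectionError` and that "50 times" means 50 attempts in total, including the first.

```python
import time

INITIAL_DELAY = 1.0   # seconds to wait before the first retry
MAX_DELAY = 60.0      # cap on the wait between retries
MAX_ATTEMPTS = 50     # assumed to count the first attempt as well


def backoff_delays(initial=INITIAL_DELAY, cap=MAX_DELAY, attempts=MAX_ATTEMPTS):
    """Yield the wait before each retry: doubles each time, capped at `cap`."""
    delay = initial
    # N attempts need at most N - 1 waits in between.
    for _ in range(attempts - 1):
        yield delay
        delay = min(delay * 2, cap)


def send_with_retry(send, request, sleep=time.sleep, attempts=MAX_ATTEMPTS):
    """Call send(request), re-sending the same request after each failure.

    `send` is a hypothetical callable standing in for whatever the runner
    uses to talk to the server; it is assumed to raise ConnectionError
    when the server cannot be reached.
    """
    last_error = None
    delays = backoff_delays(attempts=attempts)
    for attempt in range(1, attempts + 1):
        try:
            return send(request)
        except ConnectionError as exc:
            last_error = exc
            if attempt == attempts:
                break  # out of attempts, do not sleep again
            sleep(next(delays))
    raise RuntimeError(
        "giving up after %d failed attempts: %s" % (attempts, last_error)
    ) from last_error
```

With these defaults the waits run 1, 2, 4, 8, 16, 32 and then stay at 60 seconds. Passing `sleep` as a parameter is only there so the loop can be exercised without actually waiting.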