douban / pymesos

A pure python implementation of Mesos scheduler and executor
BSD 3-Clause "New" or "Revised" License

Exponential backoff for executors incompatible with default & max executor_reregistration_timeout #115

Closed mkomitee closed 5 years ago

mkomitee commented 5 years ago

The exponential backoff we use when reconnecting to agents has a maximum interval of 300 seconds, which all but ensures that an executor will not be able to reregister quickly enough once an agent comes back online.

Since the default executor_reregistration_timeout is 2 seconds (and Mesos doesn't allow us to increase it beyond 15 seconds), we probably need the maximum interval for executor reconnect attempts to be 1 second.
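
To illustrate the idea, here is a minimal sketch of a backoff capped at 1 second. The function name `backoff_intervals` and its parameters (`base`, `factor`, `max_interval`) are hypothetical and chosen for illustration; this is not pymesos's actual reconnect code, and the starting delay of 0.1 seconds is an assumption.

```python
import itertools


def backoff_intervals(base=0.1, factor=2.0, max_interval=1.0):
    """Yield retry delays (in seconds) that grow exponentially up to a cap.

    All names and defaults here are illustrative assumptions, not pymesos API.
    """
    delay = base
    while True:
        # Never wait longer than the cap, no matter how many retries occurred.
        yield min(delay, max_interval)
        delay = min(delay * factor, max_interval)


# First few delays an executor would wait between reconnect attempts.
delays = list(itertools.islice(backoff_intervals(), 6))

# The same generator with the 300-second cap described above.
uncapped = list(itertools.islice(backoff_intervals(max_interval=300.0), 12))
```

With the cap at 1 second, no single wait can exceed the default 2-second executor_reregistration_timeout, whereas with a 300-second cap the later waits in `uncapped` grow far past it.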

ariesdevil commented 5 years ago

@mkomitee We will check it soon.