al-niessner / DAWGIE

stuck active workers #204

Closed al-niessner closed 1 year ago

al-niessner commented 1 year ago

Worker containers vanish without signaling the scheduler that they are gone, which in turn blocks the scheduler. Can the foreman detect the loss? If so, the foreman should forget that worker (fire it) and run the job again.
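
One way the foreman could detect a vanished worker is a heartbeat timeout. The following is only a minimal sketch of that idea, not DAWGIE's actual code: the `Foreman` class, its method names, and the 30 second default timeout are all hypothetical and chosen for illustration.

```python
import time
from collections import deque


class Foreman:
    """Hypothetical foreman that fires workers whose heartbeats stop."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout    # seconds of silence before a worker is fired
        self.clock = clock        # injectable so the logic can be tested
        self.pending = deque()    # jobs waiting to be handed to a worker
        self.active = {}          # worker id -> (job, time of last heartbeat)

    def assign(self, worker_id, job):
        """Record that a worker has started a job."""
        self.active[worker_id] = (job, self.clock())

    def heartbeat(self, worker_id):
        """Refresh the last-seen time of a worker that is still alive."""
        if worker_id in self.active:
            job, _ = self.active[worker_id]
            self.active[worker_id] = (job, self.clock())

    def fire_lost_workers(self):
        """Forget silent workers and put their jobs back on the queue."""
        now = self.clock()
        lost = [w for w, (_, seen) in self.active.items()
                if now - seen > self.timeout]
        for worker_id in lost:
            job, _ = self.active.pop(worker_id)
            self.pending.append(job)  # the job will be done again
        return lost
```

Calling `fire_lost_workers()` periodically would keep a vanished container from being counted as an active worker indefinitely, at the cost of requiring every worker to send heartbeats.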

al-niessner commented 1 year ago

Adding --init to the docker run command for the workers seems to have alleviated, if not fixed, the problem.
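
For reference, `docker run --init` starts a small init process as PID 1 inside the container, which forwards signals and reaps zombie processes. The issue does not say how the workers are launched, so the snippet below is only a sketch of where the flag would go; the helper name, image name, and container name are hypothetical.

```python
def worker_run_command(image, name, extra_args=()):
    """Build a `docker run` argument list for a worker container.

    Hypothetical helper: only the --init flag comes from the issue;
    --detach and --name are ordinary docker run options added for
    illustration.
    """
    return ["docker", "run", "--init", "--detach", "--name", name,
            *extra_args, image]


# Example with made-up names; pass the list to subprocess.run to launch.
cmd = worker_run_command("dawgie-worker:latest", "worker-1")
print(" ".join(cmd))
```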