d2iq-archive / marathon

Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
https://mesosphere.github.io/marathon/
Apache License 2.0

Properly release lock on all scale checks #7255

Open Lqp1 opened 4 years ago

Lqp1 commented 4 years ago

Scale checks run regularly to validate that the number of running instances matches the expected number of instances in the runSpec.

There are two cases:

In the second case, the lock is supposed to be released only once the overdue instances are dead. We encountered issues where the lock was never released because the future returned by KillStreamWatcher.watchForKilledTasks() never completed. This is caused by a typo in the code, which makes it wait for ALL instances to die instead of just the overdue subset.
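To make the failure mode concrete, here is a minimal sketch in Python. It is not Marathon's code: `KillWatcher` and `run_scale_check` are hypothetical names, and the lock is reduced to a boolean. It only models the described behaviour, where the watcher completes once every watched instance has died, so watching all instances while killing only the overdue ones never releases the lock.

```python
class KillWatcher:
    """Hypothetical stand-in for a kill watcher: done once every watched id has died."""

    def __init__(self, watched_ids):
        self.pending = set(watched_ids)

    def on_instance_killed(self, instance_id):
        # Ids that are not being watched are ignored.
        self.pending.discard(instance_id)

    @property
    def done(self):
        return not self.pending


def run_scale_check(all_instances, overdue, watch_all):
    """Return True if the lock is still held after the overdue instances died."""
    # Assumed bug: watch every instance. Assumed fix: watch only the overdue subset.
    watched = all_instances if watch_all else overdue
    watcher = KillWatcher(watched)
    lock_held = True

    # Only the overdue instances are killed by the scale check.
    for instance_id in overdue:
        watcher.on_instance_killed(instance_id)

    # The lock is released only when the watcher reports completion.
    if watcher.done:
        lock_held = False
    return lock_held


all_instances = {"a", "b", "c"}
overdue = {"c"}

buggy_lock_held = run_scale_check(all_instances, overdue, watch_all=True)
fixed_lock_held = run_scale_check(all_instances, overdue, watch_all=False)
print(buggy_lock_held, fixed_lock_held)
```

In this model the buggy variant keeps waiting on "a" and "b", which are healthy and will never be killed, so the lock stays held. Watching only the overdue subset lets the watcher complete and the lock be released.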

What do you think?