Closed by pwang7 10 years ago
I opened an issue here on Mesos for this... MESOS-1462. Hopefully it'll get fixed in the next release!
This is really an issue in the external containerizer code. It's not ideal, but restarting the slave makes the failed task disappear in the UI.
If deimos fails to run a job (for instance, because it can't find the image to run), it doesn't remove the failed job from Mesos. The job is left in staging status, and the resources allocated to it are never released.
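
To make the symptom concrete, here is a toy Python model of it. This is only a sketch, not deimos or Mesos code: `ToyMaster`, `run_container`, `launch_task`, and the `report_failure` flag are all made-up names, and the state strings are simplified. It only shows why a launch failure that is never reported as a terminal state keeps the resources booked.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Hypothetical resource ledger; a stand-in, not the Mesos API."""
    total_cpus: float
    tasks: dict = field(default_factory=dict)  # task_id -> (state, cpus)

    def launch(self, task_id, cpus):
        # Resources are booked as soon as the task is accepted.
        self.tasks[task_id] = ("STAGING", cpus)

    def status_update(self, task_id, state):
        _, cpus = self.tasks[task_id]
        self.tasks[task_id] = (state, cpus)

    def free_cpus(self):
        # Only non-terminal tasks hold on to their resources.
        held = sum(cpus for state, cpus in self.tasks.values()
                   if state in ("STAGING", "RUNNING"))
        return self.total_cpus - held


def run_container(image_exists):
    """Hypothetical launch step that fails when the image is missing."""
    if not image_exists:
        raise RuntimeError("image not found")


def launch_task(master, task_id, cpus, image_exists, report_failure):
    master.launch(task_id, cpus)
    try:
        run_container(image_exists)
        master.status_update(task_id, "RUNNING")
    except RuntimeError:
        # If the failure is swallowed here, the task stays in STAGING.
        if report_failure:
            master.status_update(task_id, "FAILED")


stuck = ToyMaster(total_cpus=4.0)
launch_task(stuck, "job-1", 1.0, image_exists=False, report_failure=False)
print(stuck.tasks["job-1"][0], stuck.free_cpus())

reported = ToyMaster(total_cpus=4.0)
launch_task(reported, "job-1", 1.0, image_exists=False, report_failure=True)
print(reported.tasks["job-1"][0], reported.free_cpus())
```

In the first run the job stays in staging and one CPU remains allocated; in the second the failure is reported and the CPU is freed again.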