HubSpot / Singularity

Scheduler (HTTP API and webapp) for running Mesos tasks—long running processes, one-off tasks, and scheduled jobs. #hubspot-open-source
http://getsingularity.com/
Apache License 2.0

Address missing task data #2274

Closed ssalinas closed 2 years ago

ssalinas commented 2 years ago

We are seeing some NPEs come through here. This adds more logging to see what is missing on the task definition.

ssalinas commented 2 years ago

Update: found some additional pieces. Based on the logs, and the fact that a regular task bounce (which uses getTask rather than getTasks) succeeds, we can say the task data is in the zkCache but not in zk itself. So, as a band-aid, this attempts to repair the task when we see that case, before continuing on to the methods that would throw exceptions. The overall flow should remain the same if the task is still not present.
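For anyone reading along, here is a rough sketch of the repair idea, not the actual change in this PR. It assumes a cache-first single lookup and a bulk lookup that reads only the backing store, which is one way to get the symptom described above. `TaskRepairSketch`, `TaskStore`, `repairIfMissing`, and the two maps are hypothetical names made up for illustration; only `getTask` and `getTasks` are borrowed from the comment, and their bodies here are guesses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class TaskRepairSketch {

  /** Hypothetical store: an in-memory cache in front of a backing store (standing in for zk). */
  public static final class TaskStore {
    public final Map<String, String> cache = new HashMap<>();
    public final Map<String, String> backing = new HashMap<>();

    /** Single lookup: assumed to check the cache first, then the backing store. */
    public Optional<String> getTask(String taskId) {
      String cached = cache.get(taskId);
      if (cached != null) {
        return Optional.of(cached);
      }
      return Optional.ofNullable(backing.get(taskId));
    }

    /** Bulk lookup: assumed to read the backing store only, so cache-only tasks are skipped. */
    public List<String> getTasks(List<String> taskIds) {
      List<String> found = new ArrayList<>();
      for (String taskId : taskIds) {
        String data = backing.get(taskId);
        if (data != null) {
          found.add(data);
        }
      }
      return found;
    }

    /**
     * Band-aid: if the task is cached but missing from the backing store, write it back.
     * Returns true when the backing store holds the task afterwards.
     */
    public boolean repairIfMissing(String taskId) {
      if (backing.containsKey(taskId)) {
        return true; // nothing to repair
      }
      String cached = cache.get(taskId);
      if (cached == null) {
        return false; // still not present anywhere; caller keeps its existing flow
      }
      backing.put(taskId, cached);
      return true;
    }
  }
}
```

Calling `repairIfMissing` before the bulk path means `getTasks` sees the task when the cache had it, and nothing changes when the task is missing from both places.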