Open ivoloshin opened 7 months ago
But what do you do when that service itself crashes? If the monitoring service is down, not receiving a 'Job Down' alert doesn't necessarily mean the job has run. The job may or may not have run, and the silence creates a false sense of security.
IMO at least one solution for this conundrum would be to have each job register itself and its schedule with the service (only once, of course) at application startup, possibly somewhere in the UseScheduler action. Then, as each job runs, it reports its status to the service (Green/Yellow/Red, with appropriate verbiage).
The service would then send a daily health report indicating the status of each registered job. Once these reports become routine, the person responsible for overseeing all of this would surely notice when one goes missing and investigate further. One could even designate multiple recipients for the report.
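To make the idea concrete, here is a rough, language-neutral sketch in Python of that register/report/daily-report flow. It is not the scheduler's API; `JobRegistry`, `Status`, and the method names are all made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    GREEN = "Green"
    YELLOW = "Yellow"
    RED = "Red"


@dataclass
class JobRecord:
    schedule: str                    # human-readable schedule, e.g. "hourly"
    status: Optional[Status] = None  # None until the job first reports
    message: str = ""


class JobRegistry:
    """Hypothetical status service: jobs register once, then report."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobRecord] = {}

    def register(self, job_id: str, schedule: str) -> None:
        # Idempotent: registering the same job twice keeps the first record.
        self._jobs.setdefault(job_id, JobRecord(schedule))

    def report(self, job_id: str, status: Status, message: str = "") -> None:
        record = self._jobs[job_id]
        record.status = status
        record.message = message

    def daily_report(self) -> str:
        # One line per registered job, including jobs that never reported.
        lines = []
        for job_id in sorted(self._jobs):
            r = self._jobs[job_id]
            state = r.status.value if r.status else "NO REPORT"
            lines.append(f"{job_id} [{r.schedule}]: {state} {r.message}".rstrip())
        return "\n".join(lines)
```

Because every registered job shows up in the report, a job that never reported is listed as "NO REPORT" rather than silently missing.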
It's not perfect, but it's the best fix I've been able to come up with for the scenario you describe. Others may wish to offer improvements.
It’s a never-ending possibility of services going down along the chain.
So, we use Uptime Kuma to monitor websites and services and it pushes notifications when stuff goes down.
So I can have jobs push status to a service and have Kuma ask it about each job or all jobs to see if any failed.
What I also want it to know is whether a job is delinquent. So, like you said, a job needs to know and register its schedule. I can do that, of course, but I already did that with the scheduler, and I don’t want to do it twice. I’m thinking about making an attribute that describes the schedule, plus a helper that uses that attribute both when registering with the scheduler and inside the job itself for reporting.
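A rough sketch of that "declare the schedule once" idea, written in Python with a class decorator standing in for an attribute. Everything here (`schedule`, `register_all`, `CleanupJob`, the cron string) is hypothetical and only shows the shape, not any real scheduler API.

```python
def schedule(cron):
    """Attach schedule metadata to a job class (stand-in for an attribute)."""
    def decorate(cls):
        cls.schedule = cron
        return cls
    return decorate


@schedule("0 * * * *")  # assumed cron expression: top of every hour
class CleanupJob:
    def invoke(self):
        # The job reads the same metadata when it reports its status.
        return f"{type(self).__name__} next runs per '{self.schedule}'"


def register_all(scheduler, job_classes):
    # Helper: the scheduler registration reads the same metadata,
    # so the schedule is written down in exactly one place.
    for cls in job_classes:
        scheduler.append((cls.__name__, cls.schedule))
```

Here `scheduler` is just a list; in practice the helper would call whatever registration method the real scheduler exposes.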
It would be better, though, for Invoke to take a context class of some sort that carries some info: a job id, maybe, and enough schedule info to figure out when the job will next be queued.
It'd be helpful if, when a job is executed, it could tell when its next execution will be. Is that possible with the current framework?
I'm trying to think through a status service of some sort to which the job could report that it started, finished and/or failed. It would also be helpful to know if the job is delinquent, so it would need to report when it ran and when its next expected run is, so the service would know when to raise an alert that a job hasn't run.
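For what it's worth, the delinquency check itself could be as small as the sketch below (Python, purely illustrative). `DelinquencyTracker`, `report_run`, and the grace period are my own assumptions, not anything the framework provides.

```python
from datetime import datetime, timedelta


class DelinquencyTracker:
    """Hypothetical service side: remembers each job's next expected run."""

    def __init__(self, grace=timedelta(minutes=5)):
        self.grace = grace  # assumed slack before a job counts as late
        self._runs = {}     # job_id -> (ran_at, next_expected)

    def report_run(self, job_id, ran_at, next_expected):
        # Called by the job after each run.
        self._runs[job_id] = (ran_at, next_expected)

    def delinquent(self, now):
        # A job is delinquent once its expected run plus grace has passed
        # without a newer report replacing the entry.
        return sorted(
            job_id
            for job_id, (_, due) in self._runs.items()
            if now > due + self.grace
        )
```

A monitor such as Uptime Kuma could then poll an endpoint backed by `delinquent(datetime.now())` and alert when the list is non-empty.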
Thanks, Ilya