Closed by timolegros 1 month ago
None of the StatsD metrics for scheduler jobs ever made it to Datadog because none of them have ever run successfully. Thus this ticket is blocked by both #8052 and #8053.
Still blocked until the next deployment, when the health checks and Rollbar logs will finally be available for scheduler jobs.
Blocked until release v1.4.1. Moving back to teed-up.
For some reason, the stats are still not making it to Datadog. This may be similar to the Rollbar flush issue, in that the stat requests are not flushed before the scheduled job exits, though hot-shots is supposed to flush before exiting:
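To illustrate the flush-before-exit concern, here is a minimal sketch that does not use hot-shots at all: it sends a StatsD gauge over raw UDP with Node's `dgram` module and only resolves once the send callback has fired, so a short-lived job can await it before exiting. The metric name `scheduler.heartbeat`, the tag `job:outbox-archival`, the agent address `127.0.0.1:8125`, and the helper names are all assumptions for illustration, not what the codebase actually uses.

```typescript
import * as dgram from "node:dgram";

// Build a StatsD gauge datagram, with optional Datadog-style tags.
// Wire format: <name>:<value>|g[|#tag1,tag2]
function formatGauge(name: string, value: number, tags: string[] = []): string {
  const tagSuffix = tags.length > 0 ? `|#${tags.join(",")}` : "";
  return `${name}:${value}|g${tagSuffix}`;
}

// Send one gauge and resolve only after the datagram has been handed to the OS.
// The host and port defaults are assumed values for a local agent.
function sendGauge(
  name: string,
  value: number,
  tags: string[] = [],
  host = "127.0.0.1",
  port = 8125,
): Promise<void> {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket("udp4");
    const payload = Buffer.from(formatGauge(name, value, tags));
    socket.send(payload, port, host, (err) => {
      socket.close();
      if (err) {
        reject(err);
      } else {
        resolve();
      }
    });
  });
}

// Hypothetical wrapper for a scheduled job: run the job, then await the
// heartbeat so the process cannot exit with the metric still buffered.
async function runWithHeartbeat(job: () => Promise<void>): Promise<void> {
  await job();
  await sendGauge("scheduler.heartbeat", 1, ["job:outbox-archival"]);
}
```

A scheduled script would call `await runWithHeartbeat(archiveOutbox)` (where `archiveOutbox` is a placeholder for the real job) before letting the process exit. UDP is fire-and-forget, so a resolved promise only means the packet left the process, not that an agent received it.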
I checked: the StatsD metric is emitted locally, and the outbox-archival script logs no errors in production (only success logs), so the metric should technically be emitted. Since there is no obvious cause for this issue, I propose 2 possible solutions:
The first proposed solution is implemented in #8631.
I'm still not getting the gauge metrics in Datadog and I can't figure out why. At this point, this issue does not warrant any more of my time. Rollbar errors will be reported, and logs show up in Datadog as well. I recommend that we transition from Heroku scheduler to graphile-worker, which will give us much more flexibility and visibility.
Description
Now that Heroku scheduler jobs emit a heartbeat, we can create Datadog monitors that ensure those jobs are running as scheduled.