Open rmelick-vida opened 3 years ago
Hi Russel, you can use /admin/events/queueSize and /admin/impressions/queueSize. If the size keeps growing, the synchronizer is not keeping up with the incoming impressions/events. This is what the lambda value in the admin dashboard is calculated from. You can send this value to Datadog.
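For illustration, a minimal Python sketch of that polling approach follows. Only the two `/admin/.../queueSize` paths come from this thread. The base URL, the response shape (a bare integer or a JSON object with a `queueSize` field), and the metric names are assumptions and should be checked against your synchronizer version. The gauge is sent as a plain DogStatsD UDP datagram to a local Datadog agent, which is assumed to listen on port 8125.

```python
import json
import socket
import urllib.request

# Assumption: host/port of the synchronizer admin API; adjust to your deployment.
ADMIN_BASE = "http://localhost:3010"

# Endpoint paths from the thread, mapped to hypothetical metric names.
ENDPOINTS = {
    "split.synchronizer.events.queue_size": "/admin/events/queueSize",
    "split.synchronizer.impressions.queue_size": "/admin/impressions/queueSize",
}


def parse_queue_size(body: str) -> int:
    """Extract the queue size from a response body.

    Assumption: the body is either a bare integer or a JSON object
    with a 'queueSize' field. Verify against the real response.
    """
    data = json.loads(body)
    if isinstance(data, dict):
        return int(data["queueSize"])
    return int(data)


def fetch_queue_size(path: str, base: str = ADMIN_BASE, timeout: float = 5.0) -> int:
    """GET one queueSize endpoint and return the parsed value."""
    with urllib.request.urlopen(base + path, timeout=timeout) as resp:
        return parse_queue_size(resp.read().decode("utf-8"))


def is_growing(samples: list) -> bool:
    """True if every sample is strictly larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def format_gauge(name: str, value: int) -> str:
    """Build a DogStatsD gauge datagram: '<name>:<value>|g'."""
    return f"{name}:{value}|g"


def send_gauge(name: str, value: int, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one gauge over UDP to a local Datadog agent (assumed address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_gauge(name, value).encode("ascii"), (host, port))


def poll_once() -> dict:
    """Fetch both queue sizes, forward them as gauges, and return them."""
    sizes = {}
    for metric, path in ENDPOINTS.items():
        sizes[metric] = fetch_queue_size(path)
        send_gauge(metric, sizes[metric])
    return sizes
```

Calling `poll_once()` on a schedule (cron, or a loop with `time.sleep`) would give Datadog a time series to alert on. `is_growing` shows the kind of check described above, applied to the last few samples.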
Thanks, Bilal
Hi @chillaq, I know about those APIs. I just don't want to write a custom script or process to pull from them and then send the results to Datadog.
The admin dashboard and the available metrics APIs (like /admin/metrics) are useful for investigating the current status of the synchronizer, but they are not very useful for monitoring the system over time. They don't provide a historical view of things like queue size or lambdas, and they also require you to remember to check the dashboard.
It would be good to have a way to pipe these metrics out of the synchronizer, and into a tool designed for monitoring and alerting, such as Datadog or Prometheus. We would like to monitor the queue size and trigger alerts based on queue size or bad lambda values.
I can think of a few good options for this. What do others think?