Closed abbasahmed closed 1 year ago
Great work! Can you add a small description on the PR? Thanks!
Hi @dgkanatsios, I've added a description of the PR. Also, a heads up: we are still adding some code changes, so the PR is in draft status for now.
hey @abbasahmed, let me know how I can help to land this PR! Appreciate all the hard work, thanks!
@abbasahmed @ghov kind ping, we want to release 0.6 next week and it would be great to include your changes!
@dgkanatsios will resolve the conflicts and publish the PR today. Sorry for the hold up on this PR!
thanks @abbasahmed, let me know if you need any help!
Problem: Thundernetes currently exposes very few metrics for the Allocations API service. Reference: Issue #384
Solution:
This PR adds 6 metrics:
These new metrics let us monitor the allocation service's performance in terms of speed and reliability. They also surface allocation errors quickly, which helps us diagnose problems and make decisions faster.
Along with the metrics, we have also added a couple of panels to the Grafana dashboard to visualize them.
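
For readers unfamiliar with this kind of instrumentation, here is a minimal sketch of how an allocation handler could be wrapped to record request outcomes and handling time. It is not the code from this PR: `AllocationMetrics`, `Instrument`, the `/allocate` path and the `buildID` parameter are all hypothetical names, and the in-memory recorder is only a stand-in for a real metrics client such as the Prometheus Go client.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// AllocationMetrics is a hypothetical in-memory recorder that stands in
// for a real metrics client. It is not part of Thundernetes.
type AllocationMetrics struct {
	mu           sync.Mutex
	byStatus     map[int]int // number of requests per HTTP status code
	totalSeconds float64     // summed handling time across all requests
}

func NewAllocationMetrics() *AllocationMetrics {
	return &AllocationMetrics{byStatus: make(map[int]int)}
}

// Observe records one finished request with its status code and duration.
func (m *AllocationMetrics) Observe(status int, seconds float64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.byStatus[status]++
	m.totalSeconds += seconds
}

// Count returns how many requests finished with the given status code.
func (m *AllocationMetrics) Count(status int) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.byStatus[status]
}

// Errors returns how many requests finished with a 4xx or 5xx status.
func (m *AllocationMetrics) Errors() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	n := 0
	for status, count := range m.byStatus {
		if status >= 400 {
			n += count
		}
	}
	return n
}

// statusRecorder remembers the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// Instrument wraps a handler so that every request is timed and counted.
func Instrument(m *AllocationMetrics, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		start := time.Now()
		next.ServeHTTP(rec, req)
		m.Observe(rec.status, time.Since(start).Seconds())
	})
}

func main() {
	m := NewAllocationMetrics()
	// Hypothetical allocation handler: rejects requests without a buildID.
	h := Instrument(m, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Query().Get("buildID") == "" {
			http.Error(w, "missing buildID", http.StatusBadRequest)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	// Send one valid and one invalid request through the wrapped handler.
	for _, target := range []string{"/allocate?buildID=demo", "/allocate"} {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodPost, target, nil))
	}
	fmt.Printf("ok=%d errors=%d\n", m.Count(http.StatusOK), m.Errors())
}
```

Wrapping the handler keeps the measurement code out of the allocation logic itself. In a real service the recorder would likely be replaced by counters and histograms from a metrics library, which a dashboard such as Grafana can then query.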