Open sgerogia opened 3 years ago
Using `opencensus` for collecting service metrics (https://github.com/census-instrumentation/opencensus-go)
ℹ️ Please add/suggest more metrics that you think are important for a good monitoring system.
- [ ] Average response time
- [ ] Application metrics
- [ ] Time per REST request (sampling at some %)
- [ ] Number of requests per interval
- [ ] Platform metrics
- [ ] Average response time of data provider(s)
- [ ] Average response time of database
If we have time, and if it's low-hanging fruit, monitor the metrics of the pod (memory, CPU percentage, network, etc.)
`binance` for price of a random whitelisted token. `binance` and `coingecko` with that same token,
b. Average the prices received from these 2 sources
c. If the price still diverges by more than 5%, signal an error.

Host | Pros | Cons |
---|---|---|
AWS Lambda | 1. Low maintenance, 2. Free tier available, 3. SLA by cloud provider, 4. Triggering alerts is easy via CloudWatch | 1. One extra infra to manage |
Our Kubernetes | 1. It's ours, so it has built-in monitoring, 2. One less infra to manage, 3. We can use the same Grafana dashboard as the price oracle for monitoring this service | 1. It's our infra, so it is not 100% black-box |
If we deploy a new pod responsible for this task, are we sure that it will go through nginx (to detect cache problems) and not connect directly to the `price-oracle` pod?
> If we deploy a new pod ... are we sure that it will go through nginx (to detect cache problems) ...
Not sure. But since we will query the endpoint exposed by the service, it theoretically should go through nginx. But again not sure.
It is hard for the Price Oracle to self-verify the accuracy of its ticks. Create an external service which will act as a "canary" for stale ticks.
DoD