WELLlabs / JaltolAPI

MIT License

Utilization metrics #22

Open bprashanth opened 2 months ago

bprashanth commented 2 months ago
anmolsingh0219 commented 2 months ago

Monitor CPU, RAM, and Disk Usage:

We can create CloudWatch Alarms for key metrics:

Auto-Scaling:

DNS Failover with Route 53:

anmolsingh0219 commented 2 months ago

1. CloudWatch Metrics

Free Tier:

Additional Costs:

2. CloudWatch Logs

Free Tier:

Additional Costs:

3. CloudWatch Alarms

Free Tier:

Additional Costs:

4. CloudWatch Dashboards

Free Tier:

Additional Costs:

5. CloudWatch Events

Free Tier:

Additional Costs:


Example Cost Calculation:

Assume the following usage scenario:

Cost Breakdown:

Monthly Costs Estimate:

Based on the free tier usage:

bprashanth commented 2 months ago

From on-call discussions:

  1. Resource/utilization metrics:

    • implement metrics/alerts for disk and RAM
    • look up best practices for CPU metrics/alerting: we don't want very frequent CPU alerts; we want to know whether CPU starvation is happening regularly so we can bump up the CPU
  2. Custom metrics:

    • check if we can record these within the AWS free tier
    • if yes, which client libs can we use in our Python code to send the timer stats to AWS
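
On the client-lib question, one option is boto3, whose CloudWatch client exposes `put_metric_data`. The following is a minimal sketch, not a settled design: the namespace `JaltolAPI/Custom`, the metric name, the `Endpoint` dimension, and the helper names are all hypothetical. Each unique metric name and dimension combination counts as a separate custom metric, so the dimension set should stay small if the goal is to remain within the free tier.

```python
import time
from contextlib import contextmanager

# Hypothetical namespace; any non-"AWS/" namespace would do.
NAMESPACE = "JaltolAPI/Custom"


def build_timer_datum(name, seconds, endpoint):
    """Build one MetricData entry for CloudWatch PutMetricData."""
    return {
        "MetricName": name,
        # "Endpoint" is an assumed dimension, not something the API requires.
        "Dimensions": [{"Name": "Endpoint", "Value": endpoint}],
        "Value": seconds * 1000.0,
        "Unit": "Milliseconds",
    }


@contextmanager
def timed(name, endpoint, sink):
    """Time the wrapped block and append the resulting datum to `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        sink.append(build_timer_datum(name, elapsed, endpoint))


def send_metrics(data, client=None):
    """Send buffered data points in one PutMetricData call."""
    if client is None:
        import boto3  # imported lazily so the helpers above work without it
        client = boto3.client("cloudwatch")
    client.put_metric_data(Namespace=NAMESPACE, MetricData=data)
```

A request handler could wrap its work in `with timed("RequestLatency", "/example", buffer):` and call `send_metrics(buffer)` periodically, which batches data points instead of making one API call per request.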
anmolsingh0219 commented 2 months ago

@anmolsingh0219 @bprashanth To research setting up custom metrics on AWS for RAM and disk.

bprashanth commented 2 months ago

We want the same metrics for RAM and disk that we have for CPU (an alert if usage crosses the 80% threshold).

There should be a pre-canned way to achieve this. If not, we'll just have to use custom metrics.
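
One candidate for the pre-canned route is the CloudWatch agent, which can publish `mem_used_percent` and `disk_used_percent` (by default under the `CWAgent` namespace). Assuming the agent is installed and configured that way, an 80% alarm could be sketched as below. The helper name, alarm naming scheme, and default periods are assumptions, and the dimensions must match exactly what the agent publishes on the instance (disk metrics usually carry extra dimensions such as `path`).

```python
def build_threshold_alarm(metric_name, instance_id, threshold=80.0,
                          period=300, evaluation_periods=3,
                          datapoints_to_alarm=3, extra_dimensions=None):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires only when `datapoints_to_alarm` of the last
    `evaluation_periods` periods breach the threshold, which avoids
    paging on a single short spike.
    """
    dimensions = [{"Name": "InstanceId", "Value": instance_id}]
    # Assumed: caller supplies any extra dimensions the agent emits.
    dimensions.extend(extra_dimensions or [])
    return {
        "AlarmName": f"{instance_id}-{metric_name}-above-{int(threshold)}",
        "Namespace": "CWAgent",  # agent default; change if configured otherwise
        "MetricName": metric_name,
        "Dimensions": dimensions,
        "Statistic": "Average",
        "Period": period,
        "EvaluationPeriods": evaluation_periods,
        "DatapointsToAlarm": datapoints_to_alarm,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
    }


# Usage sketch (requires boto3 and AWS credentials):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   cw.put_metric_alarm(**build_threshold_alarm("mem_used_percent", "i-0123456789abcdef0"))
```

The same `DatapointsToAlarm` / `EvaluationPeriods` pair may also help with the CPU concern raised earlier: requiring several breaching periods reports sustained starvation without alerting on every brief spike.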

We also want an RDS read-latency metric. We're assuming that if the read latency spikes, we can increase the RDS instance size to bring it back down. We don't think write latency matters much, since our users will not be writing to RDS (most writes happen when we ingest data).
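
RDS already publishes `ReadLatency` (in seconds) under the `AWS/RDS` namespace, keyed by `DBInstanceIdentifier`, so no custom metric should be needed for this one. A rough sketch of pulling it and checking for a sustained spike follows; the function names, the threshold, and the breach count are placeholders rather than recommendations.

```python
from datetime import datetime, timedelta, timezone


def build_read_latency_query(db_instance_id, hours=1, period=300, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "ReadLatency",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,
        "Statistics": ["Average"],
    }


def sustained_spike(datapoints, threshold_seconds, min_breaches):
    """Return True if at least `min_breaches` datapoints exceed the threshold.

    `datapoints` is the "Datapoints" list from a get_metric_statistics
    response, where each entry has an "Average" value in seconds.
    """
    breaches = sum(1 for p in datapoints if p["Average"] > threshold_seconds)
    return breaches >= min_breaches


# Usage sketch (requires boto3 and AWS credentials; "my-db" is a placeholder):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   resp = cw.get_metric_statistics(**build_read_latency_query("my-db"))
#   if sustained_spike(resp["Datapoints"], 0.02, 3):
#       print("read latency elevated; consider a larger instance class")
```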