deepgram / self-hosted-resources

Official Deepgram resources for deploying Deepgram services in a self-hosted environment
https://developers.deepgram.com
ISC License

Autoscaling #11

Closed bd-g closed 2 months ago

bd-g commented 3 months ago

Proposed changes

Deepgram containers expose a number of metrics that can inform autoscaling of pods.

The proposed change is to implement node autoscaling for AWS in the Helm chart and associated documentation (this is already implemented for GCP).

Then, configure pod autoscaling according to incoming traffic.
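For reference, a minimal sketch of how a Horizontal Pod Autoscaler turns a metric reading into a replica count, based on the scaling rule in the Kubernetes documentation (desired = ceil(current replicas × current value / target value)). The function name, parameter names, and default bounds are illustrative assumptions, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Approximate the HPA replica calculation for a single metric.

    current_value and target_value are per-pod averages of the same
    metric (for example, a hypothetical "active requests per Engine pod").
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Core HPA rule: scale the replica count by the ratio of the observed
    # metric to its target, rounding up.
    raw = math.ceil(current_replicas * current_value / target_value)
    # Clamp to the configured bounds (minReplicas / maxReplicas).
    return max(min_replicas, min(max_replicas, raw))
```

With a target of 10 per pod, two pods averaging 30 would scale to six replicas, and four pods averaging 2 would scale down to the configured minimum.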

Context

Pod autoscaling can increase capacity to meet demand and spin capacity down during periods of low traffic, which reduces cost.

Possible Implementation

Include Prometheus and the Prometheus Adapter as chart dependencies, and ingest the Engine metrics as Kubernetes custom metrics. Use those custom metrics to configure a Horizontal Pod Autoscaler for the API and Engine pods (the License Proxy does not need to scale with load).
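A rough sketch of what the resulting autoscaler object could look like, written as a Python helper that builds an `autoscaling/v2` HorizontalPodAutoscaler manifest targeting a per-pod custom metric. The Deployment name `deepgram-engine` and the metric name `engine_active_requests` are hypothetical placeholders, not names taken from the chart; the real metric would be whichever Engine metric the Prometheus Adapter is configured to expose.

```python
import json


def build_hpa_manifest(deployment, metric_name, target_average,
                       min_replicas=1, max_replicas=5):
    """Build an autoscaling/v2 HPA manifest that scales a Deployment
    on a per-pod custom metric (served via the custom metrics API)."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    # "Pods" metrics are averaged across the target's pods.
                    "type": "Pods",
                    "pods": {
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": str(target_average),
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names; substitute the chart's actual Deployment and metric.
    manifest = build_hpa_manifest("deepgram-engine", "engine_active_requests", 8)
    print(json.dumps(manifest, indent=2))
```

The JSON output can be applied with `kubectl apply -f`, or the same structure can be expressed as a YAML template in the Helm chart. A separate autoscaler of the same shape would cover the API Deployment.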