aws-samples / awsome-inference

MIT No Attribution
31 stars 10 forks source link

NIM Autoscaling + Load Balancing + Custom Metrics Monitoring #9

Closed amanshanbhag closed 2 months ago

amanshanbhag commented 3 months ago

Issue #, if available:

Description of changes: This PR contains all the code required to get set up with:

  1. Custom values NIMs - Deploying with ebs csi installed
  2. Monitoring - Custom metric scraping with Prometheus
  3. Scaling - Using HPA based on scraped custom metrics
  4. Load Balancing - Ingress of type ALB

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.