frmscoe / General-Issues

This repo exists to track current work and any issues within the FRMS CoE

Scaling dynamically on-demand using the same platform software composition #300

Open Justus-at-Tazama opened 8 months ago

Justus-at-Tazama commented 8 months ago

Story statement

As a [the beneficiary of this feature],
I want [what does the beneficiary want to be able to do?],
So that [what is the benefit or value of the feature?]
And so that [list ALL the benefits, one at a time]

Acceptance criteria

  1. [How will we know that the feature is completely and correctly implemented?]

Justus-at-Tazama commented 8 months ago

Design Authority review

Attending

Jason Darmanovich, Johan Foley, Justus Ortlepp

Problem statement

Solution/Approach

JD: It is important to establish sensible metrics and an approach for evaluating the different tools, so that we can model the problem we are trying to solve during testing.

JD: KEDA (https://keda.sh/) appears to have the functionality we are looking for.
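As a rough illustration of what that functionality looks like in practice, the sketch below builds a hypothetical KEDA ScaledObject for a rule-processor Deployment, scaling on a Prometheus query, and applies it through the Kubernetes Python client. The deployment name, namespace, Prometheus address, query and threshold are all placeholder assumptions, not values agreed in this review.

```python
# Minimal sketch: create a KEDA ScaledObject for a hypothetical
# "rule-processor" Deployment, scaling on a Prometheus query.
# All names, addresses and thresholds below are illustrative only.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

scaled_object = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "rule-processor-scaler", "namespace": "processing"},
    "spec": {
        "scaleTargetRef": {"name": "rule-processor"},  # Deployment to scale
        "minReplicaCount": 1,
        "maxReplicaCount": 20,
        "triggers": [
            {
                "type": "prometheus",
                "metadata": {
                    "serverAddress": "http://prometheus.monitoring:9090",
                    "query": 'sum(rate(transactions_received_total{app="rule-processor"}[1m]))',
                    "threshold": "100",  # scale out when the query exceeds ~100 per replica
                },
            }
        ],
    },
}

# ScaledObject is a CRD in the keda.sh/v1alpha1 API group
client.CustomObjectsApi().create_namespaced_custom_object(
    group="keda.sh",
    version="v1alpha1",
    namespace="processing",
    plural="scaledobjects",
    body=scaled_object,
)
```

Under the hood KEDA creates and manages a Horizontal Pod Autoscaler for the target workload, which is relevant to the HPA question in the next steps below.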

Next steps

  1. Figure out which metrics are available and how to interpret and act on changes (a sketch of querying the metrics API follows after this list).
  2. Benchmark the Azure platform's time to respond to scaling demand requests.
  3. How does KEDA actually work?
     a. i.e. how does it talk to a new node that has just been added?
     b. How long does it take to scale processors onto the new node?
  4. Ensure the pod scheduler does not prematurely evict running pods in order to move them.
     a. Disable the Horizontal Pod Autoscaler (HPA)?
  5. Test and document.
  6. Once the methodology is established, determine adequate mechanisms for scaling stateful services.
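
For step 1, one way to see which resource metrics the cluster already exposes is to query the metrics.k8s.io API (served by metrics-server); custom and external metrics registered by adapters appear under custom.metrics.k8s.io and external.metrics.k8s.io in the same way. A minimal sketch, assuming cluster access and metrics-server installed; the "processing" namespace is a placeholder:

```python
# Minimal sketch: list current node and pod resource usage from metrics.k8s.io.
# Requires metrics-server in the cluster; purely illustrative.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# Current CPU/memory usage per node
node_metrics = api.list_cluster_custom_object("metrics.k8s.io", "v1beta1", "nodes")
for item in node_metrics["items"]:
    print(item["metadata"]["name"], item["usage"])

# Current CPU/memory usage per pod in a namespace (placeholder: "processing")
pod_metrics = api.list_namespaced_custom_object(
    "metrics.k8s.io", "v1beta1", "processing", "pods"
)
for item in pod_metrics["items"]:
    containers = {c["name"]: c["usage"] for c in item["containers"]}
    print(item["metadata"]["name"], containers)
```

Sampling these readings while replaying a known transaction load would also give a baseline for the Azure scale-up timing benchmark in step 2.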