flux-framework / flux-operator

Deploy a Flux MiniCluster to Kubernetes with the operator
https://flux-framework.org/flux-operator/
MIT License

Feedback about Flux Operator future service #45

Open vsoch opened 1 year ago

vsoch commented 1 year ago

If we can imagine a way for an HPC center to provision clusters (each owned by a user) via the Flux Operator, on demand for a user or group, we'd want control over instance types, sizes, and costs, e.g.,

An ideal, in my opinion, would be to be able to list the allowed instance types and max sizes, then have Flux handle provisioning (on-demand or spot) on a per-job basis. It could use QoS flags to decide whether to chain sequences of jobs on the same instances (to amortize provisioning costs) or spread them across instances (to minimize time to completion). I think these policies are possible with Kubernetes (thus minimizing customization to any specific cloud provider, as with current solutions).

In thread here: https://hachyderm.io/@jedbrown/109396976059698506

Thanks @jedbrown!
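
To make the quoted idea concrete, here is a minimal sketch of the policy, not anything the Flux Operator implements today. Everything in it is an assumption for illustration: the instance type names, the `ALLOWED` table, the `Job` fields, and the `"economy"` / `"urgent"` QoS values are all hypothetical. It only shows the decision logic: check each job against an allow-list of instance types and max sizes, then either chain jobs onto a shared pool or give each job its own pool depending on its QoS flag.

```python
from dataclasses import dataclass

# Hypothetical allow-list set by the center: instance type -> max nodes per job.
ALLOWED = {"small-4cpu": 16, "large-32cpu": 4}


@dataclass(frozen=True)
class Job:
    name: str
    instance_type: str
    nodes: int
    qos: str  # hypothetical flag: "economy" (chain) or "urgent" (spread)


def is_allowed(job, allowed=ALLOWED):
    """True if the job's instance type is listed and its size is within the cap."""
    limit = allowed.get(job.instance_type)
    return limit is not None and 1 <= job.nodes <= limit


def plan(jobs, allowed=ALLOWED):
    """Group jobs into provisioning requests (one dict per pool of instances).

    "economy" jobs of the same instance type are chained on one shared pool,
    sized for the largest of them, to amortize provisioning cost. Any other
    QoS value gets a dedicated pool so jobs can run in parallel.
    """
    pools = []
    shared = {}  # instance type -> the shared pool for chained jobs
    for job in jobs:
        if not is_allowed(job, allowed):
            raise ValueError(f"job {job.name!r} violates the instance policy")
        if job.qos == "economy":
            pool = shared.get(job.instance_type)
            if pool is None:
                pool = {"instance_type": job.instance_type, "nodes": 0, "jobs": []}
                shared[job.instance_type] = pool
                pools.append(pool)
            pool["nodes"] = max(pool["nodes"], job.nodes)
            pool["jobs"].append(job.name)
        else:
            pools.append(
                {"instance_type": job.instance_type, "nodes": job.nodes, "jobs": [job.name]}
            )
    return pools


if __name__ == "__main__":
    jobs = [
        Job("a", "small-4cpu", 4, "economy"),
        Job("b", "small-4cpu", 8, "economy"),
        Job("c", "small-4cpu", 2, "urgent"),
    ]
    for pool in plan(jobs):
        print(pool)
```

In a real setup each returned pool might map to a node pool or a MiniCluster request, and on-demand versus spot could be one more field on the job, but that mapping is left out here on purpose.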

vsoch commented 7 months ago

@jedbrown heads up that we are working on a similar use case with https://github.com/converged-computing/rainbow, although it doesn't necessarily have to be a Flux Operator-owned cluster (the experiments I'm prototyping today are all Flux Operator clusters, specifically on different node pools on a cloud). I can post more here when it's done.