flux-framework / flux-k8s

Project to manage Flux tasks needed to standardize kubernetes HPC scheduling interfaces
Apache License 2.0

test: adding permit to allow for sibling pod scheduling #74

vsoch closed 5 months ago

vsoch commented 5 months ago

Problem: submitting only the first index works for jobs with more controlled lengths (e.g., LAMMPS takes a while) but was having issues with really quick jobs. Solution: try restoring the queue that enables sibling pods, so that any group can be scheduled.
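For context on the "permit" in the title: in the Kubernetes scheduling framework, Permit is the extension point that can approve, deny, or delay a pod's binding. The sketch below is not the code in this PR. It is a minimal, dependency-free illustration of the general idea of holding sibling pods until enough members of their group have arrived, and `GroupGate`, `Register`, `Permit`, and the `quick-job` group name are all hypothetical names made up for the example.

```go
package main

import "fmt"

// Decision is a simplified stand-in for the outcome of a permit step.
type Decision int

const (
	Allow Decision = iota // pod (and any held siblings) may proceed
	Wait                  // pod is held until more siblings arrive
)

// GroupGate is a hypothetical gate that tracks pods per group.
// It does not use the real scheduler framework types.
type GroupGate struct {
	minMembers map[string]int      // required group size, per group
	waiting    map[string][]string // pods currently held, per group
	admitted   map[string]bool     // groups that already reached quorum
}

func NewGroupGate() *GroupGate {
	return &GroupGate{
		minMembers: map[string]int{},
		waiting:    map[string][]string{},
		admitted:   map[string]bool{},
	}
}

// Register declares how many pods a group needs before any may proceed.
func (g *GroupGate) Register(group string, need int) {
	g.minMembers[group] = need
}

// Permit records a pod and decides whether it may proceed.
// It returns Wait until the group reaches its required size; the call
// that completes the group returns Allow plus every held sibling.
// Pods of unregistered or already admitted groups are allowed at once.
func (g *GroupGate) Permit(group, pod string) (Decision, []string) {
	need, grouped := g.minMembers[group]
	if !grouped || g.admitted[group] {
		return Allow, []string{pod}
	}
	g.waiting[group] = append(g.waiting[group], pod)
	if len(g.waiting[group]) < need {
		return Wait, nil
	}
	released := g.waiting[group]
	delete(g.waiting, group)
	g.admitted[group] = true
	return Allow, released
}

func main() {
	gate := NewGroupGate()
	gate.Register("quick-job", 2)
	for _, pod := range []string{"quick-job-0", "quick-job-1"} {
		decision, released := gate.Permit("quick-job", pod)
		fmt.Println(pod, decision == Allow, released)
	}
}
```

The point of the sketch is that no single pod index has to be the one that triggers scheduling: whichever sibling completes the group releases the rest, which is one way to avoid depending on the first index of a very short job.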

vsoch commented 5 months ago

The failure is due to a controller entry point change from 5 days ago. Hopefully it won't require a Kubernetes component version update. https://github.com/kubernetes-sigs/scheduler-plugins/commit/4d3d41c5f994c9c94b6a21dae306785cbc2df833

vsoch commented 5 months ago

I think that when we merge https://github.com/flux-framework/fluxion-go/pull/8, that should update the fluence Go bindings to Go 1.21, and then we can attempt updating here. kubernetes-sigs/scheduler-plugins is at Go 1.21: https://github.com/kubernetes-sigs/scheduler-plugins/blob/51d27b6e06b339bfa413d0415d80c86d01097b44/go.mod#L3.

vsoch commented 5 months ago

I'm going to try building with that branch. If it works, I'll merge there and update here, and if this passes we can merge into the other PR branch. That's a lot of "ifs" :laughing:

vsoch commented 5 months ago

Looks like the upstream is still a moving target! They have now bumped Kubernetes up to 1.29.x: https://github.com/kubernetes-sigs/scheduler-plugins/commit/2d20310880323ae307312d4d4fdfa78c2267073c. It changed a few function signatures, and I'm trying to figure that out now.