rancher / dashboard

[backport v2.9.next1] Add warning banner to allocate the number of nodes + 1 vGPUs #11517

Open github-actions[bot] opened 1 month ago

github-actions[bot] commented 1 month ago

This is a backport issue for #10989, automatically created via GitHub Actions workflow initiated by @gaktive

Original issue body:

Setup

Rancher version: v2.8-head
Browser type & version: Chrome Version 124.0.6367.78
Harvester Version: v1.3.0

To Reproduce

  1. Set up vGPU profiles (multiple) in Harvester
  2. Import Harvester into Rancher
  3. Go to Virtualization Management -> Harvester UI for cluster -> vGPU Devices and enable a vGPU with 2 allocatable.
  4. From Cluster Management, create a new 2-node RKE2 cluster with Harvester as the downstream provider. Under Advanced options, add the vGPU with 2 allocatable resources (the same number as the cluster nodes).
  5. After the creation of the cluster is completed, edit the config file of the cluster
  6. Observe the logs of the failed process of provisioning.

Result

Once the Harvester cluster is redeployed for any reason (the user edits the config, the nodes go into an error state, etc.), the new VMs spin up before the old ones are completely shut down. This causes the "un-schedulable" error because the vGPUs held by the old VMs are not available yet.
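
To illustrate why the redeploy runs out of vGPUs, here is a minimal sketch of the replacement sequence. It is a simplified model, not Harvester's or Rancher's actual scheduling logic: it assumes each node VM holds exactly one vGPU and that replacements are created before the old VM is deleted, `surge` VMs at a time. The function name `peakVgpuDemand` and the `surge` parameter are hypothetical.

```typescript
// Simplified model (assumption): every node VM requests exactly one vGPU,
// and each replacement VM is created before the VM it replaces is removed.
function peakVgpuDemand(nodeCount: number, surge: number = 1): number {
  let inUse = nodeCount; // vGPUs held by the currently running VMs
  let peak = inUse;
  let remaining = nodeCount; // VMs that still need to be replaced

  while (remaining > 0) {
    const batch = Math.min(surge, remaining);
    inUse += batch; // new VMs request vGPUs while the old VMs still hold theirs
    peak = Math.max(peak, inUse);
    inUse -= batch; // old VMs finish shutting down and release their vGPUs
    remaining -= batch;
  }
  return peak;
}
```

Under these assumptions, a 2-node cluster replaced one VM at a time briefly needs `peakVgpuDemand(2)` vGPUs, which is one more than the 2 that were allocated in the reproduction steps.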

Expected Result

We could add a warning banner in the UI recommending that the user provision N+1 allocatable vGPUs, where N is the number of nodes.
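
As a rough sketch of the check behind such a banner (not the dashboard's actual implementation), the helper below returns a warning string when fewer than N+1 vGPUs are allocatable and `null` otherwise. The interface `VgpuCheckInput`, the function `vgpuWarningMessage`, and the message wording are all hypothetical.

```typescript
// Hypothetical input shape; the real dashboard models may differ.
interface VgpuCheckInput {
  nodeCount: number;        // N: number of nodes in the cluster
  allocatableVgpus: number; // allocatable count of the selected vGPU
}

// Returns banner text when the N+1 recommendation is not met, otherwise null.
function vgpuWarningMessage(input: VgpuCheckInput): string | null {
  const recommended = input.nodeCount + 1;

  // No nodes means nothing to warn about; enough vGPUs means no banner.
  if (input.nodeCount <= 0 || input.allocatableVgpus >= recommended) {
    return null;
  }
  return (
    `This cluster has ${input.nodeCount} node(s) but only ` +
    `${input.allocatableVgpus} allocatable vGPU(s). ` +
    `Provision at least ${recommended} so replacement VMs can be scheduled.`
  );
}
```

With the values from the reproduction steps (2 nodes, 2 allocatable vGPUs) this sketch returns a message, while 2 nodes with 3 allocatable vGPUs returns `null`.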

gaktive commented 1 month ago

This might be unblocked soon as Harvester QA works on this. @ibrokethecloud can provide an environment upon request to help out so someone please check ASAP.

gaktive commented 1 month ago

We got word from Harvester that this is OK to push to the next release. @rebeccazzzz will keep UI posted on when Harvester will do their part to unblock us.