Azure / aksArc


Important: GPU-enabled node pools in the October 2022 preview update #272

Open baziwane opened 1 year ago

Customers who deploy the October 2022 GPU preview update will find that GPU-enabled node pools fail to start after a workload cluster is created.

Symptoms

Root cause

Workaround

  1. SSH into each worker node.
  2. Run `sudo su` to switch to the root user.
  3. Run `tdnf install nvidia-container-toolkit-base`.
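
For clusters with several worker nodes, the three steps above can be scripted. This is a minimal sketch, not an official tool. The function names `remote_cmd` and `apply_workaround` are hypothetical, and the SSH user `clouduser` and the key path are assumptions that you should replace with the values for your deployment. It also assumes the SSH user can run `sudo` without a password, and it uses `sudo` together with `tdnf`'s `-y` flag in place of the interactive `sudo su` session.

```shell
#!/bin/sh
# Sketch only: applies the workaround from this issue to a list of worker nodes.
set -eu

# Assumptions: adjust both values to match your environment.
SSH_USER="${SSH_USER:-clouduser}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_rsa}"

# Command run on each node (step 3, prefixed with sudo instead of step 2's "sudo su").
# -y answers tdnf's confirmation prompt so the command can run non-interactively.
remote_cmd() {
  echo "sudo tdnf install -y nvidia-container-toolkit-base"
}

# Usage: apply_workaround <node-ip> [<node-ip> ...]
# Set DRY_RUN=1 to print the ssh commands instead of running them.
apply_workaround() {
  for node in "$@"; do
    echo "Applying workaround on ${node}"
    ${DRY_RUN:+echo} ssh -i "$SSH_KEY" "${SSH_USER}@${node}" "$(remote_cmd)"
  done
}
```

To preview the commands before touching any node, save the block to a file, source it, and run `DRY_RUN=1 apply_workaround 10.0.0.11 10.0.0.12` (the IP addresses are placeholders). Drop `DRY_RUN=1` to apply the workaround.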

Recommendations

  1. Install the September 2022 update if you want to run GPU-enabled node pools.
  2. If you are already on the September 2022 GPU preview or earlier, we recommend delaying the upgrade until a fix is available in an upcoming release.
  3. Use the workaround above if you are already running the October 2022 GPU preview update.