# Welcome to the Azure Kubernetes Service on Azure Stack HCI repo

This is where the AKS-HCI team will track features and issues with AKS-HCI. We will monitor this repo in order to engage with our community and discuss questions, customer scenarios, or feature requests. Check out our projects tab to see the roadmap for AKS-HCI!
MIT License
[BUG] AksHci upgrade hangs when the upgrade is initiated after adding a physical node to the setup #288
Describe the bug
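The AksHci upgrade hangs if it is initiated after adding a new physical node to an existing AksHci setup. The hang typically looks as shown in the screenshot below.

![upgrade_hang](https://user-images.githubusercontent.com/176106/217862911-8aa2334a-1574-4e60-b43e-1d4e28afab58.png)

The CSI controller pod logs show the following error.

![csi_logs](https://user-images.githubusercontent.com/176106/217861842-d4ab3cfa-a423-4271-9856-24409f494082.png)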
To Reproduce
Steps to reproduce the behavior:
1. Install AksHci.
2. Add a new physical node.
3. Perform the AksHci upgrade with `Update-AksHci`. The upgrade hangs during this step.
Expected behavior
Ideally, the upgrade should complete without any issues.
Mitigation
1. Drain the node using the failover cluster UI. Alternatively, run `Suspend-ClusterNode -Name <nodeName> -Drain` to drain the node.
2. Run `Remove-AksHciNode -nodeName <nodeName>` to remove the machine from the AksHci setup.
3. Run `Remove-ClusterNode -Name <nodeName>` to remove the machine from the failover cluster.
4. Run `Update-AksHci` to trigger the upgrade.

Note: The node can also be removed while the upgrade is hanging. The upgrade will then proceed.
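
The mitigation steps above can be strung together as a short script. This is only a sketch using the four commands named in this issue: the `$nodeName` parameter and the `$ErrorActionPreference` setting are assumptions added for illustration, and it assumes an elevated PowerShell session on a cluster node where the FailoverClusters and AksHci modules are available.

```powershell
# Sketch of the mitigation steps from this issue, not an official script.
# $nodeName is a placeholder for the physical node that was added before the upgrade.
param(
    [Parameter(Mandatory = $true)]
    [string]$nodeName
)

# Assumption: stop at the first failing step instead of continuing.
$ErrorActionPreference = 'Stop'

# 1. Drain the node so its workloads move to the other cluster nodes.
Suspend-ClusterNode -Name $nodeName -Drain

# 2. Remove the machine from the AksHci setup.
Remove-AksHciNode -nodeName $nodeName

# 3. Remove the machine from the failover cluster.
Remove-ClusterNode -Name $nodeName

# 4. Trigger the upgrade.
Update-AksHci
```

`Remove-ClusterNode` may ask for confirmation, so the script is best run interactively rather than unattended.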