Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[BUG] Multiple disks are having same h/w path (ID_PATH) #3328

Open trishnendu opened 1 year ago

trishnendu commented 1 year ago

Describe the bug On AKS worker nodes running Ubuntu 18.04, multiple disks have the same hardware path (ID_PATH). This prevents the disks from being uniquely identified.

To Reproduce Steps to reproduce the behavior:

  1. Create AKS cluster with Ubuntu 18.04
  2. Attach a disk to any worker node (either by attaching a PVC from K8s or by attaching a disk directly to the host)
  3. Check the h/w path (/dev/disk/by-path/*)

Expected behavior All disks should have unique hardware paths.

Screenshots Here sdc and sda are getting the same hardware path.

ubuntu@dl-ubuntu:~# ls -lhtr /dev/disk/by-path
total 0
lrwxrwxrwx 1 root root 11 Nov 10 07:37 acpi-VMBUS:00-scsi-0:0:0:0-part15 -> ../../sda15
lrwxrwxrwx 1 root root 10 Nov 10 07:37 acpi-VMBUS:00-scsi-0:0:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 11 Nov 10 07:37 acpi-VMBUS:00-scsi-0:0:0:0-part14 -> ../../sda14
lrwxrwxrwx 1 root root 9 Nov 10 07:37 acpi-VMBUS:00-scsi-0:0:0:1 -> ../../sdb
lrwxrwxrwx 1 root root 9 Nov 10 07:39 acpi-VMBUS:00-scsi-0:0:0:0 -> ../../sdc

If we check the ID_PATH of the corresponding udev entries, they are the same:

ubuntu@dl-ubuntu:/$ udevadm info --name=/dev/sda | grep ID_PATH
E: ID_PATH=acpi-VMBUS:00-scsi-0:0:0:0
E: ID_PATH_TAG=acpi-VMBUS_00-scsi-0_0_0_0
ubuntu@dl-ubuntu:/$ udevadm info --name=/dev/sdc | grep ID_PATH
E: ID_PATH=acpi-VMBUS:00-scsi-0:0:0:0
E: ID_PATH_TAG=acpi-VMBUS_00-scsi-0_0_0_0
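
The manual check above can be automated. The following is a minimal sketch, not part of the original report: it parses the ID_PATH property out of `udevadm info` output and groups devices that share a value. The function names are hypothetical, and `query_id_path` assumes `udevadm` is on the PATH of the node.

```python
import subprocess
from collections import defaultdict


def parse_id_path(udevadm_output):
    """Return the ID_PATH value from `udevadm info` output, or None if absent."""
    prefix = "E: ID_PATH="  # does not match "E: ID_PATH_TAG=" because of the "="
    for line in udevadm_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def find_duplicate_id_paths(id_paths):
    """Given {device: ID_PATH}, return {ID_PATH: [devices]} for shared paths only."""
    groups = defaultdict(list)
    for device, id_path in sorted(id_paths.items()):
        if id_path is not None:
            groups[id_path].append(device)
    return {path: devs for path, devs in groups.items() if len(devs) > 1}


def query_id_path(device):
    """Run `udevadm info --name=<device>` (as in the report) and parse ID_PATH."""
    result = subprocess.run(
        ["udevadm", "info", f"--name={device}"],
        capture_output=True, text=True, check=True,
    )
    return parse_id_path(result.stdout)
```

On an affected node, something like `find_duplicate_id_paths({d: query_id_path(d) for d in ["/dev/sda", "/dev/sdb", "/dev/sdc"]})` would be expected to report /dev/sda and /dev/sdc under the same path, matching the output shown above.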

Additional context From our analysis, we found that AKS Ubuntu 18.04 is still using systemd v237. This version has a known bug that causes this behavior, and the bug was fixed in a later version of systemd. We would like an updated version of systemd in the Ubuntu VM image that is used for AKS version 1.23.

andyzhangx commented 1 year ago

@trishnendu we are leveraging /dev/disk/azure/scsi1 and the LUN number to differentiate multiple disks, e.g. https://github.com/andyzhangx/demo/blob/master/linux/azuredisk/azuredisk-attachment-debugging.md#3-log-on-agent-node-and-check-device-info. That's how the current Azure disk driver works; could you use the same approach?
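
For illustration, the LUN-based lookup described above could be sketched as follows. This is an assumption-laden example, not the Azure disk driver's actual code: it assumes the symlinks under /dev/disk/azure/scsi1 are named `lun<N>` for whole disks (with partition links such as `lun0-part1` alongside them), and the function name `map_luns_to_devices` is hypothetical.

```python
import os
import re

# Directory named in the comment above; the lun<N> naming is an assumption.
AZURE_SCSI_DIR = "/dev/disk/azure/scsi1"


def map_luns_to_devices(base_dir=AZURE_SCSI_DIR):
    """Return {lun_number: resolved_device_path} for whole-disk lun<N> symlinks."""
    mapping = {}
    if not os.path.isdir(base_dir):
        return mapping  # directory absent, e.g. no data disks attached
    for name in os.listdir(base_dir):
        # fullmatch skips partition links such as "lun0-part1"
        match = re.fullmatch(r"lun(\d+)", name)
        if match:
            link = os.path.join(base_dir, name)
            mapping[int(match.group(1))] = os.path.realpath(link)
    return mapping
```

Because the LUN is assigned when the disk is attached, keying on it sidesteps the ID_PATH collision entirely, at the cost of being specific to Azure.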

trishnendu commented 1 year ago

@andyzhangx thanks for the quick response! Alongside Ubuntu, we support multiple other Linux distros as well, and we are not seeing this issue on those platforms, which ship a higher systemd package version. Since it is a known bug in a particular systemd/udev version and a fix is already available (https://github.com/systemd/systemd/pull/8509), we would prefer to get the corresponding package updated instead of adding special handling for Azure Ubuntu 18.04 only.

andyzhangx commented 1 year ago

@trishnendu since AKS already supports 1.25 with Ubuntu 22.04, do you know whether 22.04 has fixed that issue?

trishnendu commented 1 year ago

@andyzhangx yes, all systemd packages > v239 have the fix, so h/w paths are created correctly on Ubuntu 22.04 (which runs systemd v249 out of the box). But moving to the latest Ubuntu version is not feasible for us.

ghost commented 1 year ago

Thanks for reaching out. I'm closing this issue as it was marked with "Answer Provided" and it hasn't had activity for 2 days.

trishnendu commented 1 year ago

@andyzhangx, can we please reopen this issue? We want to track this for the 18.04 version.

ghost commented 1 year ago

Action required from @Azure/aks-pm

ghost commented 1 year ago

Issue needing attention of @Azure/aks-leads

justindavies commented 1 year ago

@andyzhangx - do we know if Canonical is going to fix this in 18.04?

andyzhangx commented 1 year ago

> @andyzhangx - do we know if Canonical is going to fix this in 18.04?

@alexeldeib should be more familiar with porting this fix. FYI, 18.04 is EOL in April this year, so workloads should be moved to AKS 1.25 with Ubuntu 22.04.

andyzhangx commented 1 year ago

btw, the current 18.04 image is already on kernel 5.4.0-1101-azure, but it's still on systemd 237; that fix has still not been ported.

# systemctl --version
systemd 237
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid
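
A version check like the one above can be scripted. The sketch below is illustrative only: it parses the first line of `systemctl --version` output and applies the "> v239" threshold stated earlier in this thread. The function names are hypothetical, and the threshold is the thread's claim rather than something verified here; a distro could also backport the fix without changing the version number, so a passing or failing result is only a hint.

```python
import re


def parse_systemd_version(version_output):
    """Extract the integer version from the first line of `systemctl --version`."""
    match = re.match(r"systemd (\d+)", version_output.strip())
    return int(match.group(1)) if match else None


def likely_has_id_path_fix(version_output, fixed_after=239):
    """True if the version is above the threshold claimed in the thread (> v239)."""
    version = parse_systemd_version(version_output)
    return version is not None and version > fixed_after
```

Fed the output shown above (systemd 237), `likely_has_id_path_fix` returns False, which is consistent with the duplicate ID_PATH values still being observed on 18.04.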