GoogleCloudPlatform / guest-configs

Apache License 2.0
31 stars 41 forks source link

Fix NVMe local SSD PROGRAM REGEX #30

Closed herdlevar closed 3 years ago

herdlevar commented 3 years ago

GCE supports up to 24 NVMe local SSDs, but the regex in the PROGRAM field only looks for the last digit of the given string causing issues when there are >= 10 local SSDs. Changed REGEX to get the last number of the string instead to support the up to 24 local SSDs.

google-oss-robot commented 3 years ago

Hi @herdlevar. Thanks for your PR.

I'm waiting for a GoogleCloudPlatform member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
herdlevar commented 3 years ago

/ok-to-test

herdlevar commented 3 years ago

/assign @hopkiw

hopkiw commented 3 years ago

thanks @herdlevar this looks correct, only concern is about the unquoted use of single brackets. can you provide some testing data showing this working in practice?

herdlevar commented 3 years ago

Here are logs from two different instances.

Instance-1 has the change included in this pull request:

herdlevar@instance-1:~$ udevadm info --name /dev/nvme0n24 --query=all
P: /devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n24
N: nvme0n24
L: 0
S: disk/by-id/nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018
S: disk/by-path/pci-0000:00:04.0-nvme-24
S: disk/by-id/google-local-nvme-ssd-23
S: disk/by-id/nvme-nvme_card_nvme_card
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n24
E: DEVNAME=/dev/nvme0n24
E: DEVTYPE=disk
E: MAJOR=259
E: MINOR=23
E: SUBSYSTEM=block
E: USEC_INITIALIZED=1574748
E: ID_SERIAL_SHORT=local-nvme-ssd-23
E: ID_WWN=nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018
E: ID_MODEL=nvme_card
E: ID_REVISION=2
E: ID_SERIAL=Google_EphemeralDisk_local-nvme-ssd-23
E: ID_PATH=pci-0000:00:04.0-nvme-24
E: ID_PATH_TAG=pci-0000_00_04_0-nvme-24
E: DEVLINKS=/dev/disk/by-id/nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018 /dev/disk/by-path/pci-0000:00:04.0-nvme-24 /dev/disk/by-id/google-local-nv
me-ssd-23 /dev/disk/by-id/nvme-nvme_card_nvme_card
E: TAGS=:systemd:

You can see that the 24th namespace maps appropriately to disk/by-id/google-local-nvme-ssd-23 (0 based)

Here is instance-1's disks by id:

herdlevar@instance-1:~$ ls /dev/disk/by-id
google-instance-1         google-local-nvme-ssd-22                                       nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000b
google-instance-1-part1   google-local-nvme-ssd-23                                       nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000c
google-instance-1-part14  google-local-nvme-ssd-3                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000d
google-instance-1-part15  google-local-nvme-ssd-4                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000e
google-local-nvme-ssd-0   google-local-nvme-ssd-5                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000f
google-local-nvme-ssd-1   google-local-nvme-ssd-6                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000010
google-local-nvme-ssd-10  google-local-nvme-ssd-7                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000011
google-local-nvme-ssd-11  google-local-nvme-ssd-8                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000012
google-local-nvme-ssd-12  google-local-nvme-ssd-9                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000013
google-local-nvme-ssd-13  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000001  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000014
google-local-nvme-ssd-14  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000002  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000015
google-local-nvme-ssd-15  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000003  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000016
google-local-nvme-ssd-16  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000004  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000017
google-local-nvme-ssd-17  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000005  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018
google-local-nvme-ssd-18  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000006  nvme-nvme_card_nvme_card
google-local-nvme-ssd-19  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000007  scsi-0Google_PersistentDisk_instance-1
google-local-nvme-ssd-2   nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000008  scsi-0Google_PersistentDisk_instance-1-part1
google-local-nvme-ssd-20  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000009  scsi-0Google_PersistentDisk_instance-1-part14
google-local-nvme-ssd-21  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000a  scsi-0Google_PersistentDisk_instance-1-part15

And here is instance-2 which has the currently deployed rules:

herdlevar@instance-2:~$ udevadm info --name /dev/nvme0n24 --query=all
P: /devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n24
N: nvme0n24
L: 0
S: disk/by-id/nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018
S: disk/by-path/pci-0000:00:04.0-nvme-24
S: disk/by-id/google-local-nvme-ssd-1
S: disk/by-id/nvme-nvme_card_nvme_card
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n24
E: DEVNAME=/dev/nvme0n24
E: DEVTYPE=disk
E: MAJOR=259
E: MINOR=23
E: SUBSYSTEM=block
E: USEC_INITIALIZED=1506290
E: ID_SERIAL_SHORT=local-nvme-ssd-1
E: ID_WWN=nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018
E: ID_MODEL=nvme_card
E: ID_REVISION=2
E: ID_SERIAL=Google_EphemeralDisk_local-nvme-ssd-1
E: ID_PATH=pci-0000:00:04.0-nvme-24
E: ID_PATH_TAG=pci-0000_00_04_0-nvme-24
E: DEVLINKS=/dev/disk/by-id/nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018 /dev/disk/by-path/pci-0000:00:04.0-nvme-24 /dev/disk/by-id/google-local-nv
me-ssd-1 /dev/disk/by-id/nvme-nvme_card_nvme_card
E: TAGS=:systemd:

you can see that namespace 24 maps incorrectly to /disk/by-id/google-local-nvme-ssd-1

Here is instance-2's disk by ids:

google-instance-2                                              nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000009
google-instance-2-part1                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000a
google-instance-2-part14                                       nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000b
google-instance-2-part15                                       nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000c
google-local-nvme-ssd-0                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000d
google-local-nvme-ssd-1                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000e
google-local-nvme-ssd-2                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-0000000f
google-local-nvme-ssd-3                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000010
google-local-nvme-ssd-4                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000011
google-local-nvme-ssd-5                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000012
google-local-nvme-ssd-6                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000013
google-local-nvme-ssd-7                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000014
google-local-nvme-ssd-8                                        nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000015
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000001  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000016
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000002  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000017
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000003  nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000018
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000004  nvme-nvme_card_nvme_card
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000005  scsi-0Google_PersistentDisk_instance-2
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000006  scsi-0Google_PersistentDisk_instance-2-part1
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000007  scsi-0Google_PersistentDisk_instance-2-part14
nvme-nvme.1ae0-6e766d655f63617264-6e766d655f63617264-00000008  scsi-0Google_PersistentDisk_instance-2-part15

Not all 24 local SSDs appear here.

Instance-1's 24 local SSDs Screen Shot 2021-10-19 at 2 30 34 PM

Instance-2's 24 local SSDs Screen Shot 2021-10-19 at 2 30 01 PM

Screenshot of the 24 local SSDs for each instance attached.

herdlevar commented 3 years ago

Gentle ping =)

hopkiw commented 3 years ago

/lgtm

hopkiw commented 3 years ago

/approve

google-oss-robot commented 3 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: herdlevar, hopkiw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/GoogleCloudPlatform/guest-configs/blob/master/OWNERS)~~ [hopkiw] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment