kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

POD fails to attach correct sriov device on ungraceful node reboot #107928

Open rthakur-est opened 2 years ago

rthakur-est commented 2 years ago

What happened?

A pod with an SR-IOV NIC device attached fails to attach the correct SR-IOV device when the node is hard rebooted after volumes have been attached to it. The node is a VM in an OpenStack cloud provider environment, and the PCI address of the SR-IOV VF changes on a hard node reboot when additional volumes are attached to the VM.

The same scenario works with a graceful node reboot.

This is seen in the logs:

Warning FailedCreatePodSandBox 91s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a74bd117e5aba36e9edfed421360e78cc68799886d1dfd32f0888567bd611774": [ejiazeh-pcg/eric-pc-up-data-plane-5b6b49bd86-6457v:eric-pc-up-data-plane-net0]: error adding container to network "eric-pc-up-data-plane-net0": error with host device: lstat /sys/bus/pci/devices/0000:00:15.0: no such file or directory
Normal AddedInterface 79s multus Add eth0 [192.168.242.212/32] from k8s-pod-network
Normal AddedInterface 64s multus Add eth0 [192.168.242.208/32] from k8s-pod-network
Warning FailedCreatePodSandBox 63s (x2 over 78s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b5e0dd33781642c85e343b14f9209a91980ebd99ed77baa595dffaf9c60ef62b": [ejiazeh-pcg/eric-pc-up-data-plane-5b6b49bd86-6457v:eric-pc-up-data-plane-net0]: error adding container to network "eric-pc-up-data-plane-net0": error with host device: lstat /sys/bus/pci/devices/0000:00:15.0: no such file or directory
Normal SandboxChanged 50s (x14 over 3m41s) kubelet Pod sandbox changed, it will be killed and re-created.
Normal AddedInterface 49s multus Add eth0 [192.168.242.196/32] from k8s-pod-network
Normal AddedInterface 34s multus Add eth0 [192.168.242.217/32] from k8s-pod-network
Normal AddedInterface 21s multus Add eth0 [192.168.242.230/32] from k8s-pod-network

PCI addresses on node before reboot:

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:06.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:07.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:08.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:09.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0a.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0b.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0c.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0d.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:11.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:12.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:13.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:14.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:15.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:16.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:17.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon

PCI addresses on node after reboot:

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:06.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:07.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:08.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:09.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0a.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0b.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:0c.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:0d.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:0e.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:0f.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:10.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:11.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:12.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon
00:13.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:14.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:15.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:16.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:17.0 SCSI storage controller: Red Hat, Inc. Virtio block device
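To make the renumbering easier to spot, the two listings can be compared mechanically. The following is a minimal sketch, not part of the original report: it assumes plain `lspci` output in the short `bus:device.function` form shown above, and the helper names `parse_lspci` and `changed_slots` are hypothetical.

```python
import re

# Matches short-form lspci lines such as "00:11.0 Ethernet controller: ...".
LSPCI_LINE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s+(.*)$")


def parse_lspci(text):
    """Map each PCI address to its description from plain lspci output."""
    devices = {}
    for line in text.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match:
            devices[match.group(1)] = match.group(2)
    return devices


def changed_slots(before, after):
    """Return {address: (old, new)} for every address whose occupant differs.

    A value of None means the address was empty in that listing.
    """
    result = {}
    for addr in sorted(set(before) | set(after)):
        old, new = before.get(addr), after.get(addr)
        if old != new:
            result[addr] = (old, new)
    return result


if __name__ == "__main__":
    # Two-line excerpts taken from the listings in this issue.
    before = parse_lspci(
        "00:0c.0 SCSI storage controller: Red Hat, Inc. Virtio block device\n"
        "00:15.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]\n"
    )
    after = parse_lspci(
        "00:0c.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]\n"
        "00:15.0 SCSI storage controller: Red Hat, Inc. Virtio block device\n"
    )
    for addr, (old, new) in changed_slots(before, after).items():
        print(f"{addr}: {old!r} -> {new!r}")
```

Run against the full listings, this would flag every slot from 00:0c.0 upward, which matches the observation that the VFs moved down and the newly attached block devices took the higher addresses.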

What did you expect to happen?

The pod should attach the correct SR-IOV devices after an ungraceful node reboot. Currently, it is assumed that the PCI address of a device won't change.
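One way to avoid depending on a fixed address is to rediscover candidate devices by their vendor and device IDs after a reboot. The sketch below only illustrates that idea and is not how kubelet or any SR-IOV plugin is known to behave. It reads the standard `vendor` and `device` attributes under `/sys/bus/pci/devices`; the function names are hypothetical, and the IDs `0x15b3`/`0x101a` are assumed to correspond to the ConnectX-5 Ex Virtual Function and should be checked with `lspci -nn` on the node.

```python
from pathlib import Path


def read_hex(path):
    """Read a sysfs attribute such as 'vendor' (content like '0x15b3') as an int."""
    return int(Path(path).read_text().strip(), 16)


def find_devices(vendor_id, device_id, sysfs_root="/sys/bus/pci/devices"):
    """Return sorted PCI addresses whose vendor and device IDs both match."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    matches = []
    for dev in root.iterdir():
        try:
            if (read_hex(dev / "vendor") == vendor_id
                    and read_hex(dev / "device") == device_id):
                matches.append(dev.name)
        except (OSError, ValueError):
            # Entry without readable ID attributes: skip it.
            continue
    return sorted(matches)


if __name__ == "__main__":
    # Assumed IDs for the Mellanox VF; verify on the node before relying on them.
    print(find_devices(0x15B3, 0x101A))
```

Matching on IDs alone cannot tell several identical VFs apart, so a real fix would still need some stable per-device identity to map each pod back to the VF it was originally given.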

How can we reproduce it (as minimally and precisely as possible)?

Create an OpenStack VM with volumes and SR-IOV VFs attached to it. Create a pod with an SR-IOV device attached. Attach additional volumes to the VM and do a hard node reboot. The pod comes up with the same PCI address as before, but the PCI address of the device has changed.

Logs

container-inspect-output.txt failing-pod-describe.txt kubelet-logs.txt pci-after-reboot.txt pci-before-reboot.txt pod manifest.yml.txt

Kubernetes version

```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"49499222b0eb0349359881bea01d8d5bd78bf444", GitTreeState:"clean", BuildDate:"2021-12-14T12:50:25Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"49499222b0eb0349359881bea01d8d5bd78bf444", GitTreeState:"clean", BuildDate:"2021-12-14T12:41:40Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
```

Cloud provider

openstack

OS version

```console
# On Linux:
$ cat /etc/os-release
NAME="SLES"
VERSION="15-SP2"
VERSION_ID="15.2"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP2"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp2"
$ uname -a
# paste output here
```

Install tools

Container runtime (CRI) and version (if applicable)

containerd

Related plugins (CNI, CSI, ...) and versions (if applicable)

rthakur-est commented 2 years ago

/sig node

ehashman commented 2 years ago

This bug doesn't seem to have enough details to reproduce or further investigate. You must include your container runtime, full kubelet logs, and any relevant Kubernetes manifests along with clear steps to reproduce so we can help with your issue.

Once more details are provided, the bug will be accepted.

/triage needs-information

rthakur-est commented 2 years ago

I have attached the logs for this issue.

Before node reboot, this was the device address attached to the pod:
00:07.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)

After node reboot, the same address is assigned to a block device:
00:07.0 SCSI storage controller: Red Hat, Inc. Virtio block device

But kubelet continues to attach 0000:00:07.0 as the device after reboot, even though the PCI assignment has changed.
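A stale address like this can be detected before it is handed to the CNI plugin by checking what currently sits at that address. The sketch below is an illustration under stated assumptions, not existing kubelet behavior: it reads the standard sysfs `class` attribute, whose top byte is the PCI base class (0x02 for network controllers, 0x01 for mass storage). The function name `is_network_device` is hypothetical.

```python
from pathlib import Path

PCI_BASE_CLASS_NETWORK = 0x02  # PCI base class code for network controllers


def is_network_device(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return True if the device at pci_address exists and is a network controller.

    The sysfs 'class' file holds a 24-bit value such as '0x020000';
    the top byte is the base class.
    """
    class_file = Path(sysfs_root) / pci_address / "class"
    try:
        class_code = int(class_file.read_text().strip(), 16)
    except (OSError, ValueError):
        # Missing device or unreadable attribute: treat as not usable.
        return False
    return (class_code >> 16) == PCI_BASE_CLASS_NETWORK


if __name__ == "__main__":
    # Address taken from the comment above; the result depends on the node.
    print(is_network_device("0000:00:07.0"))
```

In the situation described here the check would return False for 0000:00:07.0 after the reboot, because the address now belongs to a Virtio block device.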

SergeyKanzhelev commented 2 years ago

/remove-triage needs-information

SergeyKanzhelev commented 2 years ago

/triage accepted
/priority important-longterm

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

rthakur-est commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

/lifecycle stale

vaibhav2107 commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

/lifecycle stale

vaibhav2107 commented 1 year ago

/remove-lifecycle stale

swatisehgal commented 1 year ago

/cc

k8s-triage-robot commented 10 months ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot commented 7 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten