Closed by jslouisyou 1 month ago
Hi, I could pull the Helm chart like this:
helm pull oci://ghcr.io/k8snetworkplumbingwg/sriov-network-operator-chart --version 1.4.0
and it seems that the image tags are set for all containers.
images:
  operator: ghcr.io/k8snetworkplumbingwg/sriov-network-operator:v1.4.0
  sriovConfigDaemon: ghcr.io/k8snetworkplumbingwg/sriov-network-operator-config-daemon:v1.4.0
  sriovCni: ghcr.io/k8snetworkplumbingwg/sriov-cni:v2.8.1
  ibSriovCni: ghcr.io/k8snetworkplumbingwg/ib-sriov-cni:v1.1.1
  ovsCni: ghcr.io/k8snetworkplumbingwg/ovs-cni-plugin:v0.34.2
  rdmaCni: ghcr.io/k8snetworkplumbingwg/rdma-cni:v1.2.0
  sriovDevicePlugin: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.7.0
  resourcesInjector: ghcr.io/k8snetworkplumbingwg/network-resources-injector:v1.6.0
  webhook: ghcr.io/k8snetworkplumbingwg/sriov-network-operator-webhook:v1.4.0
  metricsExporter: ghcr.io/k8snetworkplumbingwg/sriov-network-metrics-exporter:v1.1.0
  metricsExporterKubeRbacProxy: gcr.io/kubebuilder/kube-rbac-proxy:v0.15.0
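For reference, the same values can also be checked without extracting the chart archive (assuming a Helm version with OCI support, i.e. 3.8 or newer):
helm show values oci://ghcr.io/k8snetworkplumbingwg/sriov-network-operator-chart --version 1.4.0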
I think I can use these images for v1.4.0. Could you please confirm that the images above are set up correctly?
Hi @jslouisyou, I think the point here is that we can no longer deploy a tag by checking out the source code.
Since the Helm package is published when a release is tagged, I think having the helm pull ... command is enough for the job.
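For example, something along these lines should be enough to deploy a tagged release (just a sketch; the release name and namespace are your choice):
helm install sriov-network-operator oci://ghcr.io/k8snetworkplumbingwg/sriov-network-operator-chart --version 1.4.0 -n sriov-network-operator --create-namespace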
Images look correct to me.
Are you experiencing any other issues during the deploy?
Hi @zeeke, I'm facing an issue while creating VFs with v1.4.0: the IB devices disappear at the end of VF creation (it works in v1.3.0, btw). First of all, I think the comment below is quite different from this thread, so please let me know if I should create another issue.
I used the same configuration (e.g. the same SriovNetworkNodePolicy) for creating VFs. Here's the SriovNetworkNodePolicy that I used:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-gpu2-ib2
  namespace: sriov-network-operator
spec:
  isRdma: true
  linkType: ib
  nicSelector:
    deviceID: "1021"
    pfNames:
      - ibp157s0
    vendor: 15b3
  nodeSelector:
    node-role.kubernetes.io/gpu: ""
  numVfs: 8
  priority: 10
  resourceName: gpu2_mlnx_ib2
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-gpu2-ib3
  namespace: sriov-network-operator
spec:
  isRdma: true
  linkType: ib
  nicSelector:
    deviceID: "1021"
    pfNames:
      - ibp211s0
    vendor: 15b3
  nodeSelector:
    node-role.kubernetes.io/gpu: ""
  numVfs: 8
  priority: 10
  resourceName: gpu2_mlnx_ib3
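For reference, the per-node sync status can be inspected through the operator's SriovNetworkNodeState objects after applying these policies, roughly like this (assuming the default operator namespace):
kubectl -n sriov-network-operator get sriovnetworknodestates.sriovnetwork.openshift.io
kubectl -n sriov-network-operator get sriovnetworknodestates <node-name> -o yaml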
And I'm using an H100 node with ConnectX-7 IB adapters:
$ mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
ConnectX7(rev:0) /dev/mst/mt4129_pciconf5 e5:00.0 mlx5_5 net-ibp229s0 1
ConnectX7(rev:0) /dev/mst/mt4129_pciconf4 d3:00.0 mlx5_4 net-ibp211s0 1
ConnectX7(rev:0) /dev/mst/mt4129_pciconf3 c1:00.0 mlx5_3 net-ibp193s0 1
ConnectX7(rev:0) /dev/mst/mt4129_pciconf2 9d:00.0 mlx5_2 net-ibp157s0 1
ConnectX7(rev:0) /dev/mst/mt4129_pciconf1 54:00.0 mlx5_1 net-ibp84s0 0
ConnectX7(rev:0) /dev/mst/mt4129_pciconf0 41:00.0 mlx5_0 net-ibp65s0 0
$ lspci -s 41:00.0 -vvn
41:00.0 0207: 15b3:1021
Subsystem: 15b3:0041
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 18
NUMA node: 0
Region 0: Memory at 23e044000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at <ignored> [disabled]
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 25.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM not supported
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 32GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit+ 64bit+ 128bitCAS+
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-32GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [48] Vital Product Data
Product Name: Nvidia ConnectX-7 Single Port Infiniband NDR OSFP Adapter
Read-only fields:
[PN] Part number: 0RYMTY
[EC] Engineering changes: A02
[MN] Manufacture ID: 1028
[SN] Serial number: IN0RYMTYJBNM43BRJ4KF
[VA] Vendor specific: DSV1028VPDR.VER2.1
[VB] Vendor specific: FFV28.39.10.02
[VC] Vendor specific: NPY1
[VD] Vendor specific: PMTD
[VE] Vendor specific: NMVNvidia, Inc.
[VH] Vendor specific: L1D0
[VU] Vendor specific: IN0RYMTYJBNM43BRJ4KFMLNXS0D0F0
[RV] Reserved: checksum good, 0 byte(s) reserved
End
Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00003000
Capabilities: [c0] Vendor Specific Information: Len=18 <?>
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP+ FCP+ CmpltTO+ CmpltAbrt+ UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
AERCap: First Error Pointer: 04, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [1c0 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [320 v1] Lane Margining at the Receiver <?>
Capabilities: [370 v1] Physical Layer 16.0 GT/s <?>
Capabilities: [3b0 v1] Extended Capability ID 0x2a
Capabilities: [420 v1] Data Link Feature <?>
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core
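For completeness, the SR-IOV VF counters on a PF can also be read straight from sysfs; for the 41:00.0 device shown above, for example:
cat /sys/bus/pci/devices/0000:41:00.0/sriov_totalvfs
cat /sys/bus/pci/devices/0000:41:00.0/sriov_numvfs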
And I pulled the v1.3.0 and v1.4.0 Helm charts from oci://ghcr.io/k8snetworkplumbingwg/sriov-network-operator-chart, and the image tags are different:
v1.3.0
images:
  operator: ghcr.io/k8snetworkplumbingwg/sriov-network-operator:v1.3.0
  sriovConfigDaemon: ghcr.io/k8snetworkplumbingwg/sriov-network-operator-config-daemon:v1.3.0
  sriovCni: ghcr.io/k8snetworkplumbingwg/sriov-cni:v2.8.0
  ibSriovCni: ghcr.io/k8snetworkplumbingwg/ib-sriov-cni:v1.1.1
  ovsCni: ghcr.io/k8snetworkplumbingwg/ovs-cni-plugin:v0.34.0
  sriovDevicePlugin: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.7.0
  resourcesInjector: ghcr.io/k8snetworkplumbingwg/network-resources-injector:v1.6.0
  webhook: ghcr.io/k8snetworkplumbingwg/sriov-network-operator-webhook:v1.3.0
v1.4.0
images:
  operator: ghcr.io/k8snetworkplumbingwg/sriov-network-operator:v1.4.0
  sriovConfigDaemon: ghcr.io/k8snetworkplumbingwg/sriov-network-operator-config-daemon:v1.4.0
  sriovCni: ghcr.io/k8snetworkplumbingwg/sriov-cni:v2.8.1
  ibSriovCni: ghcr.io/k8snetworkplumbingwg/ib-sriov-cni:v1.1.1
  ovsCni: ghcr.io/k8snetworkplumbingwg/ovs-cni-plugin:v0.34.2
  rdmaCni: ghcr.io/k8snetworkplumbingwg/rdma-cni:v1.2.0
  sriovDevicePlugin: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.7.0
  resourcesInjector: ghcr.io/k8snetworkplumbingwg/network-resources-injector:v1.6.0
  webhook: ghcr.io/k8snetworkplumbingwg/sriov-network-operator-webhook:v1.4.0
  metricsExporter: ghcr.io/k8snetworkplumbingwg/sriov-network-metrics-exporter:v1.1.0
  metricsExporterKubeRbacProxy: gcr.io/kubebuilder/kube-rbac-proxy:v0.15.0
As you know, sriov-device-plugin pods are created when a SriovNetworkNodePolicy is deployed. After that, my H100 nodes' state changed from sriovnetwork.openshift.io/state: Idle to sriovnetwork.openshift.io/state: Reboot_Required, and the nodes rebooted after some time.
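That annotation can be read straight from the Node object, e.g.:
kubectl get node <node-name> -o jsonpath='{.metadata.annotations.sriovnetwork\.openshift\.io/state}'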
But in v1.4.0, it seems that the VFs were created, but eventually they were no longer visible and even the PF disappeared. Here are the logs from dmesg:
[ 115.692158] pci 0000:41:00.1: [15b3:101e] type 00 class 0x020700
[ 115.692321] pci 0000:41:00.1: enabling Extended Tags
[ 115.694112] mlx5_core 0000:41:00.1: enabling device (0000 -> 0002)
[ 115.694789] mlx5_core 0000:41:00.1: firmware version: 28.39.1002
[ 115.867939] mlx5_core 0000:41:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 115.867943] mlx5_core 0000:41:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 115.892812] pci 0000:41:00.2: [15b3:101e] type 00 class 0x020700
[ 115.892967] pci 0000:41:00.2: enabling Extended Tags
[ 115.894706] mlx5_core 0000:41:00.2: enabling device (0000 -> 0002)
[ 115.895344] mlx5_core 0000:41:00.2: firmware version: 28.39.1002
[ 115.895423] mlx5_core 0000:41:00.1 ibp65s0v0: renamed from ib0
[ 116.065557] mlx5_core 0000:41:00.2: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 116.065561] mlx5_core 0000:41:00.2: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 116.090478] pci 0000:41:00.3: [15b3:101e] type 00 class 0x020700
[ 116.090634] pci 0000:41:00.3: enabling Extended Tags
[ 116.093559] mlx5_core 0000:41:00.3: enabling device (0000 -> 0002)
[ 116.093993] mlx5_core 0000:41:00.2 ibp65s0v1: renamed from ib0
[ 116.094189] mlx5_core 0000:41:00.3: firmware version: 28.39.1002
[ 116.293582] mlx5_core 0000:41:00.3: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 116.293587] mlx5_core 0000:41:00.3: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 116.318209] pci 0000:41:00.4: [15b3:101e] type 00 class 0x020700
[ 116.318368] pci 0000:41:00.4: enabling Extended Tags
[ 116.320079] mlx5_core 0000:41:00.4: enabling device (0000 -> 0002)
[ 116.320712] mlx5_core 0000:41:00.4: firmware version: 28.39.1002
[ 116.320871] mlx5_core 0000:41:00.3 ibp65s0v2: renamed from ib0
.....
[ 446.036867] mlx5_core 0000:41:01.0 ibp65s0v7: renamed from ib0
[ 446.464555] mlx5_core 0000:41:00.0: mlx5_wait_for_pages:898:(pid 6868): Skipping wait for vf pages stage
[ 448.848149] mlx5_core 0000:41:00.0: driver left SR-IOV enabled after remove <----------- weird
[ 449.108562] mlx5_core 0000:41:00.2: poll_health:955:(pid 0): Fatal error 3 detected
[ 449.108602] mlx5_core 0000:41:00.4: poll_health:955:(pid 0): Fatal error 3 detected
[ 449.108620] mlx5_core 0000:41:00.2: mlx5_health_try_recover:375:(pid 1478): handling bad device here
[ 449.108627] mlx5_core 0000:41:00.2: mlx5_handle_bad_state:326:(pid 1478): starting teardown
[ 449.108629] mlx5_core 0000:41:00.2: mlx5_error_sw_reset:277:(pid 1478): start
[ 449.108646] mlx5_core 0000:41:00.4: mlx5_health_try_recover:375:(pid 2283): handling bad device here
[ 449.108660] mlx5_core 0000:41:00.4: mlx5_handle_bad_state:326:(pid 2283): starting teardown
[ 449.108661] mlx5_core 0000:41:00.4: mlx5_error_sw_reset:277:(pid 2283): start
[ 449.108672] mlx5_core 0000:41:00.2: mlx5_error_sw_reset:310:(pid 1478): end
[ 449.108694] mlx5_core 0000:41:00.4: mlx5_error_sw_reset:310:(pid 2283): end
[ 449.876577] mlx5_core 0000:41:00.5: poll_health:955:(pid 0): Fatal error 3 detected
[ 449.876642] mlx5_core 0000:41:00.5: mlx5_health_try_recover:375:(pid 1000): handling bad device here
[ 449.876649] mlx5_core 0000:41:00.5: mlx5_handle_bad_state:326:(pid 1000): starting teardown
[ 449.876651] mlx5_core 0000:41:00.5: mlx5_error_sw_reset:277:(pid 1000): start
[ 449.877266] mlx5_core 0000:41:00.5: mlx5_error_sw_reset:310:(pid 1000): end
[ 450.381036] mlx5_core 0000:41:00.2: mlx5_health_try_recover:381:(pid 1478): starting health recovery flow
After then, when I tried to execute mst status -v
then even the node can't find PF itself:
$ mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
ConnectX7(rev:0) /dev/mst/mt4129_pciconf5 e5:00.0 mlx5_5 net-ibp229s0 1
ConnectX7(rev:0) /dev/mst/mt4129_pciconf4 d3:00.0 1 <---- it goes empty
ConnectX7(rev:0) /dev/mst/mt4129_pciconf3 c1:00.0 mlx5_3 net-ibp193s0 1
ConnectX7(rev:0) /dev/mst/mt4129_pciconf2 9d:00.0 1 <---- it goes empty
ConnectX7(rev:0) /dev/mst/mt4129_pciconf1 54:00.0 mlx5_1 net-ibp84s0 0
ConnectX7(rev:0) /dev/mst/mt4129_pciconf0 41:00.0 mlx5_0 net-ibp65s0 0
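If it helps with debugging, the affected PF can be probed on the PCI bus directly and a rescan triggered, e.g. for d3:00.0, which went empty above:
lspci -s d3:00.0
echo 1 | sudo tee /sys/bus/pci/rescan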
Do you know anything about this situation? Any pointers would be very helpful.
Thanks.
Yes, please pack all this information into a new issue. It will help other users find the information more easily.
Thanks @zeeke. I'll wrap this up and raise a new issue then.
Hello,
When I try to deploy the latest version of sriov-network-operator (e.g. v1.4.0), it seems that sriov-network-operator creates the sriov-network-config-daemon Pods accordingly, but I can see that all images within those Pods use the latest tag (the tag is omitted, and I heard that when a tag is omitted, latest is used by default; please let me know if I got that wrong). I can also see that the image tags aren't assigned inside the script:
Helm chart https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/9dbf2b1b5fd1ff7e836aff169b8aabf020a2840e/deployment/sriov-network-operator-chart/values.yaml#L104-L115
Shell script https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/9dbf2b1b5fd1ff7e836aff169b8aabf020a2840e/hack/env.sh#L1-L14
But in v1.2.0, image tags were set: https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/815fd134ba8000756791051fca60179ec66ddb46/hack/env.sh#L1-L20
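As a possible workaround for now (an untested sketch; the key names are taken from the chart's values.yaml linked above), the tags could be pinned explicitly when installing the chart from the source checkout, e.g.:
helm install -n sriov-network-operator --create-namespace sriov-network-operator ./deployment/sriov-network-operator-chart \
  --set images.operator=ghcr.io/k8snetworkplumbingwg/sriov-network-operator:v1.4.0 \
  --set images.sriovConfigDaemon=ghcr.io/k8snetworkplumbingwg/sriov-network-operator-config-daemon:v1.4.0
(and likewise for the remaining images.* keys).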
In this case, is it intended to use the latest image for all containers? If not, could you please provide proper tags for all images? Thanks.