networkservicemesh / deployments-k8s

Apache License 2.0

about ovs-forward #10775

Open 316953425 opened 7 months ago

316953425 commented 7 months ago

hi @glazychev-art I failed to deploy https://github.com/networkservicemesh/deployments-k8s/tree/release/v1.11.1/examples/ovs. My k8s environment has three master nodes. I suspect my configuration file /var/lib/networkservicemesh/smartnic.config has a problem. /var/lib/networkservicemesh/smartnic.config is identical on each master node, and each master node has the same network card configuration.

The content is as follows:

physicalFunctions:
  0000:3d:00.0:               # 1. PCI address of SR-IOV capable NIC that we want to use with Forwarder
    pfKernelDriver: i40e # 2. PF kernel driver
    vfKernelDriver: i40evf # 3. VF kernel driver
    capabilities:             # 4. List of capabilities
      - intel
      - 1G
    serviceDomains:           # 5. List of service domains
      - worker.domain
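The PCI address used as the key under physicalFunctions must match the PF on the node. A minimal sanity-check sketch (the interface name ens5 and the PCI address are taken from this issue; adjust them for your host):

```shell
#!/bin/sh
# Sanity-check sketch: compare the smartnic.config PCI address against
# what the host actually reports. "ens5" and the address below are
# assumptions taken from this issue.
PCI_ADDR="0000:3d:00.0"

# bus-info as reported by ethtool; should equal PCI_ADDR
BUS_INFO=$(ethtool -i ens5 | awk -F': ' '/^bus-info/ {print $2}')
echo "ethtool bus-info: ${BUS_INFO}"

# kernel driver currently bound to the PF; should match pfKernelDriver
echo "bound driver: $(basename "$(readlink "/sys/bus/pci/devices/${PCI_ADDR}/driver")")"
```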
[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# ethtool -i ens5
driver: i40e
version: 2.3.2-k
firmware-version: 4.10 0x80001a63 1.2585.0
expansion-rom-version:
bus-info: 0000:3d:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
[root@CNCP-MS-01 webhook-smartvf]# echo 1 > /sys/class/net/ens5/device/sriov_numvfs
[root@CNCP-MS-01 webhook-smartvf]# ls -l /sys/class/net/ens5/device/virtfn0/driver
lrwxrwxrwx 1 root root 0 Dec 19 15:11 /sys/class/net/ens5/device/virtfn0/driver -> ../../../../../../bus/pci/drivers/i40evf
[root@CNCP-MS-01 webhook-smartvf]# echo 0 > /sys/class/net/ens5/device/sriov_numvfs
[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# lspci | grep net
19:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
3d:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.2 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
5e:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

pod error:

[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# kubectl describe pods forwarder-ovs-dp8qz -n nsm-system
Name:         forwarder-ovs-dp8qz
Namespace:    nsm-system
Priority:     0
Node:         cncp-ms-01/172.16.102.11
Start Time:   Tue, 19 Dec 2023 15:28:28 +0800
Labels:       app=forwarder-ovs
              controller-revision-hash=54bd6fd788
              pod-template-generation=1
              spiffe.io/spiffe-id=true
Annotations:  <none>
Status:       Running
IP:           172.16.102.11
IPs:
  IP:           172.16.102.11
  IP:           2408:8631:c02:ffa2::b
Controlled By:  DaemonSet/forwarder-ovs
Containers:
  forwarder-ovs:
    Container ID:   docker://1166ee7c58184c8c60d69e2631b2dd819bac73b77b49343b80baef61cbe5a0de
    Image:          ghcr.io/networkservicemesh/cmd-forwarder-ovs:v1.11.1
    Image ID:       docker-pullable://ghcr.io/networkservicemesh/cmd-forwarder-ovs@sha256:007ac7b3d9e07b5ab400f338056dfdcd1e65a245725a00aea6549bec8ea26412
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 19 Dec 2023 15:30:07 +0800
      Finished:     Tue, 19 Dec 2023 15:30:08 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 19 Dec 2023 15:29:20 +0800
      Finished:     Tue, 19 Dec 2023 15:29:22 +0800
    Ready:          False
    Restart Count:  4
    Limits:
      memory:  1Gi
    Requests:
      memory:  1Gi
    Environment:
      NSM_NAME:                forwarder-ovs-dp8qz (v1:metadata.name)
      SPIFFE_ENDPOINT_SOCKET:  unix:///run/spire/sockets/agent.sock
      NSM_CONNECT_TO:          unix:///var/lib/networkservicemesh/nsm.io.sock
      NSM_SRIOV_CONFIG_FILE:   /var/lib/networkservicemesh/smartnic.config
      NSM_LOG_LEVEL:           INFO
      NSM_BRIDGE_NAME:         br-nsm
      NSM_TUNNEL_IP:            (v1:status.podIP)
    Mounts:
      /host/dev/vfio from vfio (rw)
      /host/sys/fs/cgroup from cgroup (rw)
      /run/spire/sockets from spire-agent-socket (ro)
      /var/lib/kubelet from kubelet-socket (rw)
      /var/lib/networkservicemesh from nsm (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8qm98 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  spire-agent-socket:
    Type:          HostPath (bare host directory volume)
    Path:          /run/spire/sockets
    HostPathType:  Directory
  nsm:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/networkservicemesh
    HostPathType:  Directory
  kubelet-socket:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet
    HostPathType:  Directory
  cgroup:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/cgroup
    HostPathType:  Directory
  vfio:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/vfio
    HostPathType:  DirectoryOrCreate
  kube-api-access-8qm98:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  108s                default-scheduler  Successfully assigned nsm-system/forwarder-ovs-dp8qz to cncp-ms-01
  Normal   Pulled     10s (x5 over 107s)  kubelet            Container image "ghcr.io/networkservicemesh/cmd-forwarder-ovs:v1.11.1" already present on machine
  Normal   Created    10s (x5 over 106s)  kubelet            Created container forwarder-ovs
  Normal   Started    9s (x5 over 106s)   kubelet            Started container forwarder-ovs
  Warning  BackOff    7s (x7 over 97s)    kubelet            Back-off restarting failed container

Could you tell me how to fix it? Thanks.

glazychev-art commented 7 months ago

@ljkiraly Could you please take a look if you have a chance?

ljkiraly commented 7 months ago

Hi @316953425 , Is that a new configuration? Has that worked before in this or another environment? Could you try without specifying the service domain? Please attach the forwarder logs to the issue. (kubectl logs forwarder-ovs-dp8qz -n nsm-system)

316953425 commented 7 months ago

> Hi @316953425 , Is that a new configuration? Has that worked before in this or another environment? Could you try without specifying the service domain? Please attach the forwarder logs to the issue. (kubectl logs forwarder-ovs-dp8qz -n nsm-system)

hi @glazychev-art @ljkiraly

> Is that a new configuration? Has that worked before in this or another environment?

I am using the configuration of version v1.11.1 (https://github.com/networkservicemesh/deployments-k8s/tree/release/v1.11.1/examples/ovs) without any changes. This is my first installation; I have never installed it before.

> Could you try without specifying the service domain?

My install steps (using the v1.11.1 release):

1. Install spire: kubectl apply -k https://github.com/networkservicemesh/deployments-k8s/examples/spire/single_cluster?ref=v1.11.1
2. kubectl apply -f https://raw.githubusercontent.com/networkservicemesh/deployments-k8s/v1.11.1/examples/spire/single_cluster/clusterspiffeid-template.yaml
3. Install ovs, without specifying the service domain: kubectl apply -k https://github.com/networkservicemesh/deployments-k8s/examples/ovs?ref=v1.11.1

cat /var/lib/networkservicemesh/smartnic.config

physicalFunctions:
  0000:3d:00.0:
    pfKernelDriver: i40e
    vfKernelDriver: i40evf
    capabilities:
      - intel
      - 1G

> Please attach the forwarder logs to the issue. (kubectl logs forwarder-ovs-dp8qz -n nsm-system)

[root@CNCP-MS-01 tmp]# kubectl logs forwarder-ovs-vlbt9 -n nsm-system
I1220 01:28:03.231538   15220 ovs.go:98] Maximum command line arguments set to: 191102
Dec 20 01:28:03.232 [INFO] Setting env variable DLV_LISTEN_FORWARDER to a valid dlv '--listen' value will cause the dlv debugger to execute this binary and listen as directed.
2023/12/20 01:28:03 [INFO]  there are 5 phases which will be executed followed by a success message:
2023/12/20 01:28:03 [INFO]  the phases include:
2023/12/20 01:28:03 [INFO]  1: get config from environment
2023/12/20 01:28:03 [INFO]  2: ensure ovs is running
2023/12/20 01:28:03 [INFO]  3: retrieve spiffe svid
2023/12/20 01:28:03 [INFO]  4: create ovs forwarder network service endpoint
2023/12/20 01:28:03 [INFO]  5: create grpc server and register ovsxconnect
2023/12/20 01:28:03 [INFO]  6: register ovs forwarder network service with the registry
2023/12/20 01:28:03 [INFO]  a final success message with start time duration
2023/12/20 01:28:03 [INFO]  executing phase 1: get config from environment (time since start: 65.658µs)
This application is configured via the environment. The following environment
variables can be used:

KEY                              TYPE                                           DEFAULT                                                                               REQUIRED    DESCRIPTION
NSM_NAME                         String                                         forwarder                                                                                         Name of Endpoint
NSM_LABELS                       Comma-separated list of String:String pairs    p2p:true                                                                                          Labels related to this forwarder-vpp instance
NSM_NSNAME                       String                                         forwarder                                                                                         Name of Network Service to Register with Registry
NSM_BRIDGENAME                   String                                         br-nsm                                                                                            Name of the OvS bridge
NSM_TUNNEL_IP                    String                                                                                                                                           IP or CIDR to use for tunnels
NSM_CONNECT_TO                   URL                                            unix:///connect.to.socket                                                                         url to connect to
NSM_DIAL_TIMEOUT                 Duration                                       50ms                                                                                              Timeout for the dial the next endpoint
NSM_MAX_TOKEN_LIFETIME           Duration                                       24h                                                                                               maximum lifetime of tokens
NSM_REGISTRY_CLIENT_POLICIES     Comma-separated list of String                 etc/nsm/opa/common/.*.rego,etc/nsm/opa/registry/.*.rego,etc/nsm/opa/client/.*.rego                paths to files and directories that contain registry client policies
NSM_RESOURCE_POLL_TIMEOUT        Duration                                       30s                                                                                               device plugin polling timeout
NSM_DEVICE_PLUGIN_PATH           String                                         /var/lib/kubelet/device-plugins/                                                                  path to the device plugin directory
NSM_POD_RESOURCES_PATH           String                                         /var/lib/kubelet/pod-resources/                                                                   path to the pod resources directory
NSM_SRIOV_CONFIG_FILE            String                                         pci.config                                                                                        PCI resources config path
NSM_L2_RESOURCE_SELECTOR_FILE    String                                                                                                                                           config file for resource to label matching
NSM_PCI_DEVICES_PATH             String                                         /sys/bus/pci/devices                                                                              path to the PCI devices directory
NSM_PCI_DRIVERS_PATH             String                                         /sys/bus/pci/drivers                                                                              path to the PCI drivers directory
NSM_CGROUP_PATH                  String                                         /host/sys/fs/cgroup/devices                                                                       path to the host cgroup directory
NSM_VFIO_PATH                    String                                         /host/dev/vfio                                                                                    path to the host VFIO directory
NSM_LOG_LEVEL                    String                                         INFO                                                                                              Log level
NSM_OPENTELEMETRYENDPOINT        String                                         otel-collector.observability.svc.cluster.local:4317                                               OpenTelemetry Collector Endpoint
NSM_METRICS_EXPORT_INTERVAL      Duration                                       10s                                                                                               interval between mertics exports
2023/12/20 01:28:03 [INFO]  Config: &main.Config{Name:"forwarder-ovs-vlbt9", Labels:map[string]string{"p2p":"true"}, NSName:"forwarder", BridgeName:"br-nsm", TunnelIP:"172.16.102.11", ConnectTo:url.URL{Scheme:"unix", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/var/lib/networkservicemesh/nsm.io.sock", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}, DialTimeout:50000000, MaxTokenLifetime:86400000000000, RegistryClientPolicies:[]string{"etc/nsm/opa/common/.*.rego", "etc/nsm/opa/registry/.*.rego", "etc/nsm/opa/client/.*.rego"}, ResourcePollTimeout:30000000000, DevicePluginPath:"/var/lib/kubelet/device-plugins/", PodResourcesPath:"/var/lib/kubelet/pod-resources/", SRIOVConfigFile:"/var/lib/networkservicemesh/smartnic.config", L2ResourceSelectorFile:"", PCIDevicesPath:"/sys/bus/pci/devices", PCIDriversPath:"/sys/bus/pci/drivers", CgroupPath:"/host/sys/fs/cgroup/devices", VFIOPath:"/host/dev/vfio", LogLevel:"INFO", OpenTelemetryEndpoint:"otel-collector.observability.svc.cluster.local:4317", MetricsExportInterval:10000000000}
2023/12/20 01:28:03 [INFO]  [duration:2.665071ms] completed phase 1: get config from environment
2023/12/20 01:28:03 [INFO]  executing phase 2: ensure ovs is running (time since start: 2.758614ms)
2023/12/20 01:28:04 [INFO]  local ovs is being used
2023/12/20 01:28:04 [INFO]  [duration:1.262604272s] completed phase 2: ensure ovs is running
2023/12/20 01:28:04 [INFO]  executing phase 3: retrieving svid, check spire agent logs if this is the last line you see (time since start: 1.265393546s)
Dec 20 01:28:04.541 [INFO] SVID: "spiffe://k8s.nsm/ns/nsm-system/pod/forwarder-ovs-vlbt9"
2023/12/20 01:28:04 [INFO]  [duration:43.303477ms] completed phase 3: retrieving svid
2023/12/20 01:28:04 [INFO]  executing phase 4: create ovsxconnect network service endpoint (time since start: 1.308742579s)
Dec 20 01:28:04.542 [FATA] error configuring forwarder endpoint: 0000:3d:00.0 has no ServiceDomains set;        github.com/networkservicemesh/sdk-sriov/pkg/sriov/config.ReadConfig;           /go/pkg/mod/github.com/networkservicemesh/sdk-sriov@v1.11.1/pkg/sriov/config/config.go:117;     main.createSriovInterposeEndpoint;             /build/main.go:354;     main.createInterposeEndpoint;           /build/main.go:319;     main.main;              /build/main.go:197;    runtime.main;           /usr/local/go/src/runtime/proc.go:250;  runtime.goexit;         /usr/local/go/src/runtime/asm_amd64.s:1598;

thanks

316953425 commented 7 months ago

With the service domain:

physicalFunctions:
  0000:3d:00.0:
    pfKernelDriver: i40e
    vfKernelDriver: i40evf
    capabilities:
      - intel
      - 1G
    serviceDomains:
      - worker.domain
[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# kubectl logs forwarder-ovs-b4jc2 -n nsm-system
I1220 01:58:07.267487   33840 ovs.go:98] Maximum command line arguments set to: 191102
Dec 20 01:58:07.268 [INFO] Setting env variable DLV_LISTEN_FORWARDER to a valid dlv '--listen' value will cause the dlv debugger to execute this binary and listen as directed.
2023/12/20 01:58:07 [INFO]  there are 5 phases which will be executed followed by a success message:
2023/12/20 01:58:07 [INFO]  the phases include:
2023/12/20 01:58:07 [INFO]  1: get config from environment
2023/12/20 01:58:07 [INFO]  2: ensure ovs is running
2023/12/20 01:58:07 [INFO]  3: retrieve spiffe svid
2023/12/20 01:58:07 [INFO]  4: create ovs forwarder network service endpoint
2023/12/20 01:58:07 [INFO]  5: create grpc server and register ovsxconnect
2023/12/20 01:58:07 [INFO]  6: register ovs forwarder network service with the registry
2023/12/20 01:58:07 [INFO]  a final success message with start time duration
2023/12/20 01:58:07 [INFO]  executing phase 1: get config from environment (time since start: 67.346µs)
This application is configured via the environment. The following environment
variables can be used:

KEY                              TYPE                                           DEFAULT                                                                               REQUIRED    DESCRIPTION
NSM_NAME                         String                                         forwarder                                                                                         Name of Endpoint
NSM_LABELS                       Comma-separated list of String:String pairs    p2p:true                                                                                          Labels related to this forwarder-vpp instance
NSM_NSNAME                       String                                         forwarder                                                                                         Name of Network Service to Register with Registry
NSM_BRIDGENAME                   String                                         br-nsm                                                                                            Name of the OvS bridge
NSM_TUNNEL_IP                    String                                                                                                                                           IP or CIDR to use for tunnels
NSM_CONNECT_TO                   URL                                            unix:///connect.to.socket                                                                         url to connect to
NSM_DIAL_TIMEOUT                 Duration                                       50ms                                                                                              Timeout for the dial the next endpoint
NSM_MAX_TOKEN_LIFETIME           Duration                                       24h                                                                                               maximum lifetime of tokens
NSM_REGISTRY_CLIENT_POLICIES     Comma-separated list of String                 etc/nsm/opa/common/.*.rego,etc/nsm/opa/registry/.*.rego,etc/nsm/opa/client/.*.rego                paths to files and directories that contain registry client policies
NSM_RESOURCE_POLL_TIMEOUT        Duration                                       30s                                                                                               device plugin polling timeout
NSM_DEVICE_PLUGIN_PATH           String                                         /var/lib/kubelet/device-plugins/                                                                  path to the device plugin directory
NSM_POD_RESOURCES_PATH           String                                         /var/lib/kubelet/pod-resources/                                                                   path to the pod resources directory
NSM_SRIOV_CONFIG_FILE            String                                         pci.config                                                                                        PCI resources config path
NSM_L2_RESOURCE_SELECTOR_FILE    String                                                                                                                                           config file for resource to label matching
NSM_PCI_DEVICES_PATH             String                                         /sys/bus/pci/devices                                                                              path to the PCI devices directory
NSM_PCI_DRIVERS_PATH             String                                         /sys/bus/pci/drivers                                                                              path to the PCI drivers directory
NSM_CGROUP_PATH                  String                                         /host/sys/fs/cgroup/devices                                                                       path to the host cgroup directory
NSM_VFIO_PATH                    String                                         /host/dev/vfio                                                                                    path to the host VFIO directory
NSM_LOG_LEVEL                    String                                         INFO                                                                                              Log level
NSM_OPENTELEMETRYENDPOINT        String                                         otel-collector.observability.svc.cluster.local:4317                                               OpenTelemetry Collector Endpoint
NSM_METRICS_EXPORT_INTERVAL      Duration                                       10s                                                                                               interval between mertics exports
2023/12/20 01:58:07 [INFO]  Config: &main.Config{Name:"forwarder-ovs-b4jc2", Labels:map[string]string{"p2p":"true"}, NSName:"forwarder", BridgeName:"br-nsm", TunnelIP:"172.16.102.11", ConnectTo:url.URL{Scheme:"unix", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/var/lib/networkservicemesh/nsm.io.sock", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}, DialTimeout:50000000, MaxTokenLifetime:86400000000000, RegistryClientPolicies:[]string{"etc/nsm/opa/common/.*.rego", "etc/nsm/opa/registry/.*.rego", "etc/nsm/opa/client/.*.rego"}, ResourcePollTimeout:30000000000, DevicePluginPath:"/var/lib/kubelet/device-plugins/", PodResourcesPath:"/var/lib/kubelet/pod-resources/", SRIOVConfigFile:"/var/lib/networkservicemesh/smartnic.config", L2ResourceSelectorFile:"", PCIDevicesPath:"/sys/bus/pci/devices", PCIDriversPath:"/sys/bus/pci/drivers", CgroupPath:"/host/sys/fs/cgroup/devices", VFIOPath:"/host/dev/vfio", LogLevel:"INFO", OpenTelemetryEndpoint:"otel-collector.observability.svc.cluster.local:4317", MetricsExportInterval:10000000000}
2023/12/20 01:58:07 [INFO]  [duration:2.641675ms] completed phase 1: get config from environment
2023/12/20 01:58:07 [INFO]  executing phase 2: ensure ovs is running (time since start: 2.745275ms)
2023/12/20 01:58:08 [INFO]  local ovs is being used
2023/12/20 01:58:08 [INFO]  [duration:1.270789462s] completed phase 2: ensure ovs is running
2023/12/20 01:58:08 [INFO]  executing phase 3: retrieving svid, check spire agent logs if this is the last line you see (time since start: 1.273562642s)
Dec 20 01:58:08.585 [INFO] SVID: "spiffe://k8s.nsm/ns/nsm-system/pod/forwarder-ovs-b4jc2"
2023/12/20 01:58:08 [INFO]  [duration:43.877516ms] completed phase 3: retrieving svid
2023/12/20 01:58:08 [INFO]  executing phase 4: create ovsxconnect network service endpoint (time since start: 1.317484424s)
Dec 20 01:58:08.586 [INFO] [Config:ReadConfig] unmarshalled Config: &{PhysicalFunctions:map[0000:3d:00.0:&{PFKernelDriver:i40e VFKernelDriver:i40evf Capabilities:[intel 1G] ServiceDomains:[worker.domain] VirtualFunctions:[]}]}
Dec 20 01:58:08.593 [FATA] error configuring forwarder endpoint: lstat /sys/bus/pci/devices/0000:3d:02.0/iommu_group: no such file or directory;      error getting info about specified file: /sys/bus/pci/devices/0000:3d:02.0/iommu_group;  github.com/networkservicemesh/sdk-sriov/pkg/sriov/pcifunction.evalSymlinkAndGetBaseName;               /go/pkg/mod/github.com/networkservicemesh/sdk-sriov@v1.11.1/pkg/sriov/pcifunction/tools.go:50;  github.com/networkservicemesh/sdk-sriov/pkg/sriov/pcifunction.(*Function).GetIOMMUGroup;               /go/pkg/mod/github.com/networkservicemesh/sdk-sriov@v1.11.1/pkg/sriov/pcifunction/function.go:75;      github.com/networkservicemesh/sdk-sriov/pkg/sriov/pci.UpdateConfig;             /go/pkg/mod/github.com/networkservicemesh/sdk-sriov@v1.11.1/pkg/sriov/pci/update_config.go:33; main.createSriovInterposeEndpoint;              /build/main.go:359;     main.createInterposeEndpoint;          /build/main.go:319;     main.main;              /build/main.go:197;     runtime.main;           /usr/local/go/src/runtime/proc.go:250; runtime.goexit;         /usr/local/go/src/runtime/asm_amd64.s:1598;
[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# dmesg | grep -E "DMAR|IOMMU"
[    0.016059] ACPI: DMAR 0x0000000068D7C1C8 000248 (v01 ALASKA A M I    00000001 INTL 20091013)
[    1.961752] DMAR: Host address width 46
[    1.961753] DMAR: DRHD base: 0x000000d37fc000 flags: 0x0
[    1.961760] DMAR: dmar0: reg_base_addr d37fc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961761] DMAR: DRHD base: 0x000000e0ffc000 flags: 0x0
[    1.961765] DMAR: dmar1: reg_base_addr e0ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961766] DMAR: DRHD base: 0x000000ee7fc000 flags: 0x0
[    1.961770] DMAR: dmar2: reg_base_addr ee7fc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961771] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[    1.961774] DMAR: dmar3: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961775] DMAR: DRHD base: 0x000000aaffc000 flags: 0x0
[    1.961779] DMAR: dmar4: reg_base_addr aaffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961780] DMAR: DRHD base: 0x000000b87fc000 flags: 0x0
[    1.961783] DMAR: dmar5: reg_base_addr b87fc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961784] DMAR: DRHD base: 0x000000c5ffc000 flags: 0x0
[    1.961788] DMAR: dmar6: reg_base_addr c5ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961789] DMAR: DRHD base: 0x0000009d7fc000 flags: 0x1
[    1.961792] DMAR: dmar7: reg_base_addr 9d7fc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    1.961793] DMAR: RMRR base: 0x0000006b624000 end: 0x0000006b635fff
[    1.961796] DMAR: ATSR flags: 0x0
[    1.961797] DMAR: RHSA base: 0x0000009d7fc000 proximity domain: 0x0
[    1.961798] DMAR: RHSA base: 0x000000aaffc000 proximity domain: 0x0
[    1.961798] DMAR: RHSA base: 0x000000b87fc000 proximity domain: 0x0
[    1.961799] DMAR: RHSA base: 0x000000c5ffc000 proximity domain: 0x0
[    1.961800] DMAR: RHSA base: 0x000000d37fc000 proximity domain: 0x1
[    1.961800] DMAR: RHSA base: 0x000000e0ffc000 proximity domain: 0x1
[    1.961801] DMAR: RHSA base: 0x000000ee7fc000 proximity domain: 0x1
[    1.961801] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x1
[    1.961804] DMAR-IR: IOAPIC id 12 under DRHD base  0xc5ffc000 IOMMU 6
[    1.961805] DMAR-IR: IOAPIC id 11 under DRHD base  0xb87fc000 IOMMU 5
[    1.961806] DMAR-IR: IOAPIC id 10 under DRHD base  0xaaffc000 IOMMU 4
[    1.961807] DMAR-IR: IOAPIC id 18 under DRHD base  0xfbffc000 IOMMU 3
[    1.961808] DMAR-IR: IOAPIC id 17 under DRHD base  0xee7fc000 IOMMU 2
[    1.961809] DMAR-IR: IOAPIC id 16 under DRHD base  0xe0ffc000 IOMMU 1
[    1.961809] DMAR-IR: IOAPIC id 15 under DRHD base  0xd37fc000 IOMMU 0
[    1.961811] DMAR-IR: IOAPIC id 8 under DRHD base  0x9d7fc000 IOMMU 7
[    1.961812] DMAR-IR: IOAPIC id 9 under DRHD base  0x9d7fc000 IOMMU 7
[    1.961812] DMAR-IR: HPET id 0 under DRHD base 0x9d7fc000
[    1.961814] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    1.963751] DMAR-IR: Enabled IRQ remapping in x2apic mode
[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# lsmod | grep vfio_pci
vfio_pci               53248  0
vfio_virqfd            16384  1 vfio_pci
vfio                   32768  2 vfio_iommu_type1,vfio_pci
irqbypass              16384  2 vfio_pci,kvm

Perhaps it is caused by "[FATA] error configuring forwarder endpoint: lstat /sys/bus/pci/devices/0000:3d:02.0/iommu_group: no such file or directory"

ljkiraly commented 7 months ago

Hi @316953425 ,

> Perhaps it is caused by "[FATA] error configuring forwarder endpoint: lstat /sys/bus/pci/devices/0000:3d:02.0/iommu_group: no such file or directory"

Yes, it's definitely related to that log line. It's strange that ethtool shows bus-info 0000:3d:00.0, but the forwarder is looking for 0000:3d:02.0 (a different device number: 02). Could you check where the sysfs link is pointing? ls -l /sys/class/net/ens5/device

In theory, one of the following might also cause such a printout:

I've found this description: https://www.kernel.org/doc/html/latest/driver-api/vfio.html#vfio-usage-example Could you check the output of this command on one of the master nodes:
readlink /sys/bus/pci/devices/0000:3d:02.0/iommu_group

I'm not sure, but if the IOMMU is enabled, the dmesg output should contain the "DMAR: IOMMU enabled" line. Could you also check the GRUB config of the master nodes to see whether it contains the "intel_iommu=on" option?
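A minimal sketch of enabling the IOMMU on a RHEL/CentOS-style node (which the shell prompts above suggest); the sed expression and the grub2-mkconfig output path are assumptions, so verify them against your distribution before running:

```shell
#!/bin/sh
# Append intel_iommu=on (and optionally iommu=pt) to the kernel command line.
# Assumes /etc/default/grub contains a GRUB_CMDLINE_LINUX="..." line.
sed -i 's/^\(GRUB_CMDLINE_LINUX=".*\)"$/\1 intel_iommu=on iommu=pt"/' /etc/default/grub

# Regenerate the GRUB config (BIOS path shown; EFI systems differ) and reboot.
grub2-mkconfig -o /boot/grub2/grub.cfg

# After the reboot, verify:
#   grep -o intel_iommu=on /proc/cmdline
#   dmesg | grep 'IOMMU enabled'
```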

316953425 commented 7 months ago

hi @glazychev-art @ljkiraly

> Could you check where the sys-fs link is pointing? ls -l /sys/class/net/ens5/device

[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# ls -l /sys/class/net/ens5/device
lrwxrwxrwx 1 root root 0 Dec 21 08:39 /sys/class/net/ens5/device -> ../../../0000:3d:00.0

> readlink /sys/bus/pci/devices/0000:3d:02.0/iommu_group

[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# readlink /sys/bus/pci/devices/0000:3d:02.0/iommu_group
[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]#
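For reference, readlink prints nothing (and exits non-zero) when the link does not exist, so the empty output above means the kernel created no iommu_group entry for the VF, consistent with the IOMMU being disabled. A small sketch to survey all the VFs at once (the device glob is an assumption based on the lspci output on this host):

```shell
#!/bin/sh
# List the IOMMU group (if any) for each VF of the X722 PF.
# The glob matches the VF addresses 3d:02.x, 3d:03.x, 3d:04.x seen in lspci.
for dev in /sys/bus/pci/devices/0000:3d:0[234].*; do
    grp=$(readlink "$dev/iommu_group" || true)
    echo "${dev##*/} -> ${grp:-<no iommu_group>}"
done
# With intel_iommu=on active, each line should point at
# ../../../../kernel/iommu_groups/<N>; an empty result means the IOMMU is off.
```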

[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# lspci | grep net
19:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
3d:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.2 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:02.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.1 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.2 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.3 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.4 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.5 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.6 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:02.7 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.1 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.2 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.3 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.4 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.5 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.6 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:03.7 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.1 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.2 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.3 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.4 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.5 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.6 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:04.7 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.1 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.2 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.3 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.4 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.5 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.6 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
3d:05.7 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 09)
5e:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

I'm not sure, but if the IOMMU is enabled, the kernel log (dmesg) should contain the "DMAR: IOMMU enabled" line. Could you also check whether the grub config on the master nodes contains the "intel_iommu=on" option?

yes

[root@CNCP-MS-01 deployments-k8s-release-v1.11.1]# cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto spectre_v2=retpoline rd.lvm.lv=centos/root rd.lvm.lv=centos/swap net.ifnames=0 biosdevname=0 rhgb quiet intel_iommu=on iommu=pt"
GRUB_DISABLE_RECOVERY="true"

It's strange that ethtool shows bus-info 0000:3d:00.0, but the forwarder is looking for 0000:3d:02.0 (a different device number: 02).

This confuses me as well, but the file /sys/bus/pci/devices/0000:3d:00.0/iommu_group does not exist either.

[root@CNCP-MS-01 use-cases]# ls /sys/bus/pci/devices/0000:3d:00.0/ | grep iommu_group
[root@CNCP-MS-01 use-cases]#
[root@CNCP-MS-01 use-cases]#

The network card used in my configuration (PCI address 0000:3d:00.0) is not the one currently used by the k8s CNI. I am not sure whether this has any impact.

Also, I don't quite understand the meaning of serviceDomains. I filled in worker.domain following the example in the documentation. Could you explain what serviceDomains means?

Finally, the other thing I don't quite understand is from the document https://github.com/networkservicemesh/deployments-k8s/tree/release/v1.11.1/examples/sriov: why is the serviceDomains value of the master node worker.domain, while the serviceDomains value of the worker node is master.domain? Could you explain?

thanks

[root@CNCP-MS-02 ~]# cat /var/lib/networkservicemesh/smartnic.config
physicalFunctions:
  0000:3d:00.0:
    pfKernelDriver: i40e
    vfKernelDriver: i40evf
    capabilities:
      - intel
      - 1G
    serviceDomains:
      - worker.domain
ljkiraly commented 7 months ago

Hi @316953425 , @glazychev-art ,

Also, I don’t quite understand the meaning of serviceDomains. I filled in worker.domain according to the example in the document. Can you tell me the meaning of serviceDomains?

That part of your configuration is correct; I didn't want to confuse you. serviceDomains makes it possible to refer to a physical resource by serviceDomain/capability (for example, sriovToken=worker.domain/1G).
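As a quick illustration, you can check whether such a token actually shows up on any node. This assumes the forwarder's device plugin advertises the token as a node resource named serviceDomain/capability (the worker.domain/1G format is taken from the comment above; adjust the string for your setup):

```shell
#!/bin/sh
# Sketch: look for the SR-IOV token (e.g. worker.domain/1G) among node resources.
# Assumes kubectl access to the cluster; the resource name is an example, not guaranteed.
if command -v kubectl >/dev/null 2>&1; then
    kubectl describe nodes | grep -i 'worker.domain' \
        || echo "no worker.domain resources advertised on any node"
else
    echo "kubectl not found - run this on a machine with cluster access"
fi
```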

We should focus on the network card configuration on the nodes. Did you restart the nodes after changing the grub config (or was this configuration already present from the start)? Is there any node where the forwarder starts properly?
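One related thing worth double-checking: on CentOS, editing /etc/default/grub alone is not enough; grub2-mkconfig has to be rerun before the reboot for the change to reach the running kernel. A small sketch to detect an edited-but-unapplied config (paths assume a CentOS-style BIOS grub layout; EFI systems use a different grub.cfg path):

```shell
#!/bin/sh
# Sketch: detect a grub config that was edited but never applied to the running kernel.
opt='intel_iommu=on'
if grep -qs "$opt" /etc/default/grub && ! grep -qs "$opt" /proc/cmdline; then
    echo "$opt is in /etc/default/grub but NOT on the running kernel command line"
    echo "-> run: grub2-mkconfig -o /boot/grub2/grub.cfg   (BIOS layout), then reboot"
else
    echo "/etc/default/grub and the running kernel agree on $opt (or neither sets it)"
fi
```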

316953425 commented 7 months ago

hi @glazychev-art @ljkiraly

Did you restart the nodes after changing the grub config (or was this configuration already present from the start)?

Yes, I restarted the nodes after changing the grub config.

Do you have any node where the forwarder starts properly?

No, the forwarder fails to start on all nodes, and the error messages are the same.

316953425 commented 6 months ago

hi @glazychev-art Has this example (https://github.com/networkservicemesh/deployments-k8s/tree/release/v1.11.1/examples/ovs) ever been deployed successfully before? Thanks.

glazychev-art commented 6 months ago

hi @316953425 We've discussed this issue a bit - we haven't actually run these examples for a while. Perhaps we will consider this issue in the next release.