k8snetworkplumbingwg / sriov-network-device-plugin

SRIOV network device plugin for Kubernetes
Apache License 2.0

ConnectX-7 doesn't create VFs because the SR-IOV function isn't enabled #601

Open zhutong196 opened 2 months ago

zhutong196 commented 2 months ago

I can't create VFs on my ConnectX-7 card.

System info:
NAME="ctyunos"
VERSION="2.0.1"
ID="ctyunos"
VERSION_ID="2.0.1"
PRETTY_NAME="ctyunos 2.0.1"
ANSI_COLOR="0;31"

mst status -v 
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE             MST                           PCI       RDMA            NET                                     NUMA  
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf8      e3:00.0   mlx5_10         net-ib7                                 3
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf7      a2:00.0   mlx5_9          net-ib8                                 3
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf6      92:00.0   mlx5_6          net-ib6                                 2
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf5      8a:00.0   mlx5_5          net-ib5                                 2
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf4      85:00.0   mlx5_4          net-ib4                                 2
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf3      5c:00.0   mlx5_3          net-ib3                                 1
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf2      3b:00.0   mlx5_2          net-ib2                                 0
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf1      29:00.0   mlx5_1          net-ib1                                 0
ConnectX7(rev:0)        /dev/mst/mt4129_pciconf0      19:00.0   mlx5_0          net-ib0                                 0
ConnectX5(rev:0)        /dev/mst/mt4119_pciconf0.1    9f:00.1   mlx5_bond_0     net-bond0                               3
ConnectX5(rev:0)        /dev/mst/mt4119_pciconf0      9f:00.0   mlx5_bond_0     net-bond0                               3

mlxconfig -d /dev/mst/mt4129_pciconf1 set SRIOV_EN=1 NUM_OF_VFS=16

I used the above command to enable SR-IOV and set 16 VFs, but lspci shows that SR-IOV is not enabled: [image]

When SR-IOV is enabled correctly, lspci displays the following information: [image]
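The lspci check described above can be made scriptable. The sketch below is only an illustration: `has_sriov_cap` is a hypothetical helper name, and it assumes lspci prints the capability with its usual label, "Single Root I/O Virtualization (SR-IOV)". The address 29:00.0 is the PCI address listed for /dev/mst/mt4129_pciconf1 in the mst output.

```shell
#!/bin/sh
# has_sriov_cap (hypothetical helper): reads `lspci -vvv` output on stdin and
# succeeds only if the SR-IOV extended capability is listed for the device.
has_sriov_cap() {
    grep -qF 'Single Root I/O Virtualization (SR-IOV)'
}

# Usage (run as root, otherwise lspci may hide the capability list):
#   lspci -s 29:00.0 -vvv | has_sriov_cap && echo "SR-IOV capability present"
```

If the capability is missing, the PF does not expose SR-IOV to the host at all, so no VFs can be created from the OS side.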

rollandf commented 2 months ago

Hey, can you run: mlxconfig -d /dev/mst/mt4129_pciconf1 query

zhutong196 commented 2 months ago

mlxconfig -d /dev/mst/mt4129_pciconf1 query

Device #1:

Device type: ConnectX7
Name: MCX75310AAS-NEA_Ax
Description: NVIDIA ConnectX-7 HHHL Adapter card; 400GbE / NDR IB (default mode); Single-port OSFP; PCIe 5.0 x16; Crypto Disabled; Secure Boot Enabled;
Device: /dev/mst/mt4129_pciconf1

Configurations: Next Boot
MODULE_SPLIT_M0 Array[0..15]
MEMIC_BAR_SIZE 0
MEMIC_SIZE_LIMIT _256KB(1)
MEMIC_ATOMIC MEMIC_ATOMIC_ENABLE(2)
MEMIC_ATOMIC_ENDIANESS DEVICE_DEFAULT(0)
HOST_CHAINING_MODE DISABLED(0)
HOST_CHAINING_CACHE_DISABLE False(0)
HOST_CHAINING_DESCRIPTORS Array[0..7]
HOST_CHAINING_TOTAL_BUFFER_SIZE Array[0..7]
FLEX_PARSER_PROFILE_ENABLE 0
PROG_PARSE_GRAPH False(0)
FLEX_IPV4_OVER_VXLAN_PORT 0
ROCE_NEXT_PROTOCOL 254
ESWITCH_HAIRPIN_DESCRIPTORS Array[0..7]
ESWITCH_HAIRPIN_TOT_BUFFER_SIZE Array[0..7]
DPA_AUTHENTICATION True(1)
PF_BAR2_SIZE 0
PF_NUM_OF_VF_VALID False(0)
NON_PREFETCHABLE_PF_BAR False(0)
VF_VPD_ENABLE False(0)
PF_NUM_PF_MSIX_VALID False(0)
PER_PF_NUM_SF False(0)
STRICT_VF_MSIX_NUM False(0)
VF_NODNIC_ENABLE False(0)
NUM_PF_MSIX_VALID True(1)
NUM_OF_VFS 8
NUM_OF_PF 1
PF_BAR2_ENABLE False(0)
SRIOV_EN True(1)
PF_LOG_BAR_SIZE 5
VF_LOG_BAR_SIZE 1
NUM_PF_MSIX 63
NUM_VF_MSIX 11
INT_LOG_MAX_PAYLOAD_SIZE AUTOMATIC(0)
PCIE_CREDIT_TOKEN_TIMEOUT 0
RT_PPS_ENABLED_ON_POWERUP False(0)
LAG_RESOURCE_ALLOCATION DEVICE_DEFAULT(0)
ACCURATE_TX_SCHEDULER False(0)
PARTIAL_RESET_EN False(0)
RESET_WITH_HOST_ON_ERRORS False(0)
PCI_SWITCH_EMULATION_NUM_PORT 16
PCI_SWITCH_EMULATION_ENABLE False(0)
PCI_DOWNSTREAM_PORT_OWNER Array[0..15]
CQE_COMPRESSION BALANCED(0)
IP_OVER_VXLAN_EN False(0)
MKEY_BY_NAME False(0)
PRIO_TAG_REQUIRED_EN False(0)
UCTX_EN True(1)
REAL_TIME_CLOCK_ENABLE False(0)
RDMA_SELECTIVE_REPEAT_EN False(0)
PCI_ATOMIC_MODE PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
TUNNEL_ECN_COPY_DISABLE False(0)
LRO_LOG_TIMEOUT0 6
LRO_LOG_TIMEOUT1 7
LRO_LOG_TIMEOUT2 8
LRO_LOG_TIMEOUT3 13
LOG_TX_PSN_WINDOW 9
VF_MIGRATION_MODE DEVICE_DEFAULT(0)
LOG_MAX_OUTSTANDING_WQE 7
ROCE_ADAPTIVE_ROUTING_EN False(0)
TUNNEL_IP_PROTO_ENTROPY_DISABLE False(0)
PCC_HANDLE_CORE_UTIL DEVICE_DEFAULT(0)
PCC_INT_NP_RTT_DSCP 26
PCC_INT_NP_RTT_DSCP_EN False(0)
PCC_INT_NP_RTT_DATA_MODE RTT_V0(64)
PCC_INT_EN False(0)
PCC_INT_SYSTEM_RTT 0
STEERING_CACHE_REFRESH 0
TX_SCHEDULER_LOCALITY_MODE DEVICE_DEFAULT(0)
ICM_CACHE_MODE DEVICE_DEFAULT(0)
TX_SCHEDULER_FWS_REACTIVITY DIRECT(1)
HAIRPIN_DATA_BUFFER_LOCK False(0)
TX_SCHEDULER_BURST 0
ZERO_TOUCH_TUNING_ENABLE False(0)
ROCE_CC_DCQCN_COMPATIBILITY_MODE DEVICE_DEFAULT(0)
ROCE_CC_LEGACY_DCQCN_SW False(0)
LOG_MAX_QUEUE 17
LOG_MAX_OUTSTANDING_READ_ATOMIC 0
SWP_L4_CHECKSUM_MODE DEVICE_DEFAULT(0)
LARGE_MTU_TWEAK_64 False(0)
AES_XTS_TWEAK_INC_64 False(0)
CRYPTO_POLICY UNRESTRICTED(1)
RDE_DISABLE False(0)
PLDM_FW_UPDATE_DISABLE False(0)
RBT_DISABLE False(0)
PCIE_SMBUS_DISABLE False(0)
PCIE_IN_BAND_VDM_DISABLE False(0)
ADVANCED_TESTABILITY False(0)
LOG_DCR_HASH_TABLE_SIZE 11
MAX_PACKET_LIFETIME 0
DCR_LIFO_SIZE 16384
LINK_TYPE_P1 IB(1)
ROCE_CC_PRIO_MASK_P1 255
ROCE_CC_CNP_MODERATION_P1 DEVICE_DEFAULT(0)
ROCE_CC_SHAPER_COALESCE_P1 DEVICE_DEFAULT(0)
IB_CC_SHAPER_COALESCE_P1 DEVICE_DEFAULT(0)
CLAMP_TGT_RATE_AFTER_TIME_INC_P1 True(1)
CLAMP_TGT_RATE_P1 False(0)
RPG_TIME_RESET_P1 300
RPG_BYTE_RESET_P1 32767
RPG_THRESHOLD_P1 1
RPG_MAX_RATE_P1 0
RPG_AI_RATE_P1 5
RPG_HAI_RATE_P1 50
RPG_GD_P1 11
RPG_MIN_DEC_FAC_P1 50
RPG_MIN_RATE_P1 1
RATE_TO_SET_ON_FIRST_CNP_P1 0
DCE_TCP_G_P1 1019
DCE_TCP_RTT_P1 1
RATE_REDUCE_MONITOR_PERIOD_P1 4
INITIAL_ALPHA_VALUE_P1 1023
MIN_TIME_BETWEEN_CNPS_P1 4
CNP_802P_PRIO_P1 6
CNP_DSCP_P1 48
LLDP_NB_DCBX_P1 False(0)
LLDP_NB_RX_MODE_P1 OFF(0)
LLDP_NB_TX_MODE_P1 OFF(0)
ROCE_RTT_RESP_DSCP_P1 0
ROCE_RTT_RESP_DSCP_MODE_P1 DEVICE_DEFAULT(0)
DCBX_IEEE_P1 True(1)
DCBX_CEE_P1 True(1)
DCBX_WILLING_P1 True(1)
KEEP_ETH_LINK_UP_P1 True(1)
KEEP_IB_LINK_UP_P1 False(0)
KEEP_LINK_UP_ON_BOOT_P1 False(0)
KEEP_LINK_UP_ON_STANDBY_P1 False(0)
DO_NOT_CLEAR_PORT_STATS_P1 False(0)
AUTO_POWER_SAVE_LINK_DOWN_P1 False(0)
NUM_OF_VL_P1 _4_VLs(3)
NUM_OF_TC_P1 _8_TCs(0)
NUM_OF_PFC_P1 8
VL15_BUFFER_SIZE_P1 0
QOS_TRUST_STATE_P1 TRUST_PCP(1)
ETS_SCHED_MODE_P1 device_default(0)
DUP_MAC_ACTION_P1 LAST_CFG(0)
MPFS_MC_LOOPBACK_DISABLE_P1 False(0)
MPFS_UC_LOOPBACK_DISABLE_P1 False(0)
SRIOV_IB_ROUTING_MODE_P1 LID(1)
IB_ROUTING_MODE_P1 LID(1)
PHY_AUTO_NEG_P1 DEVICE_DEFAULT(0)
PHY_RATE_MASK_OVERRIDE_P1 False(0)
PHY_FEC_OVERRIDE_P1 DEVICE_DEFAULT(0)
PF_TOTAL_SF 0
PF_SD_GROUP 0
PF_DEVICE_ID_ENABLE False(0)
PF_SF_BAR_SIZE 0
PF_NUM_PF_MSIX 63
PF_DEVICE_ID 4129
SILENT_MODE False(0)
MKEY_BY_NAME_RANGE DEVICE_DEFAULT(0)
ROCE_CONTROL ROCE_ENABLE(2)
PCI_WR_ORDERING per_mkey(0)
MULTI_PORT_VHCA_EN False(0)
PORT_OWNER True(1)
ALLOW_RD_COUNTERS True(1)
RENEG_ON_CHANGE True(1)
TRACER_ENABLE True(1)
IP_VER IPv4(0)
BOOT_UNDI_NETWORK_WAIT 0
UEFI_HII_EN True(1)
BOOT_DBG_LOG False(0)
UEFI_LOGS DISABLED(0)
BOOT_VLAN 1
LEGACY_BOOT_PROTOCOL PXE(1)
BOOT_INTERRUPT_DIS False(0)
BOOT_LACP_DIS True(1)
BOOT_VLAN_EN False(0)
BOOT_PKEY 0
BAR_PAGE_ALIGNMENT DEVICE_DEFAULT(0)
P2P_ORDERING_MODE DEVICE_DEFAULT(0)
ATS_ENABLED False(0)
DYNAMIC_VF_MSIX_TABLE False(0)
EXP_ROM_UEFI_ARM_ENABLE True(1)
EXP_ROM_UEFI_x86_ENABLE True(1)
EXP_ROM_PXE_ENABLE True(1)
ADVANCED_PCI_SETTINGS False(0)
SAFE_MODE_THRESHOLD 10
SAFE_MODE_ENABLE True(1)
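The query output is long, and only two of its settings bear directly on VF creation: SRIOV_EN and NUM_OF_VFS. In the output above they read True(1) and 8. A minimal sketch for pulling out just those lines follows; `sriov_settings` is a hypothetical helper name, and it assumes each setting sits on its own line as `NAME VALUE`.

```shell
#!/bin/sh
# sriov_settings (hypothetical helper): reads `mlxconfig ... query` output on
# stdin and prints only the SRIOV_EN and NUM_OF_VFS lines.
sriov_settings() {
    # Anchor on line start or whitespace so names that merely contain these
    # strings as a substring are not matched.
    grep -E '(^|[[:space:]])(SRIOV_EN|NUM_OF_VFS)[[:space:]]'
}

# Usage:
#   mlxconfig -d /dev/mst/mt4129_pciconf1 query | sriov_settings
```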

rollandf commented 2 months ago

Changing NUM_OF_VFS requires rebooting the server. Did you reboot?

zhutong196 commented 2 months ago

Changing NUM_OF_VFS requires rebooting the server. Did you reboot?

I have already rebooted many times.

adrianchiris commented 1 month ago

Can you paste the output of:

cat /sys/bus/pci/devices/<NIC PCI ADDRESS>/sriov_totalvfs

To create VFs, you need to write the desired number of VFs to the following file:

echo 8 > /sys/bus/pci/devices/<NIC PCI ADDRESS>/sriov_numvfs

Can you also paste the output of: mlxconfig -d <NIC PCI ADDRESS> -e q
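The two sysfs steps above (read sriov_totalvfs, then write sriov_numvfs) can be combined into one small function. This is a rough sketch, not part of the plugin: `set_numvfs` and the `SYSFS_PCI_ROOT` override are hypothetical names introduced here. The example address 0000:29:00.0 assumes PCI domain 0000 for the 29:00.0 device in the mst output.

```shell
#!/bin/sh
# SYSFS_PCI_ROOT is a hypothetical override so the function can be tried
# against a fake directory tree; it defaults to the real sysfs path.
SYSFS_PCI_ROOT="${SYSFS_PCI_ROOT:-/sys/bus/pci/devices}"

# set_numvfs PCI_ADDR COUNT: create COUNT VFs on the given PF via sysfs.
set_numvfs() {
    pci_addr="$1"
    want="$2"
    dev_dir="$SYSFS_PCI_ROOT/$pci_addr"

    # Without sriov_totalvfs the device does not expose SR-IOV to the kernel.
    if [ ! -f "$dev_dir/sriov_totalvfs" ]; then
        echo "no sriov_totalvfs for $pci_addr" >&2
        return 1
    fi

    total=$(cat "$dev_dir/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs but sriov_totalvfs is $total" >&2
        return 1
    fi

    # The kernel refuses to change a non-zero VF count directly,
    # so reset to 0 before writing the new value.
    echo 0 > "$dev_dir/sriov_numvfs" || return 1
    echo "$want" > "$dev_dir/sriov_numvfs"
}

# Usage (as root):
#   set_numvfs 0000:29:00.0 8
```

If sriov_totalvfs reads 0 or the file is missing, the problem is in firmware or BIOS configuration, not in the sysfs write.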