Closed juggarnautss closed 1 month ago
We need assistance with this behavior, as we understood that the QAT VF endpoints come up after reading the configuration files present in the /etc directory, which preserve the config settings with respect to each VF endpoint. Hence, the missing endpoint status will break the configuration binding with each VF endpoint.
Can you clarify what is the configuration you're expecting to see? The Kubernetes QAT plugin only knows about VFs that are bound to `vfio-pci`. If it finds a VF that does not have `vfio-pci`, it does the job. I don't know about `qat_service status`, but it probably only lists devices that have either `4xxx` or `4xxxvf`. When you deploy the QAT plugin, you don't have `4xxxvf`s anymore.
Note that your driver setup is based on the OOT driver, which isn't a supported setup for either qatlib or the Kubernetes QAT plugin.
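If it helps to verify this on the node, here is a minimal sketch for checking which driver each VF of a QAT PF is bound to, assuming the PF address 0000:f3:00.0 taken from the qat_service output later in this issue. Before the plugin is deployed the VFs should report 4xxxvf; after deployment they should report vfio-pci.

```bash
# Print the driver currently bound to every VF of the QAT PF at 0000:f3:00.0
# (PCI address is an example from this issue; adjust for your system).
pf=0000:f3:00.0
for vf in /sys/bus/pci/devices/"$pf"/virtfn*; do
  bdf=$(basename "$(readlink "$vf")")
  if [ -e "$vf/driver" ]; then
    drv=$(basename "$(readlink "$vf/driver")")
  else
    drv="(no driver)"
  fi
  printf '%s -> %s\n' "$bdf" "$drv"
done
```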
@mythi Thank you for your response. OK, that means it is expected that until the K8s QAT plugin is installed, "qat_service" will show the status of 4xxx and 4xxxvf devices, but after the QAT plugin installation the 4xxxvf devices no longer exist and hence we don't see their status. Would you please elaborate on why they don't exist after the QAT plugin installation?
Would you please elaborate why they don't exist post qat plugin installation?
It was mentioned in my earlier comment: "The Kubernetes QAT plugin only knows about VFs that are bound to `vfio-pci`. If it finds a VF that does not have `vfio-pci`, it does the job."
A few things to be aware of: it looks like you are using the out-of-tree driver stack. That is not applicable to the qatlib and k8s QAT plugin setup. In addition, when using QAT in a k8s cluster, the host OS does not need qatlib / `qat_service` installed, because the Helm setup supports the equivalent functionality.
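As a side note, one way to tell the two driver stacks apart on the host is to inspect the loaded PF module. A minimal sketch, assuming the module is named qat_4xxx (the module name and paths here are assumptions, not taken from this issue):

```bash
# An in-tree kernel module reports "intree: Y" and its filename sits under
# /lib/modules/<kernel>/kernel/...; an out-of-tree build typically lacks the
# "intree" line and installs from a separate (e.g. updates/) directory.
modinfo qat_4xxx | grep -E '^(filename|version|intree)'
```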
Closing this as the questions are answered above.
We are using an Intel Sapphire Rapids processor with integrated QAT accelerators. After OS installation, we configure the QAT config files in the /etc directory and then start the QAT service using "/etc/init.d/qat_service start".
QAT config files:

sysadmin@controller-0:/var/log$ ls -lrt /etc | grep 4xxx
-rw-r----- 1 root root 5315 Apr 16 10:39 4xxx_dev0.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev0.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev1.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev2.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev3.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev4.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev5.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev6.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev7.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev8.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev9.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev10.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev11.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev12.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev13.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev14.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev15.conf
-rw-r----- 1 root root 5315 Apr 16 10:39 4xxx_dev1.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev16.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev17.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev18.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev19.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev20.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev21.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev22.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev23.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev24.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev25.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev26.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev27.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev28.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev29.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev30.conf
-rw-r----- 1 root root 4383 Apr 16 10:39 4xxxvf_dev31.conf
QAT service status:

sysadmin@controller-0:/var/log$ sudo /etc/init.d/qat_service status
Checking status of all devices.
There is 34 QAT acceleration device(s) in the system:
 qat_dev0 - type: 4xxx, inst_id: 0, node_id: 0, bsf: 0000:f3:00.0, #accel: 1 #engines: 9 state: up
 qat_dev1 - type: 4xxx, inst_id: 1, node_id: 0, bsf: 0000:f7:00.0, #accel: 1 #engines: 9 state: up
 qat_dev2 - type: 4xxxvf, inst_id: 0, node_id: 0, bsf: 0000:f3:00.1, #accel: 1 #engines: 1 state: up
 qat_dev3 - type: 4xxxvf, inst_id: 1, node_id: 0, bsf: 0000:f3:00.2, #accel: 1 #engines: 1 state: up
 qat_dev4 - type: 4xxxvf, inst_id: 2, node_id: 0, bsf: 0000:f3:00.3, #accel: 1 #engines: 1 state: up
 qat_dev5 - type: 4xxxvf, inst_id: 3, node_id: 0, bsf: 0000:f3:00.4, #accel: 1 #engines: 1 state: up
 qat_dev6 - type: 4xxxvf, inst_id: 4, node_id: 0, bsf: 0000:f3:00.5, #accel: 1 #engines: 1 state: up
 qat_dev7 - type: 4xxxvf, inst_id: 5, node_id: 0, bsf: 0000:f3:00.6, #accel: 1 #engines: 1 state: up
 qat_dev8 - type: 4xxxvf, inst_id: 6, node_id: 0, bsf: 0000:f3:00.7, #accel: 1 #engines: 1 state: up
 qat_dev9 - type: 4xxxvf, inst_id: 7, node_id: 0, bsf: 0000:f3:01.0, #accel: 1 #engines: 1 state: up
 qat_dev10 - type: 4xxxvf, inst_id: 8, node_id: 0, bsf: 0000:f3:01.1, #accel: 1 #engines: 1 state: up
 qat_dev11 - type: 4xxxvf, inst_id: 9, node_id: 0, bsf: 0000:f3:01.2, #accel: 1 #engines: 1 state: up
 qat_dev12 - type: 4xxxvf, inst_id: 10, node_id: 0, bsf: 0000:f3:01.3, #accel: 1 #engines: 1 state: up
 qat_dev13 - type: 4xxxvf, inst_id: 11, node_id: 0, bsf: 0000:f3:01.4, #accel: 1 #engines: 1 state: up
 qat_dev14 - type: 4xxxvf, inst_id: 12, node_id: 0, bsf: 0000:f3:01.5, #accel: 1 #engines: 1 state: up
 qat_dev15 - type: 4xxxvf, inst_id: 13, node_id: 0, bsf: 0000:f3:01.6, #accel: 1 #engines: 1 state: up
 qat_dev16 - type: 4xxxvf, inst_id: 14, node_id: 0, bsf: 0000:f3:01.7, #accel: 1 #engines: 1 state: up
 qat_dev17 - type: 4xxxvf, inst_id: 15, node_id: 0, bsf: 0000:f3:02.0, #accel: 1 #engines: 1 state: up
 qat_dev18 - type: 4xxxvf, inst_id: 16, node_id: 0, bsf: 0000:f7:00.1, #accel: 1 #engines: 1 state: up
 qat_dev19 - type: 4xxxvf, inst_id: 17, node_id: 0, bsf: 0000:f7:00.2, #accel: 1 #engines: 1 state: up
 qat_dev20 - type: 4xxxvf, inst_id: 18, node_id: 0, bsf: 0000:f7:00.3, #accel: 1 #engines: 1 state: up
 qat_dev21 - type: 4xxxvf, inst_id: 19, node_id: 0, bsf: 0000:f7:00.4, #accel: 1 #engines: 1 state: up
 qat_dev22 - type: 4xxxvf, inst_id: 20, node_id: 0, bsf: 0000:f7:00.5, #accel: 1 #engines: 1 state: up
 qat_dev23 - type: 4xxxvf, inst_id: 21, node_id: 0, bsf: 0000:f7:00.6, #accel: 1 #engines: 1 state: up
 qat_dev24 - type: 4xxxvf, inst_id: 22, node_id: 0, bsf: 0000:f7:00.7, #accel: 1 #engines: 1 state: up
 qat_dev25 - type: 4xxxvf, inst_id: 23, node_id: 0, bsf: 0000:f7:01.0, #accel: 1 #engines: 1 state: up
 qat_dev26 - type: 4xxxvf, inst_id: 24, node_id: 0, bsf: 0000:f7:01.1, #accel: 1 #engines: 1 state: up
 qat_dev27 - type: 4xxxvf, inst_id: 25, node_id: 0, bsf: 0000:f7:01.2, #accel: 1 #engines: 1 state: up
 qat_dev28 - type: 4xxxvf, inst_id: 26, node_id: 0, bsf: 0000:f7:01.3, #accel: 1 #engines: 1 state: up
 qat_dev29 - type: 4xxxvf, inst_id: 27, node_id: 0, bsf: 0000:f7:01.4, #accel: 1 #engines: 1 state: up
 qat_dev30 - type: 4xxxvf, inst_id: 28, node_id: 0, bsf: 0000:f7:01.5, #accel: 1 #engines: 1 state: up
 qat_dev31 - type: 4xxxvf, inst_id: 29, node_id: 0, bsf: 0000:f7:01.6, #accel: 1 #engines: 1 state: up
 qat_dev32 - type: 4xxxvf, inst_id: 30, node_id: 0, bsf: 0000:f7:01.7, #accel: 1 #engines: 1 state: up
 qat_dev33 - type: 4xxxvf, inst_id: 31, node_id: 0, bsf: 0000:f7:02.0, #accel: 1 #engines: 1 state: up
After that, we used the Helm charts to deploy the Kubernetes Intel QAT plugin. Here we have not used the initcontainer to provision the QAT devices, as we already do that after OS installation as described above.
helm repo add intel https://intel.github.io/helm-charts/
helm repo update
helm install qat-device-plugin intel/intel-device-plugins-qat
After installation we can see that the VF endpoints are exposed to the Kubernetes cluster.
Node description:

Capacity:
  cpu:                    64
  ephemeral-storage:      10218772Ki
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 129160204Ki
  pods:                   110
  qat.intel.com/generic:  32
Allocatable:
  cpu:                    62
  ephemeral-storage:      9417620260
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 118817804Ki
  pods:                   110
  qat.intel.com/generic:  32
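For reference, once the node advertises qat.intel.com/generic, a pod can claim one of the 32 VFs through a standard extended-resource request. A minimal sketch (the pod name and busybox image are placeholders; a real consumer would run a qatlib-based workload):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: qat-vf-test             # hypothetical name, for illustration only
spec:
  containers:
  - name: workload
    image: busybox              # placeholder image; a real workload would use qatlib
    command: ["sleep", "3600"]
    resources:
      requests:
        qat.intel.com/generic: 1
      limits:
        qat.intel.com/generic: 1
EOF
```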
Observation: now when we check the status of the QAT endpoints using the "qat_service status" command, the output shows only the PF status and does not show any QAT VF endpoint status, as it was showing before the QAT plugin installation.
sysadmin@controller-0:/var/log$ sudo /etc/init.d/qat_service status
Checking status of all devices.
There is 2 QAT acceleration device(s) in the system:
 qat_dev0 - type: 4xxx, inst_id: 0, node_id: 0, bsf: 0000:f3:00.0, #accel: 1 #engines: 9 state: up
 qat_dev1 - type: 4xxx, inst_id: 1, node_id: 0, bsf: 0000:f7:00.0, #accel: 1 #engines: 9 state: up
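Note that this does not necessarily mean the VFs have disappeared from the PCI bus; they may simply no longer be claimed by the 4xxxvf driver that qat_service reports on. A quick sanity check, assuming the plugin has rebound them to vfio-pci (driver directory names may differ with the OOT package):

```bash
# VFs rebound to vfio-pci still exist even though qat_service no longer lists them.
ls /sys/bus/pci/drivers/vfio-pci/ | grep -c '^0000:'              # expect 32 on this node
ls /sys/bus/pci/drivers/4xxxvf/ 2>/dev/null | grep -c '^0000:'    # expect 0 after the plugin is deployed
```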
We need assistance with this behavior, as we understood that the QAT VF endpoints come up after reading the configuration files present in the /etc directory, which preserve the config settings with respect to each VF endpoint. Hence, the missing endpoint status will break the configuration binding with each VF endpoint.