intel / qatlib


Does qatlib config file support affinitizing instances to specific cores? #75

Closed neel-patel-1 closed 7 months ago

neel-patel-1 commented 8 months ago

On a dual-socket server with one QAT device per socket, I first bind both devices to the driver. With the default settings, each group of 4 consecutive physical cores is affinitized to one device, and the groups alternate between the two sockets' devices in an interleaved fashion:

n869p538@sapphire:qatlib$ ./inst_vf_perf
Number of available dc instances: 64
Inst 0, Affin: 0, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 1, Affin: 1, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 2, Affin: 2, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 3, Affin: 3, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 4, Affin: 4, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 5, Affin: 5, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 6, Affin: 6, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 7, Affin: 7, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 8, Affin: 8, Dev: 0, Accel 2, EE 0, BDF 76:02:00
Inst 9, Affin: 9, Dev: 0, Accel 2, EE 0, BDF 76:02:00
Inst 10, Affin: 10, Dev: 0, Accel 2, EE 0, BDF 76:02:00
Inst 11, Affin: 11, Dev: 0, Accel 2, EE 0, BDF 76:02:00
Inst 12, Affin: 12, Dev: 1, Accel 3, EE 0, BDF F3:02:00
Inst 13, Affin: 13, Dev: 1, Accel 3, EE 0, BDF F3:02:00
Inst 14, Affin: 14, Dev: 1, Accel 3, EE 0, BDF F3:02:00
Inst 15, Affin: 15, Dev: 1, Accel 3, EE 0, BDF F3:02:00
Inst 16, Affin: 16, Dev: 0, Accel 4, EE 0, BDF 76:00:01
Inst 17, Affin: 17, Dev: 0, Accel 4, EE 0, BDF 76:00:01
Inst 18, Affin: 18, Dev: 0, Accel 4, EE 0, BDF 76:00:01
Inst 19, Affin: 19, Dev: 0, Accel 4, EE 0, BDF 76:00:01
Inst 20, Affin: 20, Dev: 1, Accel 5, EE 0, BDF F3:00:01
Inst 21, Affin: 21, Dev: 1, Accel 5, EE 0, BDF F3:00:01
Inst 22, Affin: 22, Dev: 1, Accel 5, EE 0, BDF F3:00:01
Inst 23, Affin: 23, Dev: 1, Accel 5, EE 0, BDF F3:00:01
Inst 24, Affin: 24, Dev: 0, Accel 6, EE 0, BDF 76:01:01
Inst 25, Affin: 25, Dev: 0, Accel 6, EE 0, BDF 76:01:01
Inst 26, Affin: 26, Dev: 0, Accel 6, EE 0, BDF 76:01:01
Inst 27, Affin: 27, Dev: 0, Accel 6, EE 0, BDF 76:01:01
Inst 28, Affin: 28, Dev: 1, Accel 7, EE 0, BDF F3:01:01
Inst 29, Affin: 29, Dev: 1, Accel 7, EE 0, BDF F3:01:01
Inst 30, Affin: 30, Dev: 1, Accel 7, EE 0, BDF F3:01:01
Inst 31, Affin: 31, Dev: 1, Accel 7, EE 0, BDF F3:01:01
Inst 32, Affin: 32, Dev: 0, Accel 8, EE 0, BDF 76:00:02
Inst 33, Affin: 33, Dev: 0, Accel 8, EE 0, BDF 76:00:02
Inst 34, Affin: 34, Dev: 0, Accel 8, EE 0, BDF 76:00:02
Inst 35, Affin: 35, Dev: 0, Accel 8, EE 0, BDF 76:00:02
Inst 36, Affin: 36, Dev: 1, Accel 9, EE 0, BDF F3:00:02
Inst 37, Affin: 37, Dev: 1, Accel 9, EE 0, BDF F3:00:02
Inst 38, Affin: 38, Dev: 1, Accel 9, EE 0, BDF F3:00:02
Inst 39, Affin: 39, Dev: 1, Accel 9, EE 0, BDF F3:00:02
Inst 40, Affin: 40, Dev: 0, Accel 10, EE 0, BDF 76:01:02
Inst 41, Affin: 41, Dev: 0, Accel 10, EE 0, BDF 76:01:02
Inst 42, Affin: 42, Dev: 0, Accel 10, EE 0, BDF 76:01:02
Inst 43, Affin: 43, Dev: 0, Accel 10, EE 0, BDF 76:01:02
Inst 44, Affin: 44, Dev: 1, Accel 11, EE 0, BDF F3:01:02
Inst 45, Affin: 45, Dev: 1, Accel 11, EE 0, BDF F3:01:02
Inst 46, Affin: 46, Dev: 1, Accel 11, EE 0, BDF F3:01:02
Inst 47, Affin: 47, Dev: 1, Accel 11, EE 0, BDF F3:01:02
Inst 48, Affin: 48, Dev: 0, Accel 12, EE 0, BDF 76:00:03
Inst 49, Affin: 49, Dev: 0, Accel 12, EE 0, BDF 76:00:03
Inst 50, Affin: 50, Dev: 0, Accel 12, EE 0, BDF 76:00:03
Inst 51, Affin: 51, Dev: 0, Accel 12, EE 0, BDF 76:00:03
Inst 52, Affin: 52, Dev: 1, Accel 13, EE 0, BDF F3:00:03
Inst 53, Affin: 53, Dev: 1, Accel 13, EE 0, BDF F3:00:03
Inst 54, Affin: 54, Dev: 1, Accel 13, EE 0, BDF F3:00:03
Inst 55, Affin: 55, Dev: 1, Accel 13, EE 0, BDF F3:00:03
Inst 56, Affin: 56, Dev: 0, Accel 14, EE 0, BDF 76:01:03
Inst 57, Affin: 57, Dev: 0, Accel 14, EE 0, BDF 76:01:03
Inst 58, Affin: 58, Dev: 0, Accel 14, EE 0, BDF 76:01:03
Inst 59, Affin: 59, Dev: 0, Accel 14, EE 0, BDF 76:01:03
Inst 60, Affin: 60, Dev: 1, Accel 15, EE 0, BDF F3:01:03
Inst 61, Affin: 61, Dev: 1, Accel 15, EE 0, BDF F3:01:03
Inst 62, Affin: 62, Dev: 1, Accel 15, EE 0, BDF F3:01:03
Inst 63, Affin: 63, Dev: 1, Accel 15, EE 0, BDF F3:01:03

After unbinding one device from the driver, cores from the first socket are affinitized to the device on the second socket:

n869p538@sapphire:qatlib$ ./inst_vf_perf
Number of available dc instances: 64
Inst 0, Affin: 0, Dev: 0, Accel 0, EE 0, BDF F3:01:00
Inst 1, Affin: 1, Dev: 0, Accel 0, EE 0, BDF F3:01:00
Inst 2, Affin: 2, Dev: 0, Accel 0, EE 0, BDF F3:01:00
Inst 3, Affin: 3, Dev: 0, Accel 0, EE 0, BDF F3:01:00
Inst 4, Affin: 4, Dev: 0, Accel 1, EE 0, BDF F3:02:00
Inst 5, Affin: 5, Dev: 0, Accel 1, EE 0, BDF F3:02:00
Inst 6, Affin: 6, Dev: 0, Accel 1, EE 0, BDF F3:02:00
Inst 7, Affin: 7, Dev: 0, Accel 1, EE 0, BDF F3:02:00
Inst 8, Affin: 8, Dev: 0, Accel 2, EE 0, BDF F3:00:01
Inst 9, Affin: 9, Dev: 0, Accel 2, EE 0, BDF F3:00:01
Inst 10, Affin: 10, Dev: 0, Accel 2, EE 0, BDF F3:00:01
Inst 11, Affin: 11, Dev: 0, Accel 2, EE 0, BDF F3:00:01
Inst 12, Affin: 12, Dev: 0, Accel 3, EE 0, BDF F3:01:01
Inst 13, Affin: 13, Dev: 0, Accel 3, EE 0, BDF F3:01:01
Inst 14, Affin: 14, Dev: 0, Accel 3, EE 0, BDF F3:01:01
Inst 15, Affin: 15, Dev: 0, Accel 3, EE 0, BDF F3:01:01
Inst 16, Affin: 16, Dev: 0, Accel 4, EE 0, BDF F3:00:02
Inst 17, Affin: 17, Dev: 0, Accel 4, EE 0, BDF F3:00:02
Inst 18, Affin: 18, Dev: 0, Accel 4, EE 0, BDF F3:00:02
Inst 19, Affin: 19, Dev: 0, Accel 4, EE 0, BDF F3:00:02
Inst 20, Affin: 20, Dev: 0, Accel 5, EE 0, BDF F3:01:02
Inst 21, Affin: 21, Dev: 0, Accel 5, EE 0, BDF F3:01:02
Inst 22, Affin: 22, Dev: 0, Accel 5, EE 0, BDF F3:01:02
Inst 23, Affin: 23, Dev: 0, Accel 5, EE 0, BDF F3:01:02
Inst 24, Affin: 24, Dev: 0, Accel 6, EE 0, BDF F3:00:03
Inst 25, Affin: 25, Dev: 0, Accel 6, EE 0, BDF F3:00:03
Inst 26, Affin: 26, Dev: 0, Accel 6, EE 0, BDF F3:00:03
Inst 27, Affin: 27, Dev: 0, Accel 6, EE 0, BDF F3:00:03
Inst 28, Affin: 28, Dev: 0, Accel 7, EE 0, BDF F3:01:03
Inst 29, Affin: 29, Dev: 0, Accel 7, EE 0, BDF F3:01:03
Inst 30, Affin: 30, Dev: 0, Accel 7, EE 0, BDF F3:01:03
Inst 31, Affin: 31, Dev: 0, Accel 7, EE 0, BDF F3:01:03
Inst 32, Affin: 32, Dev: 0, Accel 8, EE 0, BDF F3:00:04
Inst 33, Affin: 33, Dev: 0, Accel 8, EE 0, BDF F3:00:04
Inst 34, Affin: 34, Dev: 0, Accel 8, EE 0, BDF F3:00:04
Inst 35, Affin: 35, Dev: 0, Accel 8, EE 0, BDF F3:00:04
Inst 36, Affin: 36, Dev: 0, Accel 9, EE 0, BDF F3:01:04
Inst 37, Affin: 37, Dev: 0, Accel 9, EE 0, BDF F3:01:04
Inst 38, Affin: 38, Dev: 0, Accel 9, EE 0, BDF F3:01:04
Inst 39, Affin: 39, Dev: 0, Accel 9, EE 0, BDF F3:01:04
Inst 40, Affin: 0, Dev: 0, Accel 10, EE 0, BDF F3:00:05
Inst 41, Affin: 1, Dev: 0, Accel 10, EE 0, BDF F3:00:05
Inst 42, Affin: 2, Dev: 0, Accel 10, EE 0, BDF F3:00:05
Inst 43, Affin: 3, Dev: 0, Accel 10, EE 0, BDF F3:00:05
Inst 44, Affin: 4, Dev: 0, Accel 11, EE 0, BDF F3:01:05
Inst 45, Affin: 5, Dev: 0, Accel 11, EE 0, BDF F3:01:05
Inst 46, Affin: 6, Dev: 0, Accel 11, EE 0, BDF F3:01:05
Inst 47, Affin: 7, Dev: 0, Accel 11, EE 0, BDF F3:01:05
Inst 48, Affin: 8, Dev: 0, Accel 12, EE 0, BDF F3:00:06
Inst 49, Affin: 9, Dev: 0, Accel 12, EE 0, BDF F3:00:06
Inst 50, Affin: 10, Dev: 0, Accel 12, EE 0, BDF F3:00:06
Inst 51, Affin: 11, Dev: 0, Accel 12, EE 0, BDF F3:00:06
Inst 52, Affin: 12, Dev: 0, Accel 13, EE 0, BDF F3:01:06
Inst 53, Affin: 13, Dev: 0, Accel 13, EE 0, BDF F3:01:06
Inst 54, Affin: 14, Dev: 0, Accel 13, EE 0, BDF F3:01:06
Inst 55, Affin: 15, Dev: 0, Accel 13, EE 0, BDF F3:01:06
Inst 56, Affin: 16, Dev: 0, Accel 14, EE 0, BDF F3:00:07
Inst 57, Affin: 17, Dev: 0, Accel 14, EE 0, BDF F3:00:07
Inst 58, Affin: 18, Dev: 0, Accel 14, EE 0, BDF F3:00:07
Inst 59, Affin: 19, Dev: 0, Accel 14, EE 0, BDF F3:00:07
Inst 60, Affin: 20, Dev: 0, Accel 15, EE 0, BDF F3:01:07
Inst 61, Affin: 21, Dev: 0, Accel 15, EE 0, BDF F3:01:07
Inst 62, Affin: 22, Dev: 0, Accel 15, EE 0, BDF F3:01:07
Inst 63, Affin: 23, Dev: 0, Accel 15, EE 0, BDF F3:01:07
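
Listings like the two above are easier to compare once they are grouped by device. The following is a minimal Python sketch, not part of qatlib, that parses records in the format printed above and groups the reported core affinities by PCI bus. The function names `parse_instances` and `cores_by_bus` are hypothetical, and the record layout is assumed to be exactly the one shown in the listings.

```python
import re
from collections import defaultdict

# Record layout copied from the inst_vf_perf output quoted in this thread.
# \s+ between fields lets a record span a line break.
RECORD = re.compile(
    r"Inst\s+(\d+),\s+Affin:\s+(\d+),\s+Dev:\s+(\d+),\s+Accel\s+(\d+),"
    r"\s+EE\s+(\d+),\s+BDF\s+([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2})"
)


def parse_instances(text):
    """Return one dict per instance record found in text (hypothetical helper)."""
    keys = ("inst", "affin", "dev", "accel", "ee")
    records = []
    for match in RECORD.finditer(text):
        rec = {k: int(v) for k, v in zip(keys, match.groups()[:5])}
        rec["bdf"] = match.group(6).upper()
        records.append(rec)
    return records


def cores_by_bus(instances):
    """Group reported core affinities by the first BDF field (the PCI bus)."""
    groups = defaultdict(set)
    for rec in instances:
        groups[rec["bdf"].split(":")[0]].add(rec["affin"])
    return {bus: sorted(cores) for bus, cores in groups.items()}


# Three records taken from the first listing, one of them split across lines.
sample = """Inst 3, Affin: 3, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 4, Affin: 4, Dev: 1, Accel 1, EE 0, BDF F3:01:00 Inst 8, Affin: 8, Dev: 0, Accel 2,
EE 0, BDF 76:02:00"""
print(cores_by_bus(parse_instances(sample)))
```

Running this on the full first listing would show cores from the whole 0-63 range under both buses, which is the interleaving described above.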

Is there a simple way to configure the core affinities, similar to how affinities were specified via configuration files in the QAT 2.0/1.7 software distribution?

neel-patel-1 commented 8 months ago

Or is there a way to use the old config file format from QAT 2.0/1.7 instead of the qatlib format while still using qatlib?

gcabiddu commented 8 months ago

Hi @neel-patel-1, there isn't a simple way to configure core affinities in qatlib. Also, the old config files are not supported. At the moment, qatlib sets the CoreAffinity attribute by incrementing a counter. Would it be possible to know more about your use case to see how this can be sorted?

Thanks.

neel-patel-1 commented 8 months ago

I am running a two-socket server with a single 4xxx device on each socket. I would like to use a single device and assign one physical core to each VF.

I would like a mapping like:

core 0 -> 76:01:00
core 1 -> 76:02:00
...

as I would prefer cores not to share QAT engines or resources.

I imagine this would be a way to provide optimal performance for each core (assigning a single VF to each), though I am not sure if there is a preferred method to prevent cores from sharing QAT resources under qatlib.
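
To make the requested mapping concrete, here is a minimal Python sketch that builds a "one core per VF" table from a list of BDF strings for a chosen bus. It only computes the desired table. It does not configure qatlib, which (per the reply above) has no setting for this. The function name `one_core_per_vf` and the input list are hypothetical.

```python
def one_core_per_vf(bdfs, bus, cores):
    """Pair each distinct VF on `bus` with its own core, in listing order.

    bdfs  : BDF strings as printed in the listing, repeats allowed
    bus   : first BDF field of the device to use, e.g. "76"
    cores : sequence of core ids the application is willing to dedicate
    """
    prefix = bus.upper() + ":"
    # dict.fromkeys drops duplicates while preserving first-seen order.
    vfs = list(dict.fromkeys(b.upper() for b in bdfs if b.upper().startswith(prefix)))
    if len(cores) < len(vfs):
        raise ValueError("not enough cores for one core per VF")
    return dict(zip(cores, vfs))


# BDFs of the first twelve instances from the two-device listing.
listing = ["76:01:00"] * 4 + ["F3:01:00"] * 4 + ["76:02:00"] * 4
mapping = one_core_per_vf(listing, "76", range(8))
for core, vf in mapping.items():
    print(f"core {core} -> {vf}")
```

An application could then use such a table to pick, for each worker thread, an instance on the VF assigned to that thread's core, instead of relying on the reported affinity.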

Ideally, I would not have cores accessing the QAT device on the other NUMA node, as below:

Inst 0, Affin: 0, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 1, Affin: 1, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 2, Affin: 2, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 3, Affin: 3, Dev: 0, Accel 0, EE 0, BDF 76:01:00
Inst 4, Affin: 4, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 5, Affin: 5, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 6, Affin: 6, Dev: 1, Accel 1, EE 0, BDF F3:01:00
Inst 7, Affin: 7, Dev: 1, Accel 1, EE 0, BDF F3:01:00

Thanks

gcabiddu commented 8 months ago

Thanks @neel-patel-1. There is a change being worked on at the moment to report CoreAffinities associated with NUMA nodes, i.e. the core affinity will always be a core on the same socket as the accelerator.

Temporarily, can this be handled by your application?
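
One way an application might handle this on Linux is to ignore the reported affinity and pin each worker to CPUs on the device's own NUMA node, using the standard sysfs attributes `numa_node` (per PCI device) and `cpulist` (per node). The sketch below is an illustration under assumptions, not qatlib code: the helper names are hypothetical, the three BDF fields in the listings are assumed to be bus:device:function in hex, and the PCI domain is assumed to be `0000`.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def bdf_to_sysfs_name(bdf, domain="0000"):
    """Convert 'F3:01:07' (assumed bus:dev:func) to the sysfs name '0000:f3:01.7'."""
    bus, dev, fn = bdf.lower().split(":")
    return f"{domain}:{bus}:{dev}.{int(fn, 16):x}"


def local_cpus(bdf, sysfs=Path("/sys")):
    """Return the CPUs on the NUMA node that the PCI function is attached to."""
    dev_dir = sysfs / "bus/pci/devices" / bdf_to_sysfs_name(bdf)
    node = int((dev_dir / "numa_node").read_text())
    if node < 0:
        # The kernel reports -1 when it has no NUMA information for the device;
        # fall back to whatever CPUs the caller is already allowed to use.
        return sorted(os.sched_getaffinity(0))
    return parse_cpulist((sysfs / f"devices/system/node/node{node}/cpulist").read_text())


def pin_caller_near(bdf):
    """Restrict the caller (pid 0) to CPUs local to the device. Linux only."""
    cpus = local_cpus(bdf)
    os.sched_setaffinity(0, cpus)
    return cpus


print(bdf_to_sysfs_name("76:00:01"))
print(parse_cpulist("0-3,8-11"))
```

A worker that has selected an instance on, say, `76:01:00` could call `pin_caller_near("76:01:00")` before submitting requests. Pinning one core per VF would additionally need a table like the one requested earlier in the thread.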

neel-patel-1 commented 8 months ago

We would like to be able to affinitize, but stay on kernel version 6.5.7. Do you know if there is a version of the QAT2.0 distribution that is compatible with kernels newer than 5.15.x?

gcabiddu commented 8 months ago

No, I don't know. @jdschuet, do you know?

jdschuet commented 8 months ago

We are in the process of posting an updated QAT2.0 OOT driver. This release has support for 6.x kernels. I expect this release to be available by 3/15 at: https://www.intel.com/content/www/us/en/download/765501.html.

The new release is: QAT20.L.1.1.40-00018

neel-patel-1 commented 8 months ago

That's great. Thank you for letting me know.