andre-richter / pcie-lat

Generic x86_64 PCIe latency measurement module for the Linux kernel
GNU General Public License v2.0

very strange latency result #4

Open lcxone1 opened 4 years ago

lcxone1 commented 4 years ago

Hi Andre, I am using two x86 Intel CPUs with one Microsemi PCIe switch between them. Ubuntu 18.04 is running on both CPUs. I am sure that the PCIe switch is configured correctly and that the PCIe drivers on both CPUs are also correct. But I got the following results, which are very strange. Do you have any ideas about that? Thanks!

    TSC freq:     2095077000.0 Hz
    TSC overhead: 32 cycles
    Device:       b3:00.1
    BAR:          0
    Offset:       0x0
    Loops:        100000

         | Results (100000 samples)
    Mean | 32612.38 cycles | 15566.20 ns
    Stdd | 47215.27 cycles | 22536.29 ns

         | 3σ Results (92867 samples, 0.071% discarded)
    Mean | 19535.33 cycles | 9324.40 ns
    Stdd | 1600.78 cycles | 764.07 ns
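For readers checking the numbers above, here is a minimal sketch of how the cycle counts map to nanoseconds and how a 3σ filter could be applied. It assumes the tool divides cycles by the reported TSC frequency and uses a single-pass filter that keeps samples within three standard deviations of the mean; the function names `cycles_to_ns` and `three_sigma_filter` are hypothetical and are not taken from pcie-lat.

```python
import statistics

# "TSC freq" value copied from the output above.
TSC_HZ = 2095077000.0


def cycles_to_ns(cycles, tsc_hz=TSC_HZ):
    """Convert a TSC cycle count to nanoseconds."""
    return cycles / tsc_hz * 1e9


def three_sigma_filter(samples):
    """Keep samples within 3 standard deviations of the mean.

    Assumption: a single pass using the population standard deviation.
    The actual tool may filter differently.
    Returns the kept samples and the percentage that was discarded.
    """
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    kept = [s for s in samples if abs(s - mean) <= 3 * std]
    discarded_pct = 100.0 * (len(samples) - len(kept)) / len(samples)
    return kept, discarded_pct


if __name__ == "__main__":
    # Means reported above, in cycles.
    print(round(cycles_to_ns(32612.38), 2))
    print(round(cycles_to_ns(19535.33), 2))
```

Under these assumptions the converted means agree with the reported 15566.20 ns and 9324.40 ns. Note also that 92867 kept out of 100000 samples corresponds to about 7.1% discarded, which suggests the printed "0.071%" is a fraction rather than a percentage.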

andre-richter commented 4 years ago

It's been a very long time since I last worked on those topics. You basically get 9 µs delay for CPU-to-CPU PCIe reads channeled through a Microsemi switch?

That sounds awfully slow, true.

What is the BAR and register you are reading from? Does the time differ when you read from a different Offset?

lcxone1 commented 4 years ago

I tried BAR2 and the result looks much better:

    TSC freq:     2095077000.0 Hz
    TSC overhead: 32 cycles
    Device:       b3:00.1
    BAR:          2
    Offset:       0x0
    Loops:        1000000

         | Results (1000000 samples)
    Mean | 1667.17 cycles | 795.75 ns
    Stdd | 123.97 cycles | 59.17 ns

         | 3σ Results (971452 samples, 0.029% discarded)
    Mean | 1648.78 cycles | 786.98 ns
    Stdd | 30.49 cycles | 14.55 ns

Then I looked into the PCIe switch settings and found that only BAR2 is enabled for the non-transparent direct window. So I assume BAR2 is the right one and this result makes sense. Thanks for the hint!

However, the latency reported by the Microsemi switch program is between 85 ns and 140 ns. Could you tell me which latency I am measuring here (786.98 ns)? Is it the latency of CPU1 -> PCIe switch -> CPU2 -> PCIe switch -> CPU1, or of CPU1 -> PCIe switch -> CPU1?

andre-richter commented 4 years ago

> However, the latency reported by the Microsemi switch program is between 85 ns and 140 ns. Could you tell me which latency I am measuring here (786.98 ns)? Is it the latency of CPU1 -> PCIe switch -> CPU2 -> PCIe switch -> CPU1, or of CPU1 -> PCIe switch -> CPU1?

I fear I cannot help much here without knowing lots of low-level details of your setup, sorry :(
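As a rough way to frame the open question, the two figures from the thread can be compared with simple arithmetic. The sketch below rests on assumptions that the thread does not confirm: that the 85 ns to 140 ns figure is a one-way latency per switch traversal, and that a timed read crosses the switch twice (once for the request and once for the completion). The name `remainder_ns` is hypothetical.

```python
# Figures quoted in the thread.
MEASURED_NS = 786.98      # 3σ mean for BAR 2
SWITCH_NS = (85.0, 140.0)  # range reported by the switch vendor's program

# Assumption: one traversal for the read request, one for the completion.
TRAVERSALS = 2


def remainder_ns(measured_ns, per_traversal_ns, traversals):
    """Time left over once the assumed switch traversals are subtracted."""
    return measured_ns - per_traversal_ns * traversals


if __name__ == "__main__":
    most = remainder_ns(MEASURED_NS, SWITCH_NS[0], TRAVERSALS)
    least = remainder_ns(MEASURED_NS, SWITCH_NS[1], TRAVERSALS)
    print(f"outside the switch: {least:.2f} ns to {most:.2f} ns")
```

If those assumptions hold, roughly 507 ns to 617 ns of the measured value would be spent outside the switch itself, for example in the root complexes and in whatever services the read on the far side. Without the low-level details of the setup this remains only an estimate.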