Definition of latency - Githubissues

travisdowns commented 5 years ago

What is the definition of latency that you want to use exactly?

In particular, consider a hypothetical operation foo arg1, arg2, arg3 which is 3p0, i.e. three uops that all run on port 0. This op will have a throughput of one per 3 cycles due to p0 pressure. Can this op have any latency less than 3? I think yes.

For example, the op might only have a 1 cycle delay from arg2->arg1, because the first two uops only use arg3, and only the last uop uses arg2 (along with arg3).
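To make the argument concrete, here is a minimal sketch of the hypothetical foo op as a toy timing model. It is not how nanoBench or any real scheduler works. The names run_foo and cycles_per_iteration are made up for illustration. The model assumes each uop has 1-cycle latency, that filler ops run on other ports at 1 cycle each, and (as a simplification) that the next foo cannot start on p0 until the previous foo's last uop has issued.

```python
# Toy model of the hypothetical "foo arg1, arg2, arg3" (3 uops, all on p0).
# Assumed dataflow: uop0 and uop1 read only arg3; uop2 reads arg2 and
# writes arg1. All uops are assumed to have 1-cycle latency.

def run_foo(arg2_ready, arg3_ready, port_free):
    """Return (cycle when arg1 is ready, cycle when p0 is free again)."""
    u0 = max(arg3_ready, port_free)   # first uop needs arg3 and a free p0
    u1 = u0 + 1                       # second uop issues on p0 one cycle later
    u2 = max(u1 + 1, arg2_ready)      # last uop additionally waits for arg2
    done = u2 + 1
    return done, done


def cycles_per_iteration(filler_ops, iterations=100):
    """Steady-state cycles per loop iteration for a foo -> fillers -> foo
    chain that is carried through arg2, with arg3 always ready."""
    arg2_ready = 0
    port_free = 0
    finish_times = []
    for _ in range(iterations):
        result_ready, port_free = run_foo(arg2_ready, 0, port_free)
        # Each filler op adds one cycle to the dependency chain.
        arg2_ready = result_ready + filler_ops
        finish_times.append(arg2_ready)
    return (finish_times[-1] - finish_times[0]) / (iterations - 1)


back_to_back = cycles_per_iteration(0)   # throughput-bound: 3.0
with_fillers = cycles_per_iteration(4)   # latency-bound: 5.0
arg2_latency = with_fillers - 4          # 1.0 once the fillers are subtracted
```

In this model the back-to-back chain reports 3 cycles, which is the p0 limit, while the chain with four fillers reports 5 cycles, which leaves 1 cycle for arg2->arg1 after subtracting the fillers.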

However, testing back-to-back foo ops will never show it because of the throughput limit. I think you are probably well aware of this, since I notice lots of filler uops in the tests, like:

   0:   c4 42 38 f2 ca          andn   r9d,r8d,r10d
   5:   4d 63 c1                movsxd r8,r9d
   8:   4d 63 c8                movsxd r9,r8d
   b:   4d 63 c1                movsxd r8,r9d

All the movsxd instructions give enough breathing room to avoid lots of problems of this type.

However, consider gathers. For 1->1 latency testing this is used:

vpgatherdd ymm0,DWORD PTR [r14+ymm14*1],ymm1

No breathing room, so all these results just end up reporting the throughput number (5 in this case).

The following test:

vpgatherdd ymm0,DWORD PTR [r14+ymm14*1],ymm1
vpor ymm0,ymm0,ymm0
vpor ymm0,ymm0,ymm0
vpor ymm0,ymm0,ymm0
vpor ymm0,ymm0,ymm0

also runs in 5 cycles. The four dependent vpor instructions account for 4 of those cycles (1 cycle each), so we see the true 1->1 latency is 1 cycle.
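The arithmetic behind that conclusion can be written as a small helper. This is only a sketch: latency_bound is a hypothetical name, the cycle counts are the ones quoted in this thread, and the 1-cycle vpor latency is an assumption of the calculation.

```python
def latency_bound(cycles_per_iteration, filler_count, filler_latency=1):
    """Upper bound on the latency of the instruction under test, given the
    measured cycles per iteration of a dependency chain that contains it
    plus filler_count filler instructions of known latency."""
    return cycles_per_iteration - filler_count * filler_latency


# Gather measured alone: the bound is just the throughput number.
bare_gather = latency_bound(5, 0)    # 5

# Gather followed by four dependent vpor (assumed 1 cycle each).
with_fillers = latency_bound(5, 4)   # 1
```

Without fillers the bound cannot drop below the throughput limit, which is why the bare gather test reports 5.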

andreas-abel commented 4 years ago

This should be fixed with the latest update.

travisdowns commented 4 years ago

Thanks!

In general, would we expect the numbers on uops.info to be updated with each nanobench update? I assume you may not have access to all the machines, so I'm not sure.

andreas-abel commented 4 years ago

I currently do have access to all the machines, so I re-ran the tests on all of them. However, there is no guarantee that this will still be the case with future updates.

travisdowns commented 4 years ago

Thanks, is there a place in the uops.info output we can look to see which version/build of nanobench was used?

andreas-abel commented 4 years ago

No. However, the XML file contains the date when it was generated, which should make it possible to find the corresponding version.

Also, I should point out that nanoBench is just the tool that runs the microbenchmarks. The tool that generates them is not public yet. With "update" above I was referring to the update of the website.

travisdowns commented 4 years ago

Thanks Andreas.



andreas-abel / nanoBench

Definition of latency #6