mario-tux closed this issue 5 years ago
Your processor looks very recent, but we don't have a lot of support for recent CPUs. Are you sure it is supported properly in MPIR? What does ./config.guess return?
Basically MPIR development ground to a halt some years ago, as we made the project community supported. So far, few people have stepped up to continue development.
I'm not sure that fat mode is a good idea if you want performance. You are best letting the system configure MPIR for your processor automatically. Fat mode will add extra overhead on every operation.
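To make the overhead concrete, here is a minimal sketch of the kind of run-time dispatch a fat build relies on: each public routine goes through a function pointer that is resolved on first use. This is an illustration only, not MPIR's actual code; the names `fat_add_n`, `add_n_ptr`, `add_n_resolve` and `add_n_generic` are hypothetical, and the resolver here always picks the portable routine where a real build would inspect the CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature of a limb-wise addition routine. */
typedef uint64_t (*add_n_fn)(uint64_t *r, const uint64_t *a,
                             const uint64_t *b, size_t n);

/* Portable routine: r = a + b over n limbs, returns the carry out. */
static uint64_t add_n_generic(uint64_t *r, const uint64_t *a,
                              const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;
        uint64_t c1 = s < carry;      /* overflow from adding the carry in */
        r[i] = s + b[i];
        carry = c1 | (r[i] < s);      /* overflow from adding b[i] */
    }
    return carry;
}

static uint64_t add_n_resolve(uint64_t *r, const uint64_t *a,
                              const uint64_t *b, size_t n);

/* The dispatch slot starts out pointing at the resolver. */
static add_n_fn add_n_ptr = add_n_resolve;

/* First call lands here. A real fat build would query the CPU and install
 * a tuned routine; this sketch always installs the generic one. */
static uint64_t add_n_resolve(uint64_t *r, const uint64_t *a,
                              const uint64_t *b, size_t n)
{
    add_n_ptr = add_n_generic;
    return add_n_ptr(r, a, b, n);
}

/* Public entry point: every call pays for one indirect jump. */
uint64_t fat_add_n(uint64_t *r, const uint64_t *a,
                   const uint64_t *b, size_t n)
{
    return add_n_ptr(r, a, b, n);
}
```

The indirect call is cheap on large operands but shows up when the operands are only a few limbs long, which is the case in tight modular-arithmetic loops.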
Incidentally, Intel CPU cycle counting is no longer reliable. It no longer counts CPU cycles, and the numbers you get from the counter depend on many factors. You are better off using an actual clock for timings.
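As a rough sketch of what timing with an actual clock could look like, the following uses the POSIX `clock_gettime` call with `CLOCK_MONOTONIC`. The helper names `now_ns`, `time_ns_per_call` and `busy_loop` are made up for this example, and the workload is a placeholder for the operation under test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime under strict modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Calls fn(arg) `reps` times and returns the average nanoseconds per call. */
static double time_ns_per_call(void (*fn)(void *), void *arg, unsigned reps)
{
    assert(reps > 0);
    uint64_t start = now_ns();
    for (unsigned i = 0; i < reps; i++)
        fn(arg);
    uint64_t stop = now_ns();
    return (double)(stop - start) / reps;
}

/* Placeholder workload: sums the integers below *(unsigned *)arg. */
static void busy_loop(void *arg)
{
    volatile unsigned long sink = 0;   /* volatile keeps the loop alive */
    unsigned n = *(unsigned *)arg;
    for (unsigned i = 0; i < n; i++)
        sink += i;
}
```

Averaging over many repetitions keeps the cost of the two clock reads small relative to the measured work; `time_ns_per_call(busy_loop, &n, 1000)` would give one such average.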
> Your processor looks very recent, but we don't have a lot of support for recent CPUs. Are you sure it is supported properly in MPIR? What does ./config.guess return?

This is the result with MPIR: `skylakeavx-unknown-linux-gnu`. GMP recognizes it as `skylake-pc-linux-gnu`.
> I'm not sure that fat mode is a good idea if you want performance. You are best letting the system configure MPIR for your processor automatically. Fat mode will add extra overhead on every operation.

Oh, this is new to me. I was misled by the description: I didn't think the CPU detection and version management was so bad. I disabled it and all the strange behaviors are gone (with better numbers too). Thanks.
> Incidentally, Intel CPU cycle counting is no longer reliable. It no longer counts CPU cycles, and the numbers you get from the counter depend on many factors. You are better off using an actual clock for timings.

I studied the topic a bit: in the past there were problems, but now, with modern CPUs supporting the `rdtscp` instruction and the CPU features (using Linux flag names) `constant_tsc` and `nonstop_tsc`, all the issues should be gone. It should be the most precise method with low (sampling) overhead. I think the high-precision clock implementations of all the main OSs are based on it.
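A minimal sketch of how one might check for those three flags on Linux follows. The helper names (`has_flag`, `tsc_looks_invariant`, `read_cpu_flags`) are hypothetical, and the check assumes the usual x86 `/proc/cpuinfo` layout with a line starting with `flags`; it says nothing about other platforms.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `name` occurs as a whole whitespace-delimited word in `flags`. */
static int has_flag(const char *flags, const char *name)
{
    size_t len = strlen(name);
    const char *p = flags;
    if (len == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' ||
                   p[len] == '\t' || p[len] == '\n';
        if (starts && ends)
            return 1;
        p += 1;   /* keep scanning after a partial match such as "rdtscp" */
    }
    return 0;
}

/* Returns 1 if all three flags named in the discussion are present. */
static int tsc_looks_invariant(const char *flags)
{
    return has_flag(flags, "rdtscp") &&
           has_flag(flags, "constant_tsc") &&
           has_flag(flags, "nonstop_tsc");
}

/* Copies the first "flags" line of /proc/cpuinfo into buf.
 * Returns 0 on success, -1 if the file or the line is missing.
 * The line can be long, so the caller should pass a large buffer. */
static int read_cpu_flags(char *buf, size_t size)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    int found = -1;
    if (f == NULL)
        return -1;
    while (fgets(buf, (int)size, f) != NULL) {
        if (strncmp(buf, "flags", 5) == 0) {
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A caller would pass the buffer filled by `read_cpu_flags` to `tsc_looks_invariant` and fall back to an OS clock when the result is 0.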
I usually considered MPIR faster than GMP in almost all operations, but I recently faced some disorienting benchmarks of more complex operations (I was trying to implement, on top of MPIR/GMP operations, a modular exponentiation with sliding window on a fixed base exploiting precomputation).
In order to investigate I did some micro-benchmarks on the elementary operations. These are the reports:
All the exponentiations are faster in MPIR (as expected). What is slower in MPIR are modular multiplication (mul+mod) and modular squaring (mul+mod).
Some details:
- `mpz_mul(n3, n1, n2)`, so a non-in-place operation;
- `mpz_mod(n3, n1, mod)`, with `n1` double the size of `mod`;
- `mpz_mul(n3, n1, n2); mpz_mod(n3, n3, mod)`; it looks slower than just the sum of the two previous tests: my guess is that it is related to cache usage; but the main fact is that this operation is VERY slow on MPIR w.r.t. GMP!
- `mpz_mul(n3, n1, n1); mpz_mod(n3, n3, mod)`.

I would like to understand such behavior and, maybe, get better performance from my fixed-base exponentiation using MPIR (it is now very slow because of the square/multiplication steps).
Side note: why have GMP/MPIR never included a modular exponentiation exploiting a fixed base? :)
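For readers unfamiliar with the idea, here is a small sketch of fixed-base exponentiation with precomputation. It is not the code from this report and it does not use GMP or MPIR: to stay self-contained it works on machine words and assumes the modulus fits in 32 bits, so that products fit in 64 bits. It also uses a plain fixed-window table rather than a sliding window. All names (`fixed_base_ctx`, `fixed_base_init`, `fixed_base_pow`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_BITS 4
#define WINDOW_SIZE (1u << WINDOW_BITS)   /* 16 possible digit values */
#define NUM_WINDOWS (64 / WINDOW_BITS)    /* 16 digits in a 64-bit exponent */

/* table[i][j] holds base^(j * 16^i) mod `mod`. */
typedef struct {
    uint64_t mod;
    uint64_t table[NUM_WINDOWS][WINDOW_SIZE];
} fixed_base_ctx;

/* Valid only for a, b < mod <= 2^32 - 1, so a * b cannot overflow. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t mod)
{
    return (a * b) % mod;
}

/* One-time precomputation for a given base and modulus. */
static void fixed_base_init(fixed_base_ctx *ctx, uint64_t base, uint64_t mod)
{
    assert(mod > 0 && mod <= UINT32_MAX);
    ctx->mod = mod;
    uint64_t g = base % mod;              /* g = base^(16^i) for window i */
    for (int i = 0; i < NUM_WINDOWS; i++) {
        ctx->table[i][0] = 1 % mod;
        for (unsigned j = 1; j < WINDOW_SIZE; j++)
            ctx->table[i][j] = mulmod(ctx->table[i][j - 1], g, mod);
        /* g^16 = g^15 * g becomes the generator of the next window. */
        g = mulmod(ctx->table[i][WINDOW_SIZE - 1], g, mod);
    }
}

/* base^exp mod `mod` with at most one multiplication per 4-bit digit
 * and no squarings at all. */
static uint64_t fixed_base_pow(const fixed_base_ctx *ctx, uint64_t exp)
{
    uint64_t result = 1 % ctx->mod;
    for (int i = 0; exp != 0; i++, exp >>= WINDOW_BITS) {
        unsigned digit = (unsigned)(exp & (WINDOW_SIZE - 1));
        if (digit != 0)
            result = mulmod(result, ctx->table[i][digit], ctx->mod);
    }
    return result;
}
```

The trade-off is memory for time: the table costs 256 words per base here, and every later exponentiation with that base needs only modular multiplications, which is why the speed of mul+mod dominates this kind of routine.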