kimwalisch closed this 8 years ago
I want to release a new version of primecount this weekend, and I want to start using libdivide on Windows/MSVC as well, so I will merge this pull request now.
Here are some benchmarks for libdivide with the __mulh() and __umulh() intrinsics, built with MSVC 2015 x64:
# primecount without libdivide, slowest
> primecount.exe 1e16 --S2_easy -s
=== S2_easy(x, y) ===
Computation of the easy special leaves
x = 10000000000000000
y = 4117019
z = 2428941911
c = 6
alpha = 19.110
threads = 4
Status: 100%
S2_easy = 63933848726803
Seconds: 4.635
# primecount with libdivide, but without __mulh(), __umulh()
> primecount.exe 1e16 --S2_easy -s
=== S2_easy(x, y) ===
Computation of the easy special leaves
x = 10000000000000000
y = 4117019
z = 2428941911
c = 6
alpha = 19.110
threads = 4
Status: 100%
S2_easy = 63933848726803
Seconds: 3.905
# primecount with libdivide and with __mulh(), __umulh(), fastest
> primecount.exe 1e16 --S2_easy -s
=== S2_easy(x, y) ===
Computation of the easy special leaves
x = 10000000000000000
y = 4117019
z = 2428941911
c = 6
alpha = 19.110
threads = 4
Status: 100%
S2_easy = 63933848726803
Seconds: 3.175
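For readers unfamiliar with the intrinsics benchmarked above: __umulh() and __mulh() return the high 64 bits of an unsigned or signed 64x64-bit product on MSVC x64. The following is only a minimal sketch of how such a high multiply could be selected per compiler. The helper names `mulhi_u64` and `mulhi_s64` are hypothetical, and this is not libdivide's actual code.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>  // declares __umulh() and __mulh() for MSVC x64
#endif

// Hypothetical helper: high 64 bits of the unsigned 128-bit product x * y.
inline std::uint64_t mulhi_u64(std::uint64_t x, std::uint64_t y) {
#if defined(_MSC_VER) && defined(_M_X64)
    return __umulh(x, y);
#elif defined(__SIZEOF_INT128__)
    // GCC/Clang extension: widen to 128 bits and keep the upper half.
    return static_cast<std::uint64_t>(
        (static_cast<unsigned __int128>(x) * y) >> 64);
#else
    // Portable fallback: schoolbook multiplication on 32-bit halves.
    const std::uint64_t mask = 0xFFFFFFFFull;
    const std::uint64_t x0 = x & mask, x1 = x >> 32;
    const std::uint64_t y0 = y & mask, y1 = y >> 32;
    const std::uint64_t p00 = x0 * y0;
    const std::uint64_t p01 = x0 * y1;
    const std::uint64_t p10 = x1 * y0;
    const std::uint64_t p11 = x1 * y1;
    // Carry out of the middle 32-bit column; cannot overflow 64 bits.
    const std::uint64_t mid = (p00 >> 32) + (p01 & mask) + (p10 & mask);
    return p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
#endif
}

// Hypothetical helper: high 64 bits of the signed 128-bit product x * y.
inline std::int64_t mulhi_s64(std::int64_t x, std::int64_t y) {
#if defined(_MSC_VER) && defined(_M_X64)
    return __mulh(x, y);
#else
    // Derive the signed result from the unsigned one: subtract the other
    // operand once for each negative input (arithmetic modulo 2^64).
    std::uint64_t hi = mulhi_u64(static_cast<std::uint64_t>(x),
                                 static_cast<std::uint64_t>(y));
    if (x < 0) hi -= static_cast<std::uint64_t>(y);
    if (y < 0) hi -= static_cast<std::uint64_t>(x);
    return static_cast<std::int64_t>(hi);
#endif
}
```

Division by a precomputed constant of the kind libdivide performs is built on a high multiply followed by shifts, which would explain why a single-instruction intrinsic shows up in the timings above.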
This pull request implements the enhancement suggested in https://github.com/ridiculousfish/libdivide/issues/19. As you can see below, I measured a small speedup using your test program: