This repo benchmarks various trailing zero removal algorithms for use in Dragonbox. Discussions of the algorithms listed here can be found here.
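For orientation, here is a minimal sketch of what "trailing zero removal" means for a 32-bit input. It shows a naïve division loop next to a Granlund-Montgomery style divisibility test. The function and struct names are hypothetical and are not taken from this repo's sources. The constants are the standard modular inverse of 5 modulo 2^32 and the largest quotient `(2^32 - 1) / 10`. The benchmarked implementations may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the stripped number and how many zeros were removed.
struct rtz_result {
    std::uint32_t significand;
    int removed;
};

// Naive approach: divide by 10 for as long as the remainder is zero.
// Precondition: n != 0 (zero would never terminate).
inline rtz_result remove_trailing_zeros_naive(std::uint32_t n) {
    assert(n != 0);
    int s = 0;
    while (n % 10 == 0) {
        n /= 10;
        ++s;
    }
    return {n, s};
}

// Granlund-Montgomery style: multiply by the modular inverse of 5 (mod 2^32)
// and rotate right by 1 to divide out the factor 2. The result is at most
// (2^32 - 1) / 10 exactly when n is divisible by 10, and in that case the
// result is already the quotient n / 10.
// Precondition: n != 0.
inline rtz_result remove_trailing_zeros_gm(std::uint32_t n) {
    assert(n != 0);
    constexpr std::uint32_t mod_inv_5 = UINT32_C(0xCCCCCCCD);     // 5 * mod_inv_5 == 1 (mod 2^32)
    constexpr std::uint32_t max_quotient = UINT32_C(429496729);   // (2^32 - 1) / 10
    int s = 0;
    while (true) {
        std::uint32_t const m = n * mod_inv_5;                    // wraps modulo 2^32
        std::uint32_t const r = (m >> 1) | (m << 31);             // rotate right by 1
        if (r > max_quotient) {
            break;                                                // n is not divisible by 10
        }
        n = r;
        ++s;
    }
    return {n, s};
}
```

Both functions return the same result for every nonzero input. The second one replaces the division and remainder in the loop with a multiplication, a rotation, and a comparison.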
Here are the benchmark results I got (on 04/20/2024, on Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz, Windows 10 Laptop):
Algorithm | Average time per sample |
---|---|
Null (baseline) | 1.4035ns |
Naïve | 12.7084ns |
Granlund-Montgomery | 11.8153ns |
Lemire | 12.2671ns |
Generalized Granlund-Montgomery | 11.2075ns |
Naïve 2-1 | 8.92781ns |
Granlund-Montgomery 2-1 | 7.85643ns |
Lemire 2-1 | 7.60924ns |
Generalized Granlund-Montgomery 2-1 | 7.85875ns |
Naïve branchless | 3.30768ns |
Granlund-Montgomery branchless | 2.52126ns |
Lemire branchless | 2.71366ns |
Generalized Granlund-Montgomery branchless | 2.51748ns |

Algorithm | Average time per sample |
---|---|
Null (baseline) | 1.68744ns |
Naïve | 16.5861ns |
Granlund-Montgomery | 14.1657ns |
Lemire | 14.3427ns |
Generalized Granlund-Montgomery | 15.0626ns |
Naïve 2-1 | 13.2377ns |
Granlund-Montgomery 2-1 | 11.3316ns |
Lemire 2-1 | 11.6016ns |
Generalized Granlund-Montgomery 2-1 | 11.8173ns |
Naïve 8-2-1 | 12.5984ns |
Granlund-Montgomery 8-2-1 | 11.0704ns |
Lemire 8-2-1 | 13.3804ns |
Generalized Granlund-Montgomery 8-2-1 | 11.1482ns |
Naïve branchless | 5.68382ns |
Granlund-Montgomery branchless | 4.0157ns |
Lemire branchless | 4.92971ns |
Generalized Granlund-Montgomery branchless | 4.64833ns |
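The "2-1" rows can be read as: strip two zeros at a time while possible, then try one final single zero. That reading is an assumption based on the row names, not something stated in this README. Under that assumption, a minimal sketch of the naïve variant for 32-bit input could look as follows. The names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the stripped number and how many zeros were removed.
struct rtz_result {
    std::uint32_t significand;
    int removed;
};

// Assumed "2-1" strategy: remove pairs of zeros with a divisor of 100,
// then check for at most one remaining zero with a divisor of 10.
// Precondition: n != 0.
inline rtz_result remove_trailing_zeros_naive_2_1(std::uint32_t n) {
    assert(n != 0);
    int s = 0;
    while (n % 100 == 0) {   // two digits per iteration
        n /= 100;
        s += 2;
    }
    if (n % 10 == 0) {       // after the loop at most one zero can remain
        n /= 10;
        s += 1;
    }
    return {n, s};
}
```

Compared with a loop that divides by 10 each time, this roughly halves the number of iterations for inputs with many trailing zeros.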
Notes.
See the BUILDING document.
See the CONTRIBUTING document.
All code is licensed under the Boost Software License, Version 1.0 (LICENSE-Boost or https://www.boost.org/LICENSE_1_0.txt).