Closed by saxbophone 2 years ago
After experimenting with a modified approach that divides linearly by the largest value that fits in uintmax_t, we get the following stats (including the large commented-out test case):
```
real 6m23.361s
user 5m49.330s
sys  0m2.260s
```
If this version is to be used, it needs to be modified so that the leading digits are not padded out to max_possible digits, but printed with only the minimum number of digits the value needs (i.e. no leading-zero padding).
Stats are now (in d717e85):
```
real 5m57.145s
user 5m43.861s
sys  0m0.861s
```
This is relatively good!
Not only is the version at the tip of this branch faster than the experimental recursive one, but running it with compiler optimisation is even faster than the optimised recursive one. We'll definitely use the non-recursive version!
Timing statistics for the $2^{4000}$ test case (currently commented out due to runtime):
With optimisation:
Without optimisation:
This shows that while the runtime for converting large numbers to a decimal string is still long, this PR improves it significantly, aside from some experiments that swapped binary recursion for linear divide-and-conquer in uintmax_t-sized chunks.
I believe further efficiency gains are unlikely to be found in the methods changed in this PR; they would probably have to come from improvements to the arithmetic methods themselves. A detailed profile of the code would show more precisely which functions take the most time.
Closes #67