saxbophone / arby

Arbitrary precision arithmetic in C++, even at compile-time
https://saxbophone.com/arby/
Mozilla Public License 2.0

Math support: pow() --use log for efficiency increase #123

Open · saxbophone opened this issue 2 years ago

saxbophone commented 2 years ago

Current code for pow() uses divide-and-conquer recursion to avoid a degenerate-case linear recursion, which is extremely slow for large exponents.

However, we still repeatedly divide the exponent by $2$ manually to work out, roughly, how many powers of $2$ it contains. We then apply the laws of exponents to combine smaller powers into increasingly larger ones.

This repeated division is equivalent to calculating the base-2 logarithm of the exponent, $\log_2(x)$. We should consider using our ilog() function to do this instead.

Whenever our exponent is not an exact power of $2$, there will be an additional bit that needs to be dealt with to get the full result.
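To illustrate the equivalence (using fixed-width `uint64_t` as a stand-in for arby's arbitrary-precision types, and a hypothetical `floor_log2` name), the repeated halving described above is exactly a floor base-2 logarithm:

```cpp
#include <cstdint>

// Count how many times x can be halved before reaching 1.
// This is floor(log2(x)) -- the quantity the repeated division
// in pow() currently computes implicitly.
// Precondition: x > 0.
std::uint64_t floor_log2(std::uint64_t x) {
    std::uint64_t p = 0;
    while (x > 1) {
        x /= 2;  // one "repeated division" step
        ++p;
    }
    return p;
}
```

For exponents that are not exact powers of 2 (e.g. 10), `floor_log2` rounds down (to 3), leaving the extra bits to be handled separately.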

saxbophone commented 2 years ago

This is related to tetration, the fourth-order hyper-operation.

Note: it's unclear whether we can gain any actual performance benefit by going down this route. So far, all I can prove in my head is that we can use ilog() to pre-calculate how many iterations of repeated exponentiation we need. However, this adds an additional loop to our function, as we still have to apply the exponentiations iteratively ourselves...

saxbophone commented 2 years ago

Having said this, the current implementation uses recursion, which is vulnerable to stack overflow on very large inputs. If we go ahead with the ilog()-based improvement, we would convert the recursion to iteration by replacing most of the exponentiations with repeated multiplication inside a loop, rather than recursive calls to self.

We might still have a few recursive calls for the base cases and to handle the "extra bit" that doesn't fit within a power of 2, but this will be a minority of calls.
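As a sketch of what a fully iterative version could look like (again with `uint64_t` standing in for arby's types, and the name `ipow` being hypothetical), square-and-multiply walks the bits of the exponent in a loop, using constant stack space with no recursion at all:

```cpp
#include <cstdint>

// Iterative exponentiation by squaring: processes one bit of the
// exponent per iteration, so it needs floor(log2(exponent)) + 1
// loop passes and no recursive calls.
std::uint64_t ipow(std::uint64_t base, std::uint64_t exponent) {
    std::uint64_t result = 1;
    while (exponent > 0) {
        if (exponent % 2 == 1) {  // low bit set: fold this power in
            result *= base;
        }
        base *= base;    // square, ready for the next bit
        exponent /= 2;   // move on to the next bit
    }
    return result;
}
```

In this shape the "extra bit" cases need no special handling: every set bit of the exponent, power-of-2-aligned or not, is folded in by the same `result *= base` step.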

saxbophone commented 2 years ago

We could also get rid of repeated calls to divmod() in making this change, which would be a significant performance increase, assuming we don't do significant further division as a result of its removal.

Addendum: ilog() only uses multiplication and comparison, so there's no division performance issues, at least not caused by using ilog()...
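A division-free integer log can indeed be written with only multiplication and comparison. A minimal sketch (this is not arby's actual ilog() implementation, whose signature I'm assuming; `uint64_t` again stands in for the real types):

```cpp
#include <cstdint>

// floor(log_base(n)) using only multiplication and comparison.
// Preconditions: base >= 2, n >= 1.
// (With fixed-width integers, `power *= base` could overflow for
// very large n; arby's arbitrary-precision types avoid that.)
std::uint64_t ilog(std::uint64_t base, std::uint64_t n) {
    std::uint64_t p = 0;
    std::uint64_t power = base;  // invariant: power == base^(p + 1)
    while (power <= n) {
        power *= base;  // multiplication only -- no division anywhere
        ++p;
    }
    return p;
}
```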

saxbophone commented 2 years ago

I'm pretty sure it is hyper-powers/tetration that we can use here, initial simple proof (where left-superscript defines tetration):

$$ 2^8 = 2^{(2^3)} = 2^\prescript{3}{}{2} = ((2^2)^2)^2 = 256 $$

Attempting to generalise with a base that differs from 2 and any intermediate exponents:

$$ 5^8 = 5^{(2^3)} = 5^\prescript{3}{}{2} = ((5^2)^2)^2 = 390625 $$

I'm not sure this is actually true tetration in the literal sense, but it is certainly related. Pseudocode for the generalised algorithm:

with b as base and x as exponent, raise b to the power of x as follows:

```python
p = log2(x)

a = b
for n in range(p):
    a = a * a

return a
```

Note: this algorithm doesn't fully work when log is an integer (floor) log function, but that can be accounted for easily enough. It demonstrates that the main iterative part of the algorithm uses no division, only multiplication.

Handling the leftover part (assuming $p$ is a floor-log) will simply require calculating $b^r$ for the remainder $r = x - 2^p$ and multiplying what we have so far by it. This is the only part of the algorithm that is recursive, but it is tail-recursive. We could also curtail the recursion by switching to an iterative approach when $x$ is small enough, or unroll the recursion into further iteration.
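Putting the pieces together, the pseudocode plus the leftover-bit handling could look like this in C++ (a sketch only: `uint64_t` stands in for arby's types, `floor_log2` is a placeholder for ilog() with base 2, and the recursion is unrolled into the outer loop):

```cpp
#include <cstdint>

// floor(log2(x)) for x > 0, by repeated halving (stand-in for ilog()).
std::uint64_t floor_log2(std::uint64_t x) {
    std::uint64_t p = 0;
    while (x > 1) { x /= 2; ++p; }
    return p;
}

// b^x via the log-based scheme: square p = floor(log2(x)) times to
// get b^(2^p), fold it into the result, then repeat on the leftover
// r = x - 2^p until nothing remains. No recursion, no division in
// the squaring loop.
std::uint64_t log_pow(std::uint64_t b, std::uint64_t x) {
    std::uint64_t result = 1;
    while (x > 0) {
        std::uint64_t p = floor_log2(x);
        std::uint64_t a = b;
        for (std::uint64_t n = 0; n < p; ++n) {
            a *= a;  // repeated squaring: a becomes b^(2^p)
        }
        result *= a;
        x -= static_cast<std::uint64_t>(1) << p;  // strip the handled bit
    }
    return result;
}
```

For $x = 10$, this takes two passes: one for $b^8$ (three squarings) and one for the leftover $b^2$ (one squaring), matching the worked examples above.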