bwakell / Huldra

Outperforming BigInteger since 2015. Possibly providing the fastest available arbitrary-precision arithmetic integer classes purely written in Java. Because performance matters (and we like Java)!
Other
50 stars 13 forks source link

Intrinsics and other performance issues #13

Open TilmanNeumann opened 4 years ago

TilmanNeumann commented 4 years ago

Hi Simon, I love your project, but I'ld like to point out that it is not beating Java's BigInteger in every respect.

Since Java 8, BigInteger may use intrinsics, say optimized assembler code, and that is hard to beat.

With intrinsics, BigInteger.multiplyToLen() is about twice as fast as your "naive multiplication" implementation.

BigInteger.squareToLen() has been prepared for intrinsics in Java8, but on my platform (x86, Windows) the assembler code has not been added before Java9. If intrinsics are present, the squaring code is even better than the multiplication intrinsics.

It seems to be possible to call the intrinsic methods by reflection without notable performance loss. But since your implementation is little-endian and Java uses big-endian encoding, reverting arrays before and after the invocation is required.


Another performance issue is division. Java's BigInteger implements the Burnikel-Ziegler algorithm for larger arguments, and my tests suggest that it is better than your Knuth implementation for inputs > 10000 bits.

bwakell commented 4 years ago

Yeah the project was originally developed comparing against Java 7 (Java 8 had not been around for long at the time). And it seems that quite some improvements were done in Java 8, e.g. using Karatsuba (maybe the better division algorithm was also part of that?).

Could be worth exploring using the intrinsics. One issue/tricky case that could arise is that if the intrinsic is not available, then we would be reversing arrays in vain. :/


And yeah we should try to use the better division algorithm (and we should also try to have Toom-Cook and FFT), though I doubt I'll get around to do it in the next N years. ^^

TilmanNeumann commented 4 years ago

Could be worth exploring using the intrinsics. One issue/tricky case that could arise is that if the intrinsic is not available, then we would be reversing arrays in vain. :/

The reversals could be avoided if your BigInt were in big-endian encoding (which is admittedly quite ugly), but they are not the only problem. If the intrinsics are not present then the slow BigInteger Java code will be executed. So I think one should either let the user configure if he wants to call the BigInteger methods, or the program would have to do some diagnostics at startup, i.e. compare the performance of your implementation with the Java implementation.

Another point is that multiplyToLen() is private. So we have to make that method accessible before we can call it. Java9 and above will complain that this is an illegal method access and that the possibility might be removed in future releases. Thus the reflection approach is not future-safe.

https://www.slideshare.net/RednaxelaFX/green-teajug-hotspotintrinsics02232013 gives a few different options. One could create one's own intrinsics, but that requires to build the Hotspot VM. Graal might be a way to get around without that.

Whatever approach one chooses, it seems that it will need a huge effort to get better than Java in any respect...