sageserpent-open / kineticMerge

Merge a heavily refactored codebase and stay sane.
MIT License

Go faster! #25

Closed — sageserpent-open closed this issue 4 months ago

sageserpent-open commented 4 months ago

Performing the merge commit in Americium: https://github.com/sageserpent-open/americium/commit/576c7636dad94e5b5aeb7987b7badccfc707a5e0 yields a perfectly good result, but takes about 3 minutes on a reasonably sprightly Macbook.

The overwhelming majority of this time is spent looking for very long matches across the commits for a single big file. The search algorithm efficiently homes in on the optimal match sizes, but the match computations take a long time - could be the fingerprinting, could be the matching of fingerprints...

Need to find out the bottleneck and either optimise it or parallelise execution. Inspecting the running process with VisualVM shows that the single-threaded execution uses only about 6 or 7% of the available capacity.
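For reference, the parallelisation option could be pursued by farming out the per-section fingerprinting to a thread pool. A minimal sketch, assuming a hypothetical `fingerprint` stand-in (the real code fingerprints sections of file content, not strings):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Hypothetical stand-in for the real per-section fingerprinting work.
def fingerprint(section: String): Int = section.hashCode

// Fingerprint each section as its own task on the global thread pool,
// instead of walking through the sections serially.
def fingerprintAll(sections: Seq[String]): Seq[Int] =
  Await.result(
    Future.traverse(sections)(section => Future(fingerprint(section))),
    Duration.Inf
  )
```

This only pays off if the per-section work dominates the task-scheduling overhead, which the profiling below should confirm or refute.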

sageserpent-open commented 4 months ago

Seen in version 0.2.22.

sageserpent-open commented 4 months ago

Some good news, though - GC activity is very lean, and memory usage mostly stays put at around 480 MB of heap size, with the used heap fluctuating between 100 MB and 350 MB.

sageserpent-open commented 4 months ago

(Screenshot: 2024-02-09 at 13:31:36)

Looking at the CPU samples, there are roughly 180 000 milliseconds of time in total, of which 50 000 are spent in MatchesAndTheirSections.matchingFingerprintsAcrossSides and 130 000 in MatchesAndTheirSections.fingerprintSections.

So it is the fingerprinting that takes the lion's share of the time.

Working down the time breakdown of the lower-level function calls, the figure stays at around 120 000 to 130 000 milliseconds until we hit the call to BigInt.mod. From there on the time is unaccounted for below java.math.BigInteger.mod, but that call is clearly a huge timesink.

Let's redo the profiling with more libraries under scrutiny...

sageserpent-open commented 4 months ago

Let's include the Java runtime library and zoom in:

(Screenshot: 2024-02-09 at 13:50:01)

The 48% of self-time claimed by MutableBigInteger.mulsub is by far the largest; next in line is the 11% claimed by MutableBigInteger.divWord. Both are invoked by our friend BigInt.mod.

So the use of BigInt in the rolling polynomial hash comes with a hefty price tag!
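For context, the hot loop has roughly this shape - a Rabin-Karp style rolling polynomial hash where every step calls `mod` on a `BigInt`. This is a hedged sketch with made-up constants, not the actual kineticMerge code:

```scala
// Illustrative constants only; the real prime and base choices may differ.
val prime = BigInt(1000000007L)
val base  = BigInt(257)

// Rolling polynomial hashes over every window of the given size: each step
// evicts the outgoing byte, shifts in the incoming one, and reduces with
// BigInt.mod - the call the profile blames.
def rollingHashes(bytes: IndexedSeq[Byte], window: Int): IndexedSeq[BigInt] = {
  val highestScalePower = base.pow(window - 1).mod(prime) // scale of the outgoing byte
  val initial = bytes.take(window).foldLeft(BigInt(0)) { (acc, byte) =>
    (acc * base + BigInt(byte & 0xff)).mod(prime)
  }
  (window until bytes.size).scanLeft(initial) { (acc, index) =>
    val outgoing = BigInt(bytes(index - window) & 0xff)
    val incoming = BigInt(bytes(index) & 0xff)
    ((acc - outgoing * highestScalePower).mod(prime) * base + incoming).mod(prime)
  }
}
```

Two `mod` calls per byte, over every window size probed by the search, is exactly the sort of volume that would make MutableBigInteger.mulsub dominate the samples.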

sageserpent-open commented 4 months ago

BigInt can use a Long internally when the values are small enough. The highest scale power was exceeding that range, but it can be calculated upfront using modulo arithmetic, as it is folded into a modulo arithmetic calculation later on anyway.
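The idea behind calculating it upfront can be sketched like this - a hedged illustration with made-up constants, not the committed code: square-and-multiply exponentiation that reduces modulo the prime at every step, so the highest scale power (and everything downstream of it) fits in a Long.

```scala
// Illustrative constants: with a prime below 2^31, every intermediate
// product in the loop stays comfortably inside a Long.
val primeAsLong = 1000000007L
val baseAsLong  = 257L

// Square-and-multiply exponentiation, reducing modulo `primeAsLong` at each
// step so the highest scale power never needs a BigInt.
def highestScalePowerModulo(window: Int): Long = {
  var result   = 1L
  var square   = baseAsLong % primeAsLong
  var exponent = window - 1
  while (exponent > 0) {
    if ((exponent & 1) == 1) result = result * square % primeAsLong
    square = square * square % primeAsLong
    exponent >>= 1
  }
  result
}
```

With that in hand, the rolling update can subtract `outgoingByte * highestScalePower % primeAsLong` and stay in Long arithmetic throughout.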

Doing this in Git commit SHA: 5be533b987b9915061671614dbd1bdc8351d4ed0 yields stopwatch times for the original merge of 58 seconds, 56 seconds and 55 seconds - so a definite improvement.

sageserpent-open commented 4 months ago

Can another round of sampling help... ?

sageserpent-open commented 4 months ago

Comparing PotentialMatchKey instances takes a big slice of time. This is inevitable, but we can improve matters with Git commit SHA: 912a5571a1a01a5b0175cb0da4d30a2cb2e77ba3, taking the run time of the example above from ~56 seconds to ~47 seconds.
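One common shape for that kind of improvement - and this is a hypothetical `MatchKeySketch`, not the actual `PotentialMatchKey` from the commit - is to compare a cached fingerprint first and fall back to the expensive content comparison only when the fingerprints collide:

```scala
// Hypothetical stand-in for PotentialMatchKey: a cached fingerprint plus
// the underlying content it summarises.
final case class MatchKeySketch(fingerprint: Long, content: IndexedSeq[Byte])

object MatchKeySketch {
  // Cheap fingerprint comparison first; lexicographic content comparison
  // only when the fingerprints collide.
  implicit val ordering: Ordering[MatchKeySketch] = new Ordering[MatchKeySketch] {
    def compare(x: MatchKeySketch, y: MatchKeySketch): Int = {
      val byFingerprint = java.lang.Long.compare(x.fingerprint, y.fingerprint)
      if (byFingerprint != 0) byFingerprint
      else {
        val shared = math.min(x.content.size, y.content.size)
        var index  = 0
        var result = 0
        while (index < shared && result == 0) {
          result = java.lang.Byte.compare(x.content(index), y.content(index))
          index += 1
        }
        if (result != 0) result
        else java.lang.Integer.compare(x.content.size, y.content.size)
      }
    }
  }
}
```

The win comes from almost never reaching the byte-by-byte loop, since distinct contents almost always have distinct fingerprints.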

sageserpent-open commented 4 months ago

This went out in release 0.2.3, Git commit SHA: 5796c39ad225085795a082e2d25cae722c8326ea.