- Use `isZero` instead of comparing with `BigInteger.ZERO` (I forgot this one in the other PRs...)
- Skip the `extendedSignificand` when using `scale`.
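As a sketch of the first change, here is the idea using `java.math.BigInteger` for illustration (the `isZero()` extension below is hypothetical; the actual library exposes its own `isZero` on its `BigInteger` type):

```kotlin
import java.math.BigInteger

// Hypothetical extension mirroring the library's isZero():
// checking the sign avoids an object comparison against BigInteger.ZERO.
fun BigInteger.isZero(): Boolean = signum() == 0

fun main() {
    val zero = BigInteger.valueOf(0)
    val nonZero = BigInteger.valueOf(42)
    // Both forms agree; isZero() just skips the equality check against ZERO.
    println(zero.isZero() == (zero == BigInteger.ZERO))       // true
    println(nonZero.isZero() == (nonZero == BigInteger.ZERO)) // true
}
```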
> [!IMPORTANT]
> My understanding is that when using `scale`, `BigDecimal(BigInteger(123), exponent, decimalMode) == BigDecimal(BigInteger(12300000), exponent, decimalMode)`, no matter the exponent or decimalMode. If this is true, then we don't need to calculate the `pow` and this PR is relevant. Please be aware that I'm not very experienced here, so this is a supposition.
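The supposition above can be checked on `java.math.BigDecimal` as an analogy (the multiplatform library may behave differently): significands that differ only by trailing zeros are numerically equal under `compareTo`, even though `equals` also compares the scale.

```kotlin
import java.math.BigDecimal
import java.math.BigInteger

fun main() {
    // The same value 1.23 with two different significands:
    // 123 × 10⁻² and 12300000 × 10⁻⁷.
    val a = BigDecimal(BigInteger.valueOf(123), 2)
    val b = BigDecimal(BigInteger.valueOf(12_300_000), 7)
    println(a.compareTo(b) == 0) // numerically equal → true
    println(a == b)              // equals() also compares scale → false
}
```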
Many methods are impacted by this change; as I don't have much time right now, I've only benchmarked `add` with `BigDecimal`s using `scale` (performance without `scale` should be roughly similar), but I expect many other methods to see roughly the same improvement.

Benchmark on JVM, best of 3 runs, M3 Pro 36 GB, 10 million iterations, with a warmup loop for the JVM (start time and JIT excluded for a more stable duration).
Code to reproduce:
```kotlin
@Test
fun perfAdd1() {
    val a = BigDecimal.parseString("1.01").scale(2)
    val b = BigDecimal.parseString("2.22").scale(2)
    repeat(1_000_000) { // JVM warmup
        a.add(b)
    }
    val duration = measureTime {
        repeat(10_000_000) {
            a.add(b)
        }
    }
    println("Duration: $duration")
}
```
| Method | BEFORE (s) | AFTER (s) | % time reduction |
|--------|-----------|----------|------------------|
| `add`  | 12.0      | 11.4     | 5%               |
Important note: this is computed on the main branch, as the other PRs are not yet validated. On my fork, I've merged them all into a temporary branch and re-run the same benchmark. The savings are much better: this optimisation then saves 45% of the time when using `scale`.
| Method | BEFORE (s) | AFTER (s) | % time reduction |
|--------|-----------|----------|------------------|
| `add`  | 2.17      | 1.19     | 45%              |
Ultimately, with the 5 PRs we go from 12 s to 1.19 s, a total 90% time reduction 🎉
Raw benchmark data (from fork branch with all improvements)