oscbyspro / AwesomeNumbersKit

Large number arithmetic in Swift
Apache License 2.0
7 stars 1 forks source link

Faster signed multiplication #103

Closed oscbyspro closed 1 year ago

oscbyspro commented 1 year ago

Signed multiplication is currently much slower than unsigned, mostly because the signed overflow check is snail-paced.

if !Self.isSigned {
    overflow = !product.high.isZero // πŸš€
} else if self.isLessThanZero != amount.isLessThanZero {
    overflow = product < DoubleWidth(descending: HL(~Self(), Magnitude(bitPattern: Self.min))) // 🐌
} else {
    overflow = product > DoubleWidth(descending: HL( Self(), Magnitude(bitPattern: Self.max))) // 🐌
}

I believe the following is a faster, allocationless, version of the same comparsion. Although, I may have to add tests for it.

if !Self.isSigned {
    overflow = !product.high.isZero // πŸš€
}   else if self.isLessThanZero == amount.isLessThanZero {
    // overflow = product > Self.max (but more efficent)
    overflow = !product.high.isZero || product.low.mostSignificantBit // 🏎️
}   else {
    // overflow = product < Self.min (but more efficent)
    overflow = product.high.isLessThanZero && (!product.high.nonzeroBitCount == product.high.bitWidth || !product.low.mostSignificantBit) // 🏎️
}

The idea is to only check bits that are significant for detecting overflow.

oscbyspro commented 1 year ago
1. Test Case '-[ANKFullWidthKitBenchmarks. Int256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.023 seconds).
1. Test Case '-[ANKFullWidthKitBenchmarks.UInt256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.014 seconds).

2. Test Case '-[ANKFullWidthKitBenchmarks. Int256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.017 seconds).
2. Test Case '-[ANKFullWidthKitBenchmarks.UInt256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.014 seconds).
oscbyspro commented 1 year ago

The following assertions test all overflowing signed paths:

ANKAssertMultiplication(T.max, T( 1), T.max,        T( 0), false)
ANKAssertMultiplication(T.max, T(-1), T.min + T(1), T(-1), false)
ANKAssertMultiplication(T.min, T( 1), T.min,        T(-1), false)
ANKAssertMultiplication(T.min, T(-1), T.min,        T( 0), true )

ANKAssertMultiplication(T.max, T( 2), T(-2),        T( 0), true )
ANKAssertMultiplication(T.max, T(-2), T( 2),        T(-1), true )
ANKAssertMultiplication(T.min, T( 2), T( 0),        T(-1), true )
ANKAssertMultiplication(T.min, T(-2), T( 0),        T( 1), true )
oscbyspro commented 1 year ago

I find this rewrite, which leverages matches(repeating:) (#104), amazingly straight forward:

if !Self.isSigned {
    overflow = !(product.high.matches(repeating: false)) // πŸš€
}   else if self.isLessThanZero == amount.isLessThanZero {
    // overflow = product > Self.max, but more efficient
    overflow = !(product.high.matches(repeating: false) && !product.low.mostSignificantBit) // 🏎️
}   else {
    // overflow = product < Self.min, but more efficient
    overflow = !(product.high.matches(repeating: true ) &&  product.low.mostSignificantBit) && product.high.mostSignificantBit // 🏎️
}

Alternatively:

if !Self.isSigned {
    overflow = !(product.high.isZero) // πŸš€
}   else if self.isLessThanZero == amount.isLessThanZero {
    // overflow = product > Self.max, but more efficient
    overflow = !(product.high.isZero && !product.low.mostSignificantBit) // 🏎️
}   else {
    // overflow = product < Self.min, but more efficient
    overflow = !(product.high.isFull &&  product.low.mostSignificantBit) && product.high.mostSignificantBit // 🏎️
}