apple / swift-numerics

Advanced mathematical types and functions for Swift
Apache License 2.0

[BigInt tests][No merge] 🐰 How NOT to write performance tests #256

Open LiarPrincess opened 1 year ago

LiarPrincess commented 1 year ago

Please read #242 (Using tests from “Violet - Python VM written in Swift”) first.


=== DO NOT MERGE! Discussion only. ===

🐰 Discussion

These are all of the basic operations. In the future we may add some more targeted tests for interesting operations (mul/div), but for now this should be enough.

It may also be a good idea to move all of the performance tests to a separate test target. Currently I am using the PERFORMANCE_TEST compilation flag.
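For readers unfamiliar with compilation flags, here is a minimal sketch of how such a guard behaves. The helper name `performanceTestsEnabled` is hypothetical and not part of this PR; only the PERFORMANCE_TEST flag name comes from the text above.

```swift
// Hypothetical helper: reports whether this file was compiled with the
// PERFORMANCE_TEST condition, e.g. via:
//   swift test -Xswiftc -DPERFORMANCE_TEST
func performanceTestsEnabled() -> Bool {
    #if PERFORMANCE_TEST
    return true
    #else
    return false
    #endif
}
```

In practice the whole performance test class would more likely be wrapped in the same `#if PERFORMANCE_TEST` block, so that a plain `swift test` never compiles it.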

You can use Violet as a baseline measure, though Violet was heavily optimized for the Int32 range (see documentation). I could easily write something faster, but this is not my use-case.

String parsing

That said, I have to say that I find the String parsing performance quite bad. From my tests Violet is ~30 times faster.

Violet secret:

Instead of using a single BigInt and multiplying it by the radix for every digit, we group scalars into word-sized chunks. Then we scale each chunk by the appropriate power of the radix and add the results together.

For example: 1_2345_6789 = (1 * 10^8) + (2345 * 10^4) + (6789 * 10^0)

So, we are doing most of our calculations in fast Word, and then we switch to slow BigInt for a few final operations.

Implemented here. Should I create an issue for this?
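The chunking idea described above can be sketched as follows. This is not Violet's actual code: `Magnitude`, `multiplyAdd` and `parseDecimal` are hypothetical names, signs are ignored, and the 9-digit chunk size assumes 32-bit limbs (10^9 fits in `UInt32`). The sketch accumulates in Horner form (`result = result * 10^chunkLength + chunk`), which is mathematically the same sum as in the example above.

```swift
// Hypothetical minimal magnitude: little-endian base-2^32 limbs, zero is [].
struct Magnitude: Equatable {
    var limbs: [UInt32] = []

    // self = self * factor + addend, in a single pass over the limbs.
    mutating func multiplyAdd(by factor: UInt32, adding addend: UInt32) {
        var carry = UInt64(addend)
        for i in limbs.indices {
            // Cannot overflow: (2^32-1)^2 + (2^32-1) < 2^64.
            let product = UInt64(limbs[i]) * UInt64(factor) + carry
            limbs[i] = UInt32(truncatingIfNeeded: product)
            carry = product >> 32
        }
        if carry != 0 { limbs.append(UInt32(carry)) }
    }
}

// Parses an unsigned decimal string; "_" separators are skipped.
func parseDecimal(_ text: String) -> Magnitude? {
    let digits = text.utf8.filter { $0 != UInt8(ascii: "_") }
    guard !digits.isEmpty else { return nil }

    let chunkSize = 9 // assumption: 10^9 < 2^32, so a chunk fits in one limb
    var result = Magnitude()
    var index = 0
    // The first chunk takes the leftover digits, so all later chunks are full.
    var end = digits.count % chunkSize
    if end == 0 { end = chunkSize }

    while index < digits.count {
        var chunk: UInt32 = 0 // value of this chunk, computed in word arithmetic
        var scale: UInt32 = 1 // 10^(number of digits in this chunk)
        for byte in digits[index..<end] {
            guard byte >= UInt8(ascii: "0"), byte <= UInt8(ascii: "9") else { return nil }
            chunk = chunk * 10 + UInt32(byte - UInt8(ascii: "0"))
            scale *= 10
        }
        // The only "slow" big-number operation: once per chunk, not per digit.
        result.multiplyAdd(by: scale, adding: chunk)
        index = end
        end += chunkSize
    }
    return result
}
```

With 9-digit chunks the big value is touched roughly 9 times less often than in a digit-by-digit loop; with 64-bit words the chunk could hold 19 decimal digits.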

Mac

I have a 2014 rMBP -> macOS 11.7 (Big Sur), Xcode 13.2.1, Intel. I assume that 9-year-old machines are not exactly a priority for 🍎. Executing all of those tests takes ~30 min (serial execution, DEBUG), and I can't be bothered to re-run them properly since it will throttle after a few minutes anyway…

Linux

Normally it would fail to compile:

PerformanceTests.generated.swift:27:23: error: cannot find type 'XCTMetric' in scope
private let metrics: [XCTMetric] = [XCTClockMetric()] // XCTMemoryMetric()?
                      ^~~~~~~~~

PerformanceTests.generated.swift:28:23: error: cannot find 'XCTMeasureOptions' in scope
private let options = XCTMeasureOptions.default
                      ^~~~~~~~~~~~~~~~~

But I wrote my own thing. Results below.
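Since `XCTMetric` and `XCTMeasureOptions` are not found on Linux (see the errors above), a hand-rolled timer is one way around it. The following is a sketch under assumptions, not the script used for the results below: `measure` is a hypothetical helper, and it relies on `DispatchTime` from Dispatch, which is available on both macOS and Linux.

```swift
import Dispatch

// Hypothetical stand-in for XCTest's measure: runs `body` several times
// and returns the wall-clock duration of each run in seconds.
func measure(iterations: Int = 10, _ body: () -> Void) -> [Double] {
    var samples: [Double] = []
    samples.reserveCapacity(iterations)

    for _ in 0..<iterations {
        // uptimeNanoseconds is a monotonic clock, so end >= start.
        let start = DispatchTime.now().uptimeNanoseconds
        body()
        let end = DispatchTime.now().uptimeNanoseconds
        samples.append(Double(end - start) / 1_000_000_000)
    }

    return samples
}
```

Reporting the minimum or median of the samples, rather than the mean, tends to be less sensitive to thermal throttling and background noise.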

LiarPrincess commented 1 year ago

I wrote a test script that runs multiple BigInt implementations:

Platform:

Important:

string

1_string

This test involved String operations on 1203 BigInts.

Parsing from string:

To string:

equal/compare

2_eq_cmp

This test involved ==/< operations on 1_447_209 BigInt pairs.

unary

3_unary

This test involved +-~ operations on 100_203 BigInts.

Plus:

Minus:

Invert:

add, sub

4_add_sub

This test involved +- operations on 162_409 BigInt pairs.

Add:

Sub:

mul, div, mod

5_mul_div_mod

This test involved */% operations on 91_809 BigInt pairs.

Mul:

Div/mod:

and, or, xor

6_and_or_xor

This test involved &|^ operations on 162_409 BigInt pairs.

shift

7_shift

This test involved shifting 20_203 BigInts by 5 values, in total 101_015 shifts per test.

Left:

Right:

pi

This test was suggested by Xiaodi Wu (@xwu) in BigInt #120, so we may ask them if we have any questions (maybe?). In theory it looks interesting, because it contains +-*/ operations.

In practice, count 5000 has the following input distribution (more detailed results here):

| Operation | Count | Notes |
|---|---|---|
| add | 39_280 | Inputs of similar size up to 3500 words (3500*UInt64 in magnitude) |
| sub | 5_000 | Mostly inputs of similar size up to 3500 words; some inputs where rhs is a single word (UInt64) |
| mul | 104_086 | lhs goes up to 3522 words; rhs is always a single word (!!!) |
| div | 22_678 | Inputs of similar size up to 3500 words |
| compare (>) | 16_602 | Inputs of similar size up to 3500 words; so fast that it does not matter |

Anyway, here are the results:

8_pi

Results (looking at the 5000 test):

Violet XsProMax being slower than Violet is a surprise because in most of the tests above XsProMax was noticeably faster (5-10%). Though, Violet has an ace up its sleeve: it is really fast for small integers and this test performs 104_086 multiplications where rhs falls into Int32. This was not visible in the tests above because they concentrated on big numbers.

As for swift_numerics vs attaswift: the result is dominated by */. Other operations (namely: +-) are so few/fast that they are not even noticeable.
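To illustrate why a small-integer fast path helps in a test like this, here is a hedged sketch of the general idea of keeping values that fit in Int32 inline. It is not Violet's actual representation: `Value` and `multiplySmall` are hypothetical names, and only the small * small case is shown.

```swift
// Hypothetical storage: small values live inline, everything else keeps a
// sign flag and a little-endian base-2^32 magnitude on the heap.
enum Value: Equatable {
    case small(Int32)
    case big(isNegative: Bool, magnitude: [UInt32])

    init(_ value: Int64) {
        if let small = Int32(exactly: value) {
            self = .small(small)
            return
        }
        let magnitude = value.magnitude // UInt64, also correct for Int64.min
        let low = UInt32(truncatingIfNeeded: magnitude)
        let high = UInt32(magnitude >> 32)
        self = .big(isNegative: value < 0,
                    magnitude: high == 0 ? [low] : [low, high])
    }
}

// Fast path: both operands are inline, so no heap access is needed unless
// the product overflows Int32.
func multiplySmall(_ lhs: Int32, _ rhs: Int32) -> Value {
    let (product, overflow) = lhs.multipliedReportingOverflow(by: rhs)
    if !overflow { return .small(product) }
    // The product of two Int32 values always fits in Int64.
    return Value(Int64(lhs) * Int64(rhs))
}
```

A full implementation would also need big * small and big * big paths; the point here is only that the common small case avoids allocation entirely.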

TL;DR: Conclusions

| Operation | Possible improvement | Note |
|---|---|---|
| init(String) | 190x for radix 10; 345x when radix is a power of 2 | Discussed in initial post |
| toString | 21.5x for radix 10; >2500x when radix is a power of 2 (up to 3210x) | Radix 16 is slower than radix 10? |
| Equality | - | |
| Compare | - | |
| Unary + | - | Do not write by hand, it is 1000x slower |
| Unary - | 4x | Storing sign inline and magnitude on the heap is much faster (obviously, but now we have numbers) |
| Unary ~ | - | |
| Binary + | 3x | |
| Binary - | 4.4x | Implemented as a-b = a+(-b), writing it by hand may be faster (Violet does this) |
| Binary * | 1.45x | |
| Binary /% | 2.4x | |
| Binary &\|^ | 2x | |
| Left shift | 17x | |
| Right shift | 20x | |
| pi | 1.6x; 2.6x if we store small values inline | This test uses a lot of small integers. */ dominate +-. |

LiarPrincess commented 1 year ago

Update 21.2.2023:


Anyway, from those tests we can see that Violet is the fastest. Under the hood Violet uses ManagedBufferPointer. If swift_numerics decides to go this route:

Obviously you can quite easily write something that will be faster than Violet. Violet is heavily geared towards small integers (Int32) which closes certain optimization opportunities.

LiarPrincess commented 1 year ago

Update 20.3.2023: