BigUglySpider / EmuLibs

Selection of libraries designed to be used with Emu projects. This was originally a Math library only, but has since been changed to hold all Emu libraries to enable consistency in changes to dependencies (such as EmuCore modifications).
https://biguglyspider.github.io/math

Inherent performance issues with current `FastMatrix` implementation #62

Closed BigUglySpider closed 2 years ago

BigUglySpider commented 2 years ago

Brief

The issue appears to lie within the abstraction of a Matrix's major Vectors - at least, this is the most plausible location I can think of, based on the additional instructions in the disassembly and the consistency of the differences across operations.

More specifically, the issue appears to be with construction and assignment - this is likely because FastMatrix performs these by calling member functions of a different abstraction (FastVector).

Some work should be put in to see whether the abstraction can be maintained without a performance cost. However, it may be preferable to make FastMatrix an independent, standalone abstraction separate from FastVector, avoiding this potential compromise entirely before too much time is wasted.
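
To make the two options concrete, here is a rough sketch of what each layout could look like. None of these names come from EmuMath: `Vec4`, `MatViaVectors` and `MatStandalone` are hypothetical stand-ins, and plain floats are used instead of SIMD registers purely to keep the example portable. It only illustrates the structural difference (going through another type's member functions versus owning the storage directly) and says nothing about the real generated code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for FastVector: a thin wrapper with member functions.
// The real type wraps a SIMD register; floats are an assumption for portability.
struct Vec4 {
    std::array<float, 4> data{};

    float get(std::size_t i) const { return data[i]; }
    void set(std::size_t i, float v) { data[i] = v; }
};

// Option A: the matrix stores its major vectors as Vec4 objects and routes
// every element access through Vec4's member functions.
struct MatViaVectors {
    std::array<Vec4, 4> majors{};

    MatViaVectors transpose() const {
        MatViaVectors out;
        for (std::size_t r = 0; r < 4; ++r)
            for (std::size_t c = 0; c < 4; ++c)
                out.majors[c].set(r, majors[r].get(c));
        return out;
    }
};

// Option B: a standalone matrix that owns its storage directly and does not
// depend on the vector abstraction at all.
struct MatStandalone {
    std::array<float, 16> data{};

    MatStandalone transpose() const {
        MatStandalone out;
        for (std::size_t r = 0; r < 4; ++r)
            for (std::size_t c = 0; c < 4; ++c)
                out.data[c * 4 + r] = data[r * 4 + c];
        return out;
    }
};

// Fills both layouts with the same pattern and checks that their transposes
// agree element by element.
inline bool designs_agree() {
    MatViaVectors a;
    MatStandalone b;
    for (std::size_t r = 0; r < 4; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            const float v = static_cast<float>(r * 4 + c);
            a.majors[r].set(c, v);
            b.data[r * 4 + c] = v;
        }
    }
    const MatViaVectors at = a.transpose();
    const MatStandalone bt = b.transpose();
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            if (at.majors[r].get(c) != bt.data[r * 4 + c]) return false;
    return true;
}
```

Both options expose the same operation and produce the same result; the open question is only whether option A can be made to compile down to the same instructions as option B.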

Data

These results are from repeated tests, so the stated number of iterations is per test, not the total across all tests.

On a test of 500000 iterations, FastMatrix appears to be slower than DirectX::XMMATRIX's respective operations by ~1.4ms, regardless of the operation (e.g. Transpose(mat) has a similar overhead to Multiply(mat, mat)). Additionally, the percentage difference between tests was maintained when iterations were increased tenfold (it should be noted that this does not mean the latency increased tenfold to 14ms). This data suggests that the issue lies within assignment and/or construction of a FastMatrix. Both routes are under consideration because:

  1. All functions not prefixed with Assign return a new FastMatrix, consistent with all other mathematical constructs within EmuMath and, similarly, DirectXMath.
  2. Results are move-assigned to a vector of results, one of which may be selected at random after the tests are completed (regardless of which test is run). This adds a small amount of noise to the time measurements, as it includes an operation that is not part of what is being measured, but it is done to ensure the compiler does not omit iterations. Nonetheless, it does mean that our move-assignment operation could have an issue somewhere.
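
For reference, the measurement approach described in point 2 can be sketched roughly as below. This is not the actual test code: `Mat4`, `transpose`, `BenchResult` and `run_benchmark` are hypothetical names, and `Mat4` is a plain float array rather than a SIMD type. The point is only to show that the assignment into the results vector sits inside the timed region, and that a randomly chosen result is read afterwards so the stored values are actually used.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the matrix type under test.
struct Mat4 {
    float m[16] = {};
};

// Hypothetical operation under test; like the non-Assign functions, it
// returns a new matrix rather than modifying its argument.
inline Mat4 transpose(const Mat4& in) {
    Mat4 out;
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out.m[c * 4 + r] = in.m[r * 4 + c];
    return out;
}

struct BenchResult {
    std::chrono::nanoseconds elapsed;  // time for the whole loop
    float sample;                      // element 1 of a randomly picked result
};

// Runs `op` for `iterations` iterations, assigning each result into a vector.
template <class Op>
BenchResult run_benchmark(std::size_t iterations, Op op) {
    if (iterations == 0) return {std::chrono::nanoseconds{0}, 0.0f};

    Mat4 input;
    for (std::size_t i = 0; i < 16; ++i) input.m[i] = static_cast<float>(i);

    std::vector<Mat4> results(iterations);

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        // The assignment is part of the timed region, so any cost in
        // construction or move-assignment shows up in every test.
        results[i] = op(input);
    }
    const auto end = std::chrono::steady_clock::now();

    // Read one result at random so the stored values are observed.
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<std::size_t> pick(0, iterations - 1);
    const Mat4& chosen = results[pick(rng)];

    return {std::chrono::duration_cast<std::chrono::nanoseconds>(end - start),
            chosen.m[1]};
}
```

Calling `run_benchmark(500000, transpose)` and dividing `elapsed` by the iteration count gives a per-iteration figure. Because every operation goes through the same `results[i] = op(input)` line, a cost in construction or assignment would add a similar overhead to every test, which matches the pattern in the data above.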

BigUglySpider commented 2 years ago

Addressed as of merge #63.