thesadrogue / TheSadRogue.Primitives

A collection of primitive data structures for working with a 2-dimensional grid.
MIT License
21 stars 6 forks source link

Optimize Distance Calculations #84

Closed Chris3606 closed 2 years ago

Chris3606 commented 2 years ago

Currently, the Distance.Calculate(double dx, double dy) function takes double values for dx and dy.

First, the current distance calculation must account for the potential for NaN:

public static float Max(float val1, float val2) {
    if (val1 > val2)
        return val1;

    if (Single.IsNaN(val1))
        return val1;

    return val2;
}

Therefore, since NaN is not tyically a concern, some calculations such as Chebyshev could be optimized by using a custom Max method of simply dx > dy ? dy : dx.

This should be benchmarked to determine if there is a significant speedup. This is also technically a breaking change, so this should be considered before implementation.

Additionally, in practice, decimal values are almost never used on an integral grid; and performing the calculating on double values instead of integer ones may slow it down significantly.

Performing the calculation using a dedicated overload which takes ints (and does not call the double version) may also perform a speedup. This would apply to all calculations, and would not produce a breaking change.

Chris3606 commented 2 years ago

An initial set of performance tests has been created on the optimize-distance-int branch which demonstrates the potential for performance gains here:

|               Method | DeltaX | DeltaY |      Mean |     Error |    StdDev |
|--------------------- |------- |------- |----------:|----------:|----------:|
|      ChebyshevDouble |      4 |      7 | 1.0140 ns | 0.0339 ns | 0.0301 ns |
| ChebyshevDoubleNoNaN |      4 |      7 | 0.4766 ns | 0.0074 ns | 0.0058 ns |
|         ChebyshevInt |      4 |      7 | 0.2855 ns | 0.0128 ns | 0.0120 ns |

ChebyshevDouble is roughly the method which is implemented today; the NoNaN variant is the same except uses a the custom max method, and ChebyshevInt operates on integers rather than doubles.

Chris3606 commented 2 years ago
Further testing yields some inconsistent reults. Furthermore, here are results as of the current state of optimize-distance-int branch: Method DeltaX DeltaY DistanceType Mean Error StdDev
CalculateDouble 4 7 Manhattan 1.2366 ns 0.0193 ns 0.0171 ns
CalculateInt 4 7 Manhattan 0.7943 ns 0.0221 ns 0.0196 ns
CalculateDouble 4 7 Euclidean 0.7832 ns 0.0207 ns 0.0194 ns
CalculateInt 4 7 Euclidean 0.9947 ns 0.0273 ns 0.0256 ns
CalculateDouble 4 7 Chebyshev 1.0342 ns 0.0443 ns 0.0346 ns
CalculateInt 4 7 Chebyshev 1.1847 ns 0.0361 ns 0.0320 ns

The integer version appears to have no discernable difference, as these tests seem to produce notably different values on subsequent runs.

Chris3606 commented 2 years ago

There is some benefit for int-specific versions for cases where Dynamic Profile-Guided Optimization is on (only available in .NET 6+ and requires explicit actions to enable):

Method Job DeltaX DeltaY DistanceType Mean Error StdDev
Distance Default mode 4 7 Manhattan 0.4856 ns 0.0029 ns 0.0023 ns
DistanceInt Default mode 4 7 Manhattan 0.6968 ns 0.0228 ns 0.0202 ns
Distance Dynamic PGO 4 7 Manhattan 0.4123 ns 0.0098 ns 0.0087 ns
DistanceInt Dynamic PGO 4 7 Manhattan 0.2639 ns 0.0224 ns 0.0209 ns
Distance Default mode 4 7 Euclidean 0.6494 ns 0.0219 ns 0.0205 ns
DistanceInt Default mode 4 7 Euclidean 0.9353 ns 0.0194 ns 0.0182 ns
Distance Dynamic PGO 4 7 Euclidean 0.9105 ns 0.0310 ns 0.0290 ns
DistanceInt Dynamic PGO 4 7 Euclidean 0.8203 ns 0.0132 ns 0.0103 ns
Distance Default mode 4 7 Chebyshev 0.9675 ns 0.0268 ns 0.0251 ns
DistanceInt Default mode 4 7 Chebyshev 1.0098 ns 0.0196 ns 0.0183 ns
Distance Dynamic PGO 4 7 Chebyshev 0.5355 ns 0.0210 ns 0.0197 ns
DistanceInt Dynamic PGO 4 7 Chebyshev 0.2738 ns 0.0176 ns 0.0147 ns

However, the double version is also faster, and the difference between double and int versions is less than 0.3 ns. Therefore, it's probably not worth optimizing at the current time.