This PR implements various performance optimizations.

Changes

Removed ISettableGridView.Fill extension method and replaced it with a native interface method with a default implementation
- Closes #76
Added ISettableGridView.Clear method to interface
Modified .Positions() enumerator functions to return and use a custom iterator
- The custom iterator has a ToEnumerable() function you can use to get an IEnumerable<Point> for LINQ usage, etc

Notes and Benchmarks

Grid View Fill and Clear

The ISettableGridView.Fill method currently in the integration library is convenient, but has a number of performance issues. First, it really just calls ApplyOverlay, which uses a Func to specify the value to set; this adds many function calls which slows the process down. Second, and more importantly, some grid view types like ArrayView<T> and BitArrayView can have their underlying data structures filled or cleared in a much more efficient way than simply iterating over all positions and assigning a value.

The current implementation of ISettableGridView.Fill uses a special case for BitArray to perform the operation faster for BitArrayView; but does nothing for ArrayView. Furthermore, special-cases for various types is not a very flexible system for controlling this. Therefore, this PR removes the extension method, and adds it as a method of ISettableGridView directly, so that implementers can provide their own implementations specific to their type.

A default interface implementation is provided in order to minimize impact to the existing API. Additionally, the "base" classes for implementing your own grid views (SettableGridViewBase, etc) also provide implementations, so if custom classes inherit from these, there will be no additional work. This allows BitArrayView and array view variations to provide a more efficient implementation.

Similarly, a Clear function has been added to the ISettableGridView interface, since clear can sometimes be implemented more efficiently than Fill.

The following performance tests outline the performance increases yielded as a result of this change on types which have the potential to implement fill/clear operations more efficiently:

Method	Size	Type	Mean	Error	StdDev
Clear	10	ArrayView	7.518 ns	0.1018 ns	0.0952 ns
Fill	10	ArrayView	347.121 ns	4.7932 ns	4.4836 ns
OldFill	10	ArrayView	1,096.287 ns	9.4300 ns	8.3594 ns
ManualFill	10	ArrayView	517.721 ns	8.5281 ns	7.9771 ns
Clear	10	ArrayView2D	7.683 ns	0.1191 ns	0.1114 ns
Fill	10	ArrayView2D	1,422.502 ns	22.7307 ns	27.9153 ns
OldFill	10	ArrayView2D	1,612.172 ns	10.8591 ns	10.1576 ns
ManualFill	10	ArrayView2D	1,639.171 ns	13.0407 ns	10.1813 ns
Clear	10	BitArrayView	5.829 ns	0.0853 ns	0.0756 ns
Fill	10	BitArrayView	5.427 ns	0.0498 ns	0.0416 ns
OldFill	10	BitArrayView	5.517 ns	0.0955 ns	0.0847 ns
ManualFill	10	BitArrayView	603.479 ns	5.5270 ns	4.6153 ns
Clear	100	ArrayView	89.358 ns	0.3181 ns	0.2483 ns
Fill	100	ArrayView	33,291.069 ns	550.4695 ns	487.9768 ns
OldFill	100	ArrayView	101,582.738 ns	675.7490 ns	632.0961 ns
ManualFill	100	ArrayView	50,033.891 ns	390.2617 ns	345.9569 ns
Clear	100	ArrayView2D	87.728 ns	0.4218 ns	0.3522 ns
Fill	100	ArrayView2D	137,495.269 ns	478.7204 ns	399.7533 ns
OldFill	100	ArrayView2D	151,239.834 ns	456.4445 ns	356.3618 ns
ManualFill	100	ArrayView2D	162,698.369 ns	1,339.3460 ns	1,187.2954 ns
Clear	100	BitArrayView	23.143 ns	0.1230 ns	0.1151 ns
Fill	100	BitArrayView	18.798 ns	0.0725 ns	0.0679 ns
OldFill	100	BitArrayView	19.321 ns	0.0733 ns	0.0650 ns
ManualFill	100	BitArrayView	58,734.706 ns	410.5886 ns	384.0648 ns
Clear	200	ArrayView	612.813 ns	5.2026 ns	4.8665 ns
Fill	200	ArrayView	133,094.646 ns	1,018.6014 ns	902.9636 ns
OldFill	200	ArrayView	407,197.750 ns	2,306.1648 ns	2,044.3550 ns
ManualFill	200	ArrayView	199,615.820 ns	1,626.6210 ns	1,358.3022 ns
Clear	200	ArrayView2D	608.558 ns	12.1963 ns	12.5247 ns
Fill	200	ArrayView2D	552,939.521 ns	2,164.6341 ns	1,807.5675 ns
OldFill	200	ArrayView2D	612,126.322 ns	3,703.7907 ns	3,464.5281 ns
ManualFill	200	ArrayView2D	645,408.421 ns	3,129.0164 ns	2,612.8704 ns
Clear	200	BitArrayView	53.136 ns	0.2383 ns	0.1861 ns
Fill	200	BitArrayView	66.284 ns	0.2616 ns	0.2319 ns
OldFill	200	BitArrayView	66.551 ns	0.2705 ns	0.2259 ns
ManualFill	200	BitArrayView	233,529.564 ns	2,152.5208 ns	2,013.4693 ns

Clear and Fill represent this PR's implementation of those functions. OldFill represents the old extension method. ManualFill represents the result of using a set of for loops to just set the new value to each location (the default implementation of the interface method).

As evidence by the above tests, the current Fill and Clear implementations far out-perform the current implementations on these types.

Note that the OldFill function has the same special case for BitArrayView as the old implementation did; this is why the performance gains aren't as evident on this type.

Positions Enumerables

Currently, Rectangle and IGridView both provide a Positions function of some sort to iterate over all positions in them. This is implemented by simply using yield return to create an IEnumerable<Point>. However, since points are small value types, the overhead of enumerables/generators comes into play heavily here, particularly if the body of the loop is relatively fast.

This PR, therefore, implements a custom enumerator for enumerating over the positions in a rectangular area, and uses that enumerator to implement the Positions() functions for both Rectangle and IGridView. The new Positions function will work identically in a foreach loop, but it's a struct and therefore avoids much of the boxing and allocation that has to take place in the case of the IEnumerable implementation.

One result of this change is that you can no longer use System.Linq functions directly on the result of the Positions() function. For this use case, the enumerator which is now returned by that function has a ToEnumerator() function; so where before you might do myGridView.Positions().Where(p => p.X == 10), you can now do myGridView.Positions().ToEnumerable().Where(p => p.X == 10).

Also note that ToEnumerable() is implemented slightly differently (internally) than the previous Positions() function was; the new way is slightly better performance wise.

The following outlines the results of performance tests on this new implementation vs the old one:

Method	Size	Mean	Error	StdDev
ManualPositionsIterationNormal	10	209.19 ns	1.035 ns	0.808 ns
ManualPositionsIterationCacheWidthHeight	10	41.34 ns	0.204 ns	0.181 ns
ManualPositionsIterationCountNormal	10	768.99 ns	5.721 ns	5.071 ns
ManualPositionsIterationCountCacheCountAndWidth	10	364.11 ns	1.875 ns	1.565 ns
PositionsIteration	10	162.65 ns	0.772 ns	0.722 ns
OldEnumerablePositionsIterationNormal	10	777.74 ns	7.341 ns	6.866 ns
OldEnumerablePositionsIterationCacheWidthHeight	10	654.60 ns	3.705 ns	3.285 ns
PositionsToEnumerableIteration	10	638.97 ns	2.324 ns	1.815 ns
PositionsToEnumerableIterationShortcutMethod	10	731.70 ns	1.989 ns	1.763 ns
ManualPositionsIterationNormal	100	20,835.91 ns	157.029 ns	146.885 ns
ManualPositionsIterationCacheWidthHeight	100	4,705.53 ns	17.613 ns	15.613 ns
ManualPositionsIterationCountNormal	100	75,506.37 ns	412.176 ns	365.384 ns
ManualPositionsIterationCountCacheCountAndWidth	100	35,312.72 ns	170.125 ns	159.135 ns
PositionsIteration	100	15,450.15 ns	168.212 ns	140.464 ns
OldEnumerablePositionsIterationNormal	100	73,348.51 ns	464.810 ns	434.784 ns
OldEnumerablePositionsIterationCacheWidthHeight	100	63,980.28 ns	207.250 ns	161.807 ns
PositionsToEnumerableIteration	100	61,171.52 ns	175.065 ns	163.756 ns
PositionsToEnumerableIterationShortcutMethod	100	70,518.70 ns	154.078 ns	128.662 ns
ManualPositionsIterationNormal	200	74,758.23 ns	809.573 ns	757.275 ns
ManualPositionsIterationCacheWidthHeight	200	18,763.24 ns	29.602 ns	27.689 ns
ManualPositionsIterationCountNormal	200	301,921.13 ns	1,477.911 ns	1,310.130 ns
ManualPositionsIterationCountCacheCountAndWidth	200	140,903.36 ns	634.363 ns	529.722 ns
PositionsIteration	200	60,929.27 ns	173.272 ns	162.079 ns
OldEnumerablePositionsIterationNormal	200	292,128.97 ns	1,353.992 ns	1,130.645 ns
OldEnumerablePositionsIterationCacheWidthHeight	200	254,856.63 ns	680.767 ns	568.471 ns
PositionsToEnumerableIteration	200	244,469.45 ns	947.353 ns	886.155 ns
PositionsToEnumerableIterationShortcutMethod	200	282,435.41 ns	990.563 ns	827.165 ns

Brief descriptions of and observations about the tests above:

PositionsIteration: Tests which use this PR's implementation of Positions(), which uses the custom enumerator.
ManualPositionsIteration*: Tests which use a set of for loops (or a single in the case of the Count tests instead of an enumerator of any sort. the Cache* tests use loops where the Width and Height are cached, instead of baked into the loop conditional (see source code for details).
OldEnumerablePositions*: Tests which use the old implementation of the Positions() functions, which uses yield return and IEnumerable<Point>. As before, the Cache versions simply cache width/height values as appropriate instead of calling on the property of the grid view in the loop condition.
PositionsToEnumerableIteration: Tests which use the custom enumerable's ToEnumerable function to work with an IEnumerable. This represents the new approach a user would take if they specifically need an IEnumerable for System.Linq or similar.
PositionsToEnumerableIterationShortcutMethod: Similar to PositionsToEnumerableIteration, but uses an alternate method which proves slower (for reference as to why the current implementation is the way it is).

Notably, the PositionsIteration test, eg. the one that uses this PR's custom enumerator, is faster than nearly everything else; including ManualPositionsIterationNormal, which I would argue represents the most common way a user would implement the loops manually. You can, of course, achieve superior performance with manual for loops by caching the width/height; but this is to be expected (no enumerator in C# has 0 overhead). For reference, the new Positions() implementation is about 6-8x faster than the old one in these test cases.

Breaking Changes

This PR minimizes breaking changes, but there are a few limited cases where some minor breaking can occur.

First, if a user has a custom implementation of ISettableGridView, which does not inherit from one of the provided base classes, and they also call Fill on that view without it being casted to ISettableGridView, then they will have to add that function to their implementation. This should be extremely straightforward, and should only apply in a very limited subset of cases.

Second, any use case of Positions() functions which depend on the result being IEnumerable<Point> will require slight modifications to account for this update. These use cases would include anything operating on the Positions() result using System.Linq, and perhaps some custom interfaces. myObj.Positions() must now be written as myObj.Positions().ToEnumerable() in these cases. This change should be straightforward as well, however, and simple foreach usages (arguably the most common) obviously will not need this change.

Although I typically don't like introducing these types of breaking changes (although very minor ones) within a minor version, these changes do provide a significant performance boost as outlined above, so I believe they should be worth it here given the minimal impact to the API.

thesadrogue / TheSadRogue.Primitives

Optimizations #79

Changes

Notes and Benchmarks

Grid View Fill and Clear

Positions Enumerables

Breaking Changes