This PR implements various performance optimizations.
Changes
Removed ISettableGridView.Fill extension method and replaced it with a native interface method with a default implementation
Closes #76
Added ISettableGridView.Clear method to interface
Modified .Positions() enumerator functions to return and use a custom iterator
The custom iterator has a ToEnumerable() function you can use to get an IEnumerable<Point> for LINQ usage, etc
Notes and Benchmarks
Grid View Fill and Clear
The ISettableGridView.Fill method currently in the integration library is convenient, but has a number of performance issues. First, it really just calls ApplyOverlay, which uses a Func to specify the value to set; this adds many function calls which slows the process down. Second, and more importantly, some grid view types like ArrayView<T> and BitArrayView can have their underlying data structures filled or cleared in a much more efficient way than simply iterating over all positions and assigning a value.
The current implementation of ISettableGridView.Fill uses a special case for BitArray to perform the operation faster for BitArrayView; but does nothing for ArrayView. Furthermore, special-cases for various types is not a very flexible system for controlling this. Therefore, this PR removes the extension method, and adds it as a method of ISettableGridView directly, so that implementers can provide their own implementations specific to their type.
A default interface implementation is provided in order to minimize impact to the existing API. Additionally, the "base" classes for implementing your own grid views (SettableGridViewBase, etc) also provide implementations, so if custom classes inherit from these, there will be no additional work. This allows BitArrayView and array view variations to provide a more efficient implementation.
Similarly, a Clear function has been added to the ISettableGridView interface, since clear can sometimes be implemented more efficiently than Fill.
The following performance tests outline the performance increases yielded as a result of this change on types which have the potential to implement fill/clear operations more efficiently:
Method
Size
Type
Mean
Error
StdDev
Clear
10
ArrayView
7.518 ns
0.1018 ns
0.0952 ns
Fill
10
ArrayView
347.121 ns
4.7932 ns
4.4836 ns
OldFill
10
ArrayView
1,096.287 ns
9.4300 ns
8.3594 ns
ManualFill
10
ArrayView
517.721 ns
8.5281 ns
7.9771 ns
Clear
10
ArrayView2D
7.683 ns
0.1191 ns
0.1114 ns
Fill
10
ArrayView2D
1,422.502 ns
22.7307 ns
27.9153 ns
OldFill
10
ArrayView2D
1,612.172 ns
10.8591 ns
10.1576 ns
ManualFill
10
ArrayView2D
1,639.171 ns
13.0407 ns
10.1813 ns
Clear
10
BitArrayView
5.829 ns
0.0853 ns
0.0756 ns
Fill
10
BitArrayView
5.427 ns
0.0498 ns
0.0416 ns
OldFill
10
BitArrayView
5.517 ns
0.0955 ns
0.0847 ns
ManualFill
10
BitArrayView
603.479 ns
5.5270 ns
4.6153 ns
Clear
100
ArrayView
89.358 ns
0.3181 ns
0.2483 ns
Fill
100
ArrayView
33,291.069 ns
550.4695 ns
487.9768 ns
OldFill
100
ArrayView
101,582.738 ns
675.7490 ns
632.0961 ns
ManualFill
100
ArrayView
50,033.891 ns
390.2617 ns
345.9569 ns
Clear
100
ArrayView2D
87.728 ns
0.4218 ns
0.3522 ns
Fill
100
ArrayView2D
137,495.269 ns
478.7204 ns
399.7533 ns
OldFill
100
ArrayView2D
151,239.834 ns
456.4445 ns
356.3618 ns
ManualFill
100
ArrayView2D
162,698.369 ns
1,339.3460 ns
1,187.2954 ns
Clear
100
BitArrayView
23.143 ns
0.1230 ns
0.1151 ns
Fill
100
BitArrayView
18.798 ns
0.0725 ns
0.0679 ns
OldFill
100
BitArrayView
19.321 ns
0.0733 ns
0.0650 ns
ManualFill
100
BitArrayView
58,734.706 ns
410.5886 ns
384.0648 ns
Clear
200
ArrayView
612.813 ns
5.2026 ns
4.8665 ns
Fill
200
ArrayView
133,094.646 ns
1,018.6014 ns
902.9636 ns
OldFill
200
ArrayView
407,197.750 ns
2,306.1648 ns
2,044.3550 ns
ManualFill
200
ArrayView
199,615.820 ns
1,626.6210 ns
1,358.3022 ns
Clear
200
ArrayView2D
608.558 ns
12.1963 ns
12.5247 ns
Fill
200
ArrayView2D
552,939.521 ns
2,164.6341 ns
1,807.5675 ns
OldFill
200
ArrayView2D
612,126.322 ns
3,703.7907 ns
3,464.5281 ns
ManualFill
200
ArrayView2D
645,408.421 ns
3,129.0164 ns
2,612.8704 ns
Clear
200
BitArrayView
53.136 ns
0.2383 ns
0.1861 ns
Fill
200
BitArrayView
66.284 ns
0.2616 ns
0.2319 ns
OldFill
200
BitArrayView
66.551 ns
0.2705 ns
0.2259 ns
ManualFill
200
BitArrayView
233,529.564 ns
2,152.5208 ns
2,013.4693 ns
Clear and Fill represent this PR's implementation of those functions. OldFill represents the old extension method. ManualFill represents the result of using a set of for loops to just set the new value to each location (the default implementation of the interface method).
As evidence by the above tests, the current Fill and Clear implementations far out-perform the current implementations on these types.
Note that the OldFill function has the same special case for BitArrayView as the old implementation did; this is why the performance gains aren't as evident on this type.
Positions Enumerables
Currently, Rectangle and IGridView both provide a Positions function of some sort to iterate over all positions in them. This is implemented by simply using yield return to create an IEnumerable<Point>. However, since points are small value types, the overhead of enumerables/generators comes into play heavily here, particularly if the body of the loop is relatively fast.
This PR, therefore, implements a custom enumerator for enumerating over the positions in a rectangular area, and uses that enumerator to implement the Positions() functions for both Rectangle and IGridView. The new Positions function will work identically in a foreach loop, but it's a struct and therefore avoids much of the boxing and allocation that has to take place in the case of the IEnumerable implementation.
One result of this change is that you can no longer use System.Linq functions directly on the result of the Positions() function. For this use case, the enumerator which is now returned by that function has a ToEnumerator() function; so where before you might do myGridView.Positions().Where(p => p.X == 10), you can now do myGridView.Positions().ToEnumerable().Where(p => p.X == 10).
Also note that ToEnumerable() is implemented slightly differently (internally) than the previous Positions() function was; the new way is slightly better performance wise.
The following outlines the results of performance tests on this new implementation vs the old one:
Method
Size
Mean
Error
StdDev
ManualPositionsIterationNormal
10
209.19 ns
1.035 ns
0.808 ns
ManualPositionsIterationCacheWidthHeight
10
41.34 ns
0.204 ns
0.181 ns
ManualPositionsIterationCountNormal
10
768.99 ns
5.721 ns
5.071 ns
ManualPositionsIterationCountCacheCountAndWidth
10
364.11 ns
1.875 ns
1.565 ns
PositionsIteration
10
162.65 ns
0.772 ns
0.722 ns
OldEnumerablePositionsIterationNormal
10
777.74 ns
7.341 ns
6.866 ns
OldEnumerablePositionsIterationCacheWidthHeight
10
654.60 ns
3.705 ns
3.285 ns
PositionsToEnumerableIteration
10
638.97 ns
2.324 ns
1.815 ns
PositionsToEnumerableIterationShortcutMethod
10
731.70 ns
1.989 ns
1.763 ns
ManualPositionsIterationNormal
100
20,835.91 ns
157.029 ns
146.885 ns
ManualPositionsIterationCacheWidthHeight
100
4,705.53 ns
17.613 ns
15.613 ns
ManualPositionsIterationCountNormal
100
75,506.37 ns
412.176 ns
365.384 ns
ManualPositionsIterationCountCacheCountAndWidth
100
35,312.72 ns
170.125 ns
159.135 ns
PositionsIteration
100
15,450.15 ns
168.212 ns
140.464 ns
OldEnumerablePositionsIterationNormal
100
73,348.51 ns
464.810 ns
434.784 ns
OldEnumerablePositionsIterationCacheWidthHeight
100
63,980.28 ns
207.250 ns
161.807 ns
PositionsToEnumerableIteration
100
61,171.52 ns
175.065 ns
163.756 ns
PositionsToEnumerableIterationShortcutMethod
100
70,518.70 ns
154.078 ns
128.662 ns
ManualPositionsIterationNormal
200
74,758.23 ns
809.573 ns
757.275 ns
ManualPositionsIterationCacheWidthHeight
200
18,763.24 ns
29.602 ns
27.689 ns
ManualPositionsIterationCountNormal
200
301,921.13 ns
1,477.911 ns
1,310.130 ns
ManualPositionsIterationCountCacheCountAndWidth
200
140,903.36 ns
634.363 ns
529.722 ns
PositionsIteration
200
60,929.27 ns
173.272 ns
162.079 ns
OldEnumerablePositionsIterationNormal
200
292,128.97 ns
1,353.992 ns
1,130.645 ns
OldEnumerablePositionsIterationCacheWidthHeight
200
254,856.63 ns
680.767 ns
568.471 ns
PositionsToEnumerableIteration
200
244,469.45 ns
947.353 ns
886.155 ns
PositionsToEnumerableIterationShortcutMethod
200
282,435.41 ns
990.563 ns
827.165 ns
Brief descriptions of and observations about the tests above:
PositionsIteration: Tests which use this PR's implementation of Positions(), which uses the custom enumerator.
ManualPositionsIteration*: Tests which use a set of for loops (or a single in the case of the Count tests instead of an enumerator of any sort. the Cache* tests use loops where the Width and Height are cached, instead of baked into the loop conditional (see source code for details).
OldEnumerablePositions*: Tests which use the old implementation of the Positions() functions, which uses yield return and IEnumerable<Point>. As before, the Cache versions simply cache width/height values as appropriate instead of calling on the property of the grid view in the loop condition.
PositionsToEnumerableIteration: Tests which use the custom enumerable's ToEnumerable function to work with an IEnumerable. This represents the new approach a user would take if they specifically need an IEnumerable for System.Linq or similar.
PositionsToEnumerableIterationShortcutMethod: Similar to PositionsToEnumerableIteration, but uses an alternate method which proves slower (for reference as to why the current implementation is the way it is).
Notably, the PositionsIteration test, eg. the one that uses this PR's custom enumerator, is faster than nearly everything else; includingManualPositionsIterationNormal, which I would argue represents the most common way a user would implement the loops manually. You can, of course, achieve superior performance with manual for loops by caching the width/height; but this is to be expected (no enumerator in C# has 0 overhead). For reference, the new Positions() implementation is about 6-8x faster than the old one in these test cases.
Breaking Changes
This PR minimizes breaking changes, but there are a few limited cases where some minor breaking can occur.
First, if a user has a custom implementation of ISettableGridView, which does not inherit from one of the provided base classes, and they also call Fill on that view without it being casted to ISettableGridView, then they will have to add that function to their implementation. This should be extremely straightforward, and should only apply in a very limited subset of cases.
Second, any use case of Positions() functions which depend on the result being IEnumerable<Point> will require slight modifications to account for this update. These use cases would include anything operating on the Positions() result using System.Linq, and perhaps some custom interfaces. myObj.Positions() must now be written as myObj.Positions().ToEnumerable() in these cases. This change should be straightforward as well, however, and simple foreach usages (arguably the most common) obviously will not need this change.
Although I typically don't like introducing these types of breaking changes (although very minor ones) within a minor version, these changes do provide a significant performance boost as outlined above, so I believe they should be worth it here given the minimal impact to the API.
This PR implements various performance optimizations.
Changes
ISettableGridView.Fill
extension method and replaced it with a native interface method with a default implementationISettableGridView.Clear
method to interface.Positions()
enumerator functions to return and use a custom iteratorToEnumerable()
function you can use to get anIEnumerable<Point>
for LINQ usage, etcNotes and Benchmarks
Grid View Fill and Clear
The
ISettableGridView.Fill
method currently in the integration library is convenient, but has a number of performance issues. First, it really just callsApplyOverlay
, which uses aFunc
to specify the value to set; this adds many function calls which slows the process down. Second, and more importantly, some grid view types likeArrayView<T>
andBitArrayView
can have their underlying data structures filled or cleared in a much more efficient way than simply iterating over all positions and assigning a value.The current implementation of
ISettableGridView.Fill
uses a special case forBitArray
to perform the operation faster forBitArrayView
; but does nothing forArrayView
. Furthermore, special-cases for various types is not a very flexible system for controlling this. Therefore, this PR removes the extension method, and adds it as a method ofISettableGridView
directly, so that implementers can provide their own implementations specific to their type.A default interface implementation is provided in order to minimize impact to the existing API. Additionally, the "base" classes for implementing your own grid views (
SettableGridViewBase
, etc) also provide implementations, so if custom classes inherit from these, there will be no additional work. This allowsBitArrayView
and array view variations to provide a more efficient implementation.Similarly, a
Clear
function has been added to theISettableGridView
interface, since clear can sometimes be implemented more efficiently thanFill
.The following performance tests outline the performance increases yielded as a result of this change on types which have the potential to implement fill/clear operations more efficiently:
Clear
andFill
represent this PR's implementation of those functions.OldFill
represents the old extension method.ManualFill
represents the result of using a set offor
loops to just set the new value to each location (the default implementation of the interface method).As evidence by the above tests, the current
Fill
andClear
implementations far out-perform the current implementations on these types.Note that the
OldFill
function has the same special case forBitArrayView
as the old implementation did; this is why the performance gains aren't as evident on this type.Positions Enumerables
Currently,
Rectangle
andIGridView
both provide aPositions
function of some sort to iterate over all positions in them. This is implemented by simply usingyield return
to create anIEnumerable<Point>
. However, since points are small value types, the overhead of enumerables/generators comes into play heavily here, particularly if the body of the loop is relatively fast.This PR, therefore, implements a custom enumerator for enumerating over the positions in a rectangular area, and uses that enumerator to implement the
Positions()
functions for bothRectangle
andIGridView
. The newPositions
function will work identically in a foreach loop, but it's a struct and therefore avoids much of the boxing and allocation that has to take place in the case of theIEnumerable
implementation.One result of this change is that you can no longer use
System.Linq
functions directly on the result of thePositions()
function. For this use case, the enumerator which is now returned by that function has aToEnumerator()
function; so where before you might domyGridView.Positions().Where(p => p.X == 10)
, you can now domyGridView.Positions().ToEnumerable().Where(p => p.X == 10)
.Also note that
ToEnumerable()
is implemented slightly differently (internally) than the previousPositions()
function was; the new way is slightly better performance wise.The following outlines the results of performance tests on this new implementation vs the old one:
Brief descriptions of and observations about the tests above:
Positions()
, which uses the custom enumerator.Count
tests instead of an enumerator of any sort. theCache*
tests use loops where theWidth
andHeight
are cached, instead of baked into the loop conditional (see source code for details).Positions()
functions, which usesyield return
andIEnumerable<Point>
. As before, theCache
versions simply cache width/height values as appropriate instead of calling on the property of the grid view in the loop condition.ToEnumerable
function to work with anIEnumerable
. This represents the new approach a user would take if they specifically need anIEnumerable
forSystem.Linq
or similar.PositionsToEnumerableIteration
, but uses an alternate method which proves slower (for reference as to why the current implementation is the way it is).Notably, the
PositionsIteration
test, eg. the one that uses this PR's custom enumerator, is faster than nearly everything else; includingManualPositionsIterationNormal
, which I would argue represents the most common way a user would implement the loops manually. You can, of course, achieve superior performance with manual for loops by caching the width/height; but this is to be expected (no enumerator in C# has 0 overhead). For reference, the newPositions()
implementation is about 6-8x faster than the old one in these test cases.Breaking Changes
This PR minimizes breaking changes, but there are a few limited cases where some minor breaking can occur.
First, if a user has a custom implementation of
ISettableGridView
, which does not inherit from one of the provided base classes, and they also callFill
on that view without it being casted toISettableGridView
, then they will have to add that function to their implementation. This should be extremely straightforward, and should only apply in a very limited subset of cases.Second, any use case of
Positions()
functions which depend on the result beingIEnumerable<Point>
will require slight modifications to account for this update. These use cases would include anything operating on thePositions()
result usingSystem.Linq
, and perhaps some custom interfaces.myObj.Positions()
must now be written asmyObj.Positions().ToEnumerable()
in these cases. This change should be straightforward as well, however, and simpleforeach
usages (arguably the most common) obviously will not need this change.Although I typically don't like introducing these types of breaking changes (although very minor ones) within a minor version, these changes do provide a significant performance boost as outlined above, so I believe they should be worth it here given the minimal impact to the API.