JakenVeina commented 6 months ago

Creating this as a PR just for the sake of getting everyone's eyes on it.

So, I did some research and realized that one of the things I was trying to do doesn't actually work.

I figured that the JIT would take a method like...

public void Foo<T, TItems>(TItems items)
    where TItems : IEnumerable<T>;

...and convert it to more-specific methods, depending on which different collection types are passed to it, E.G...

public void Foo<T>(IEnumerable<T> items);

public void Foo<T>(List<T> items);

public void Foo<T>(T[] items);

Thus, if the caller happens to have a List<T> the method they call gets benefit from List<T> optimizations in iterating and calling .Count. As it turns out, generic JIT only works this way when TItems is a value type. For reference types like IEnumerable<T>, List<T>, and T[], the JIT doesn't generate separate implementations, it just maps them all to the IEnumerable<T> implementation.

Additonally, I also need to retract a bit of advice I gave a few months ago, when I had run benchmarks for iterating over a List<T>, which I went ahead and re-created here...

List Enumeration Benchmark Results

Method	ItemCount	Mean	Error	StdDev	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
ItemsAsEnumerable	10000	18,866.35 ns	373.998 ns	415.698 ns	1.00	0.00	-	40 B	1.00
ItemsAsReadOnlyListWithForEach	10000	19,061.90 ns	314.996 ns	337.042 ns	1.01	0.03	-	40 B	1.00
ItemsAsReadOnlyListWithFor	10000	7,651.48 ns	147.581 ns	362.019 ns	0.42	0.02	-	-	0.00
ItemsAsReadOnlySpan	10000	3,564.88 ns	70.045 ns	111.099 ns	0.19	0.01	-	-	0.00
ItemsAsList	10000	6,209.85 ns	93.557 ns	87.513 ns	0.33	0.01	-	-	0.00

At the time, I had noticed that iterating over an IReadOnlyList<T> with just a plain for instead of a foreach was quite a bit faster. What I have since discovered is that this is only true if the IReadOnlyList<T> is actually a List<T> under the hood. If it's an Array<T> under the hood, using a for is actually worse.

Array Enumeration Benchmark Results

Method	ItemCount	Mean	Error	StdDev	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
ItemsAsEnumerable	10000	14,417.04 ns	87.993 ns	73.478 ns	1.00	0.00	-	32 B	1.00
ItemsAsReadOnlyListWithForEach	10000	14,488.32 ns	174.939 ns	163.638 ns	1.01	0.01	-	32 B	1.00
ItemsAsReadOnlyListWithFor	10000	43,024.56 ns	245.363 ns	229.513 ns	2.98	0.02	-	-	0.00
ItemsAsReadOnlySpan	10000	3,195.86 ns	30.429 ns	28.463 ns	0.22	0.00	-	-	0.00
ItemsAsArray	10000	3,214.86 ns	31.132 ns	27.598 ns	0.22	0.00	-	-	0.00

I also went ahead and ran one for Dictionary<TKey, TValue>

Dictionary Enumeration Benchmark Results

Method	ItemCount	Mean	Error	StdDev	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
ItemsAsEnumerable	10000	26,079.1 ns	509.63 ns	714.43 ns	1.00	0.00	-	48 B	1.00
ItemsAsReadOnlyDictionaryOverItems	10000	26,097.9 ns	512.02 ns	569.11 ns	1.00	0.04	-	48 B	1.00
ItemsAsReadOnlyDictionaryOverKeys	10000	78,424.6 ns	1,109.81 ns	983.82 ns	2.98	0.10	-	40 B	0.83
ItemsAsReadOnlyDictionaryOverValues	10000	20,017.6 ns	360.47 ns	337.18 ns	0.76	0.03	-	40 B	0.83
ItemsAsDictionaryOverItems	10000	13,011.0 ns	246.49 ns	205.83 ns	0.49	0.02	-	-	0.00
ItemsAsDictionaryOverKeys	10000	64,904.2 ns	1,296.30 ns	1,440.84 ns	2.48	0.09	-	-	0.00
ItemsAsDictionaryOverValues	10000	13,194.4 ns	261.67 ns	340.25 ns	0.51	0.02	-	-	0.00

Conclusion

So given all of the above, I think the general advice is as follows:

Only declare the minimum dependency you need, to provide an API that is the most "open for extension" as possible. For sets, that's gonna be IEnumerable<T>. If you NEED .Count or this[], declare your minimum dependency as IReadOnlyList<T> or IReadOnlyDictionary<TKey, TValue>, depending. Don't request them just for the sake of better performance, because we don't actually GET reliably better performance they all perform basically the same as IEnumerable<T> when it comes to iterating.
Always use foreach over for when possible. Using for doesn't guarantee better performance, even accounting for the IEnumerator<T> allocation.
Add overloads for ReadOnlySpan<T> whenever it makes sense, because that one DEFINITELY provides better performance, AND they can be created in many different ways, AND they can be sliced on the consumer's side, so they're much more flexible to work with.
We COULD add overloads for List<T>, Array<T>, and Dictionary<TKey, TValue>, for performance, but I don't think we should. I don't think the performance gains are worth losing the API clarity that we are not mutating these collections, only reading from them. In the event that callers are concerned with performance at this level, they should really be using the ReadOnlySpan<T> overloads.

RolandPheasant commented 6 months ago

We COULD add overloads for List, Array, and Dictionary<TKey, TValue>, for performance, but I don't think we should. I don't think the performance gains are worth losing the API clarity that we are not mutating these collections, only reading from them. In the event that callers are concerned with performance at this level, they should really be using the ReadOnlySpan overloads.

I am curious to see how all of this will be consumed. The reason I introduced ChangeAwareCache and ChangeAwareList was to simplify changeset construction, which in turn made it much simpler to provide:

Consistency for the various operators
Optimization can be applied across the system
It's super tedious to implement the operator and construct the change sets in the same code i.e the logic becomes very complicated and even worse to maintain.
In the case of ChangeAwareList, I'd go as far as to say I would have withdrawn the list stuff without the optimizations it provides (that's mostly another topic, but making list construction optimal is vital if we are to support it in vNext).

Also, for nearly all constructed change sets in the real world we construct them internally and we can optimize as we see fit. Whereas for the occasional user based custom operator the performance impact of how the changeset is constructed would probably be minimal.

JakenVeina commented 6 months ago

Equivalents for ChangeAwareCache and ChangeAwareList were next on my todo list for this project, as well. It may be that these convenience methods don't get much use internally, so the question then remains whether they make sense to keep for public use. That question basically amounts to "how useful is it to be able to create ChangeSets without having to populate a collection first?" This seems like a pretty-darn-rare scenario, but I can think of one concrete example: the .NET FileSystemWatcher API, which exposes several different plain events for changes occurring on the file system, .Changed, .Created, .Deleted, .Renamed. Each of these could be translated directly into an equivalent ChangeSet without having to consume the extra memory of pumping them into a collection first.

I think it would be a mistake to prevent consumers from creating ChangeSets for themselves, if they want to, but that doesn't mean we need to provide convenience methods for it. We DO, I think, definitely need the construction methods upon the Change types, as those are part of guaranteeing correctness. Although, now that I think about it, maybe the same argument applies to ChangeSet types: as it is now, someone could create a ChangeSet of type Clear that contains changes other than just Removes. If we lock down the ChangeSet types the way Change types are, we could prevent that.

github-actions[bot] commented 5 months ago

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

reactivemarbles / DynamicDataVNext

Reworked ChangeSet factory methods, after more research #4

List Enumeration Benchmark Results

Array Enumeration Benchmark Results

Dictionary Enumeration Benchmark Results

Conclusion