Closed JakenVeina closed 5 months ago
I am curious to see how all of this will be consumed. The reason I introduced ChangeAwareCache and ChangeAwareList was to simplify changeset construction, which in turn made it much simpler to provide:
Also, nearly all change sets in the real world are constructed internally, where we can optimize as we see fit, whereas for the occasional user-written custom operator the performance impact of how the changeset is constructed would probably be minimal.
Equivalents for `ChangeAwareCache` and `ChangeAwareList` were next on my todo list for this project, as well. It may be that these convenience methods don't get much use internally, so the question then remains whether they make sense to keep for public use. That question basically amounts to "how useful is it to be able to create `ChangeSet`s without having to populate a collection first?" This seems like a pretty-darn-rare scenario, but I can think of one concrete example: the .NET `FileSystemWatcher` API, which exposes several different plain events for changes occurring on the file system: `.Changed`, `.Created`, `.Deleted`, `.Renamed`. Each of these could be translated directly into an equivalent `ChangeSet` without having to consume the extra memory of pumping them into a collection first.

I think it would be a mistake to prevent consumers from creating `ChangeSet`s for themselves, if they want to, but that doesn't mean we need to provide convenience methods for it. We DO, I think, definitely need the construction methods upon the `Change` types, as those are part of guaranteeing correctness. Although, now that I think about it, maybe the same argument applies to `ChangeSet` types: as it is now, someone could create a `ChangeSet` of type `Clear` that contains changes other than just `Remove`s. If we lock down the `ChangeSet` types the way `Change` types are, we could prevent that.
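To make the "lock down" idea concrete, here is a minimal sketch. Every type in it (`SketchChange<T>`, `SketchChangeSet<T>`, and the two enums) is hypothetical and invented for illustration; none of it is the library's actual API. It only shows how private constructors plus factory methods could make an invalid `Clear` change set impossible to build.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins, NOT the library's real types.
public enum SketchChangeReason { Add, Remove }
public enum SketchChangeSetType { Update, Clear }

public sealed class SketchChange<T>
{
    // Private constructor: callers must go through the factory methods.
    private SketchChange(SketchChangeReason reason, T item)
    {
        Reason = reason;
        Item = item;
    }

    public SketchChangeReason Reason { get; }
    public T Item { get; }

    public static SketchChange<T> Add(T item)
        => new SketchChange<T>(SketchChangeReason.Add, item);

    public static SketchChange<T> Remove(T item)
        => new SketchChange<T>(SketchChangeReason.Remove, item);
}

public sealed class SketchChangeSet<T>
{
    private SketchChangeSet(SketchChangeSetType type, IReadOnlyList<SketchChange<T>> changes)
    {
        Type = type;
        Changes = changes;
    }

    public SketchChangeSetType Type { get; }
    public IReadOnlyList<SketchChange<T>> Changes { get; }

    // A Clear set is built from the removed items themselves, so it can
    // only ever contain Remove changes.
    public static SketchChangeSet<T> Clear(IEnumerable<T> removedItems)
        => new SketchChangeSet<T>(
            SketchChangeSetType.Clear,
            removedItems.Select(item => SketchChange<T>.Remove(item)).ToList());

    // An Update set accepts any mix of changes.
    public static SketchChangeSet<T> Update(IEnumerable<SketchChange<T>> changes)
        => new SketchChangeSet<T>(SketchChangeSetType.Update, changes.ToList());
}
```

With this shape, `SketchChangeSet<string>.Clear(new[] { "a", "b" })` is the only way to get a `Clear` set, and there is no overload that would let an `Add` slip into it.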
Creating this as a PR just for the sake of getting everyone's eyes on it.
So, I did some research and realized that one of the things I was trying to do doesn't actually work.
I figured that the JIT would take a method like...
...and convert it to more-specific methods, depending on which different collection types are passed to it, E.G...
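The original snippets are not reproduced here. As a stand-in, the following is a hypothetical sketch of the pattern being described, assuming a generic method whose collection type is itself the type parameter `TItems`. The class and method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GenericDispatchSketch
{
    // Hypothetical method: the collection type is a generic parameter,
    // constrained only to IEnumerable<int>.
    public static int Sum<TItems>(TItems items)
        where TItems : IEnumerable<int>
    {
        var total = 0;
        foreach (var item in items)
            total += item;
        return total;
    }

    public static int[] Demo()
    {
        var list = new List<int> { 1, 2, 3 };
        var array = new[] { 1, 2, 3 };
        var segment = new ArraySegment<int>(array); // ArraySegment<T> is a struct

        return new[]
        {
            Sum(list),    // TItems inferred as List<int>, a reference type
            Sum(array),   // TItems inferred as int[], a reference type
            Sum(segment), // TItems inferred as ArraySegment<int>, a value type
        };
    }
}
```

All three calls compile against the same source method; what differs is whether the runtime compiles a dedicated body for the given `TItems` or reuses a shared one.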
Thus, if the caller happens to have a `List<T>`, the method they call gets the benefit of `List<T>` optimizations in iterating and calling `.Count`. As it turns out, generic JIT only works this way when `TItems` is a value type. For reference types like `IEnumerable<T>`, `List<T>`, and `T[]`, the JIT doesn't generate separate implementations; it just maps them all to the `IEnumerable<T>` implementation.

Additionally, I also need to retract a bit of advice I gave a few months ago, when I had run benchmarks for iterating over a `List<T>`, which I went ahead and re-created here...

List Enumeration Benchmark Results
At the time, I had noticed that iterating over an `IReadOnlyList<T>` with just a plain `for` instead of a `foreach` was quite a bit faster. What I have since discovered is that this is only true if the `IReadOnlyList<T>` is actually a `List<T>` under the hood. If it's a `T[]` under the hood, using a `for` is actually worse.

Array Enumeration Benchmark Results
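For reference, the two loop shapes being compared look roughly like this. This is a sketch with hypothetical names, not the benchmark code itself, and it measures nothing; it only shows that both methods see the collection purely through `IReadOnlyList<T>`, regardless of whether a `List<T>` or an array is passed in.

```csharp
using System.Collections.Generic;

public static class EnumerationSketch
{
    // Indexed loop: goes through IReadOnlyList<T>.Count and the indexer
    // on every iteration.
    public static int SumWithFor(IReadOnlyList<int> items)
    {
        var total = 0;
        for (var i = 0; i < items.Count; i++)
            total += items[i];
        return total;
    }

    // Enumerator loop: one IEnumerator<T> is obtained up front, then
    // MoveNext/Current are called on each iteration.
    public static int SumWithForeach(IReadOnlyList<int> items)
    {
        var total = 0;
        foreach (var item in items)
            total += item;
        return total;
    }
}
```

Both methods accept `new List<int> { 1, 2, 3 }` and `new[] { 1, 2, 3 }` alike, which is exactly why the callee cannot know which loop shape will be cheaper.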
I also went ahead and ran one for `Dictionary<TKey, TValue>`.

Dictionary Enumeration Benchmark Results
Conclusion
So given all of the above, I think the general advice is as follows:
- Default to `IEnumerable<T>`. If you NEED `.Count` or `this[]`, declare your minimum dependency as `IReadOnlyList<T>` or `IReadOnlyDictionary<TKey, TValue>`, depending. Don't request them just for the sake of better performance, because we don't actually GET reliably better performance: they all perform basically the same as `IEnumerable<T>` when it comes to iterating.
- Prefer `foreach` over `for` when possible. Using `for` doesn't guarantee better performance, even accounting for the `IEnumerator<T>` allocation.
- Use `ReadOnlySpan<T>` whenever it makes sense, because that one DEFINITELY provides better performance, AND spans can be created in many different ways, AND they can be sliced on the consumer's side, so they're much more flexible to work with.
- We COULD add overloads for `List<T>`, `T[]`, and `Dictionary<TKey, TValue>`, for performance, but I don't think we should. I don't think the performance gains are worth losing the API clarity that we are not mutating these collections, only reading from them. In the event that callers are concerned with performance at this level, they should really be using the `ReadOnlySpan<T>` overloads.
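As an illustration of the flexibility point about `ReadOnlySpan<T>`, here is a small sketch with hypothetical names. It assumes .NET 5 or later, since `CollectionsMarshal.AsSpan` is only available there. A single span-based method is fed from an array, a slice of that array, a `List<T>`, and stack memory.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class SpanSketch
{
    // Hypothetical consumer: reads from a span without being able to mutate it.
    public static int Sum(ReadOnlySpan<int> items)
    {
        var total = 0;
        foreach (var item in items)
            total += item;
        return total;
    }

    public static int[] Demo()
    {
        int[] array = { 1, 2, 3, 4 };
        var list = new List<int> { 1, 2, 3, 4 };
        Span<int> onStack = stackalloc int[] { 1, 2, 3, 4 };

        return new[]
        {
            Sum(array),                           // array converts implicitly
            Sum(array.AsSpan(1, 2)),              // caller-side slice: elements 2 and 3
            Sum(CollectionsMarshal.AsSpan(list)), // view over the list's backing array (.NET 5+)
            Sum(onStack),                         // stack-allocated memory
        };
    }
}
```

Note that a span obtained through `CollectionsMarshal.AsSpan` is only valid while the list is not resized, so the caller takes on that responsibility.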