Open cmeyertons opened 4 years ago
Tagging subscribers to this area: @eiriktsarpalis, @jeffhandley See info in area-owners.md if you want to be subscribed.
> Because ICollection<T> implements IReadOnlyCollection<T>

It doesn't.
If I recall correctly, such suggestions to add (not to replace ICollection) fast paths for IReadOnlyCollection here and there were rejected several times because such casts to covariant interfaces were super slow. However, these performance issues were fixed as far as I know (cast caches, inlined checks), so maybe it is worth checking if we can add them in some places?
@davidwrighton are these kinds of "optimistic checks for interfaces" indeed much cheaper now than in the past?
cc @VSadov
> > Because ICollection<T> implements IReadOnlyCollection<T>
>
> It doesn't.

As I was! Egg on my face for sure. Apologies, I thought this would be a drop-in request. Thanks for the quick replies.
@danmosemsft a quick benchmark:
```csharp
static IEnumerable<string> strings = new List<string>();

[Benchmark]
public bool IsCollection() => strings is ICollection<string>;

[Benchmark]
public bool IsReadOnlyList() => strings is IReadOnlyCollection<string>;
```
.NET Core 2.2:
| Method | Mean | Error | StdDev |
|---------------- |----------:|----------:|----------:|
| IsCollection | 2.637 ns | 0.0038 ns | 0.0035 ns |
| IsReadOnlyList | 41.492 ns | 0.0911 ns | 0.0808 ns |
.NET Core 3.0:
| Method | Mean | Error | StdDev |
|---------------- |----------:|----------:|----------:|
| IsCollection | 1.069 ns | 0.0008 ns | 0.0007 ns |
| IsReadOnlyList | 40.578 ns | 0.0316 ns | 0.0264 ns |
.NET 5.0:
| Method | Mean | Error | StdDev |
|---------------- |----------:|----------:|----------:|
| IsCollection | 1.121 ns | 0.0010 ns | 0.0009 ns |
| IsReadOnlyList | 2.976 ns | 0.0094 ns | 0.0088 ns |
Related PR: https://github.com/dotnet/coreclr/pull/23548
Covariant interfaces are not super slow now.
Cost can vary for both regular interface casts and for fancy ones. A regular interface cast is a linear search, but typically does not need to search far. A cached cast may need to deal with hash collisions, but typically just gets a cached value.
As a veeery rough estimate, a fancy cast can be counted as about 2X the cost of a regular interface cast.
In the past, the cost of complicated casts was technically unbounded. As you nested variant generics, the cost would go up considerably. Thus they were avoided by library owners.
Thanks @EgorBo that is indeed much faster.
It is definitely faster than before, but there is still a non-zero cost to performance when making the suggested change.
I have a few possible concerns here.
- What about customers that only implement ICollection<T> and not IReadOnlyCollection<T>?
- If we mitigate concern #1 by having checks for both ICollection<T> and IReadOnlyCollection<T>, how much of a penalty does making the LINQ functions larger have?
- Do we have any concerns around customers who may have implemented IReadOnlyCollection<T> in such a way that it does not match the behavior of IEnumerable<T>? My guess is that we would not treat such scenarios specially, but it is a real possibility that customers with custom-written collections may have incorrect implementations in code that hasn't been tested.
- As @VSadov notes, the performance impact is now much less severe, but it's not nothing.
A good example is Linq's Count: https://github.com/dotnet/runtime/blob/master/src/libraries/System.Linq/src/System/Linq/Count.cs#L11-L46
An extra IReadOnlyCollection<T> check used to add a penalty to the O(N) foreach-based fallback (note the other fast paths), but that penalty is now ~15 times smaller.
Of course, it depends on how often users have IROC inputs. An IReadOnlyCollection<T> input currently leads to the O(N) loop, which can be quite slow for large collections.
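To make the trade-off concrete, here is a minimal sketch of a Count-style helper with the extra IReadOnlyCollection<T> check placed before the enumeration fallback. CountSketch is a hypothetical name used only for illustration; it is not the System.Linq source, which has additional fast paths.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper for illustration; not the actual System.Linq implementation.
public static class CountSketch
{
    public static int Count<T>(IEnumerable<T> source)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));

        // Existing-style fast paths: invariant generic and non-generic interface checks.
        if (source is ICollection<T> collection) return collection.Count;
        if (source is ICollection nonGeneric) return nonGeneric.Count;

        // The check under discussion: a cast to a covariant interface.
        // Every input that reaches the fallback pays for this check first.
        if (source is IReadOnlyCollection<T> readOnly) return readOnly.Count;

        // O(N) fallback: enumerate and count.
        int count = 0;
        foreach (T item in source)
        {
            checked { count++; }
        }
        return count;
    }
}
```

Inputs that implement none of the three interfaces (iterator methods, for example) still fall through to the loop, so the extra check is pure overhead for them.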
There were also perf issues with limited numbers of "fast" dictionary slots (see #11971) that should now be (largely) mitigated by the dynamic dictionary expansion added in 5.0.
I found quite a few complaints about, or rejected attempts at, optimizing LINQ for IReadOnlyCollection<T>:
- https://github.com/dotnet/runtime/issues/28651 - LINQ results implicit support for IReadOnlyCollection
> What about customers that only implement ICollection<T> and not IReadOnlyCollection<T>?

This can be expanded to different scenarios:
- The collection implements ICollection<T>, but is exposed as IReadOnlyCollection<T> in the public surface. This should be very common. Linq will not be impacted here.
- The collection only implements IReadOnlyCollection<T>. This depends on the actual implementation type:
  - ReadOnlyCollection<T> (including ReadOnlyObservableCollection<T>): while it's designed to be read-only, it still implements the non-readonly interfaces. Not the case.
  - ImmutableArray<T>: it has its own linq extension methods to avoid boxing. Won't worry about the default linq implementation.
- The collection implements ICollection<TDerived>, but is exposed as IReadOnlyCollection<TBase>, and gets linq called with <TBase>. This should be the scenario that most probably gets a performance impact. Covariant interface checks are slower, but they power this scenario.

There are a few other interfaces used to determine IEnumerable counts, also not in a subtype relationship with either ICollection<T> or IReadOnlyCollection<T>. Would it make sense to include those as well?
Next step should be to enumerate a list of methods that could benefit from specialization.
Related: #23337.
Hello everyone! I'm not sure if someone else thought about this before, but I just had an idea that could solve this problem. Why not introduce a new, non-generic interface to the BCL named "ICountable", with nothing more than a "Count" property? Then just make ICollection, ICollection<T> and IReadOnlyCollection<T> all implement this interface. That would easily solve the problem with covariant casts, since we don't even have type parameters anymore. And we could even simplify all the code paths with just a test for "ICountable".
> Why not introduce a new, non-generic interface

Adding more interfaces can make things a mess and worse. Not all classes will implement the new interface, so an additional interface check may be required.
Is there any progress on the issue?
Even with .NET 6.0, Linq functions which should be able to take advantage of indexed random-access collections, such as Skip(int), don't seem to be able to handle custom read-only collections that for example implement IReadOnlyList<T> but not IList<T> (...and why should they?) without unnecessarily poor performance.
No, there hasn't been any progress. The general conclusion is that we can't add new interface checks here without changing the interface checking mechanism. We've been kicking around the idea of an optimized type switch operation for a few years, but it would almost certainly make the most common case a little bit slower in exchange for allowing more scenarios to have roughly equivalent performance. However, we haven't built out that low level feature enough to see the practical impact on changing the common patterns in the Linq codebase.
Which is why we intentionally skipped checks for IReadOnlyCollection<T> in the new TryGetNonEnumeratedCount method (see #54764).
One possible alternative avenue to explore is introducing a common base interface for exposing the count, which should be possible using DIMs. Here's a sketch of that idea. We've generally resisted retrofitting old interfaces with DIMs so far though, since they can be susceptible to both source and runtime breaking changes.
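As a rough illustration of the base-interface idea, the sketch below uses a hypothetical ICountable interface and a hypothetical stand-in for IReadOnlyCollection<T> (IReadOnlyCollectionSketch<T>). None of these types exist in the BCL, and Ring<T> and CountableSketch are invented only to exercise them. It assumes C# 8 or later for default interface members.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical non-generic base interface exposing only the count.
public interface ICountable
{
    int Count { get; }
}

// Hypothetical stand-in for IReadOnlyCollection<T>; the real BCL interface
// is not declared this way.
public interface IReadOnlyCollectionSketch<out T> : IEnumerable<T>, ICountable
{
    new int Count { get; }

    // Default interface member: forwards ICountable.Count to this interface's
    // Count for implementers that do not map the base member themselves.
    int ICountable.Count => Count;
}

// Hypothetical custom read-only collection.
public sealed class Ring<T> : IReadOnlyCollectionSketch<T>
{
    private readonly T[] _items;
    public Ring(params T[] items) => _items = items;
    public int Count => _items.Length;
    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountableSketch
{
    // A single non-generic type check; no variant generic cast is involved.
    public static bool TryGetCount(object source, out int count)
    {
        if (source is ICountable countable)
        {
            count = countable.Count;
            return true;
        }
        count = 0;
        return false;
    }
}
```

The sketch only shows the shape of the idea. Retrofitting real interfaces this way carries the source and runtime compatibility risks mentioned above.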
@elgonzo consider this: ICollection<TSource>.IsReadOnly. Why not simply implement ICollection<TSource>?
> custom read-only collections that for example implement IReadOnlyList<T> but not IList<T> (...and why should they?)

P.S. I must admit this.
I'm joining the discussion regarding support for IReadOnlyCollection in TryGetNonEnumeratedCount. I favor adding support.
I came here after first writing my own version of this method. Then, IntelliSense alerted me to your version. The method names were similar.
At first, I was confused. Your code is similar to mine. Then, I was excited. I thought I could delete mine. Then, I was disappointed. I discovered that IReadOnlyCollection is not supported. It was an emotional rollercoaster 😄
While I remain in favor of adding support, I respect the opposing opinion. If the decision remains not to support it, I'd like to offer some alternate suggestions:
- Update the comments/documentation to make clear that IReadOnlyCollection is not supported. I expected its support, based on the current comments/documentation.
- Add an overload with an includeReadOnly parameter. This should not be a breaking change. It would allow consumers (other than LINQ) to choose the behavior they prefer.

I understand, given its history, that it's natural to consider this purely from the perspective of how the method fits into the larger LINQ ecosystem. However, not every use-case is LINQ-related. Mine was not.
That said, let me be clear, I am a huge fan of LINQ. It has probably saved me thousands of hours of coding over the years!
Thank you for your time and consideration.
This was done in https://github.com/dotnet/runtime/pull/101469 and then reverted in https://github.com/dotnet/runtime/pull/101644 as part of undoing the change to have the mutable collection interfaces inherit the readonly ones. When that revert is itself reverted, hopefully in .NET 10, this should naturally be addressed.
@stephentoub this is very exciting news. Thank you for sharing! I hope the reversion does indeed get reverted 😄
The whole IReadOnlyCollection<T> versus ICollection<T> issue has been a real pain for quite some time!
However, it might be worth updating the comments/documentation until this (hopefully?) becomes a reality. I assume .NET 10 is at least a year away.
In the meantime, I wrote my own overload method/extension with a supportReadOnly parameter. It invokes your version, which was superior to mine. I forgot about IIListProvider<T>. Also, I (shamefully) missed the non-generic ICollection.
For reference, in case it helps anyone else, here's my new method:
```csharp
public static bool TryGetNonEnumeratedCount<T>(this IEnumerable<T> source,
    out int count, bool supportReadOnly)
{
    // Defer to the built-in LINQ method first.
    if (source.TryGetNonEnumeratedCount(out count))
        return true;

    // Optionally fall back to IReadOnlyCollection<T>.
    if (!supportReadOnly || source is not IReadOnlyCollection<T> readable)
        return false;

    count = readable.Count;
    return true;
}
```
There are many places in the Linq / Collection code that leverage detecting if an IEnumerable<T> is an ICollection<T> to perform optimizations (e.g. presizing a new array, etc.). Example: List.cs
Because ICollection<T> implements IReadOnlyCollection<T>, IReadOnlyCollection<T> should be exclusively used in these scenarios to support custom IReadOnlyCollection<T> implementations that don't necessarily want to expose Add(T item).
Currently, collection authors have to implement ICollection to take advantage of the performance gains and leave Add throwing NotImplementedException to convey proper usage.
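
The presizing optimization described above can be sketched roughly as follows. PresizeSketch is a hypothetical helper written for illustration, not code from List.cs or System.Linq. It shows why a collection that does not implement ICollection<T> misses the fast path.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not the BCL implementation.
public static class PresizeSketch
{
    public static T[] ToArray<T>(IEnumerable<T> source)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));

        // Fast path: the count is known, so allocate once and bulk copy.
        if (source is ICollection<T> collection)
        {
            var result = new T[collection.Count];
            collection.CopyTo(result, 0);
            return result;
        }

        // Slow path: the size is unknown, so grow a buffer while enumerating.
        var buffer = new List<T>();
        foreach (T item in source)
        {
            buffer.Add(item);
        }
        return buffer.ToArray();
    }
}
```

A type that implements only IReadOnlyCollection<T> takes the slow path here even though it knows its own count.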