dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.54k stars 4.54k forks source link

Make mutable generic collection interfaces implement read-only collection interfaces #31001

Open TylerBrinkley opened 4 years ago

TylerBrinkley commented 4 years ago

Rationale

It's long been a source of confusion that the mutable generic collection interfaces don't implement their respective read-only collection interfaces. This was of course due to the read-only collection interfaces being added after the fact and thus would cause breaking changes by changing a published interface API.

With the addition of default interface implementations in C#8/.NET Core 3.0 I think the mutable generic collection interfaces, ICollection<T>, IList<T>, IDictionary<K, V>, and ISet<T> should now implicitly inherit their respective read-only collection interfaces. This can now be done without causing breaking changes.

While it would have been nice for these interfaces to share members, I think the proposed API below is the best we can possibly do with the read-only interfaces being added after the fact.

As an added bonus, this should allow some simplification of the type checking in LINQ code to check for the read-only interfaces instead of the mutable interfaces.

Proposed API

 namespace System.Collections.Generic {
-    public interface ICollection<T> : IEnumerable<T> {
+    public interface ICollection<T> : IReadOnlyCollection<T> {
-        int Count { get; }
+        new int Count { get; }
+        int IReadOnlyCollection<T>.Count => Count;
     }
-    public interface IList<T> : ICollection<T> {
+    public interface IList<T> : ICollection<T>, IReadOnlyList<T> {
-        T this[int index] { get; set; }
+        new T this[int index] { get; set; }
+        T IReadOnlyList<T>.this[int index] => this[index];
     }
-    public interface IDictionary<TKey, TValue> : ICollection<KeyValuePair<TKey, TValue>> {
+    public interface IDictionary<TKey, TValue> : ICollection<KeyValuePair<TKey, TValue>>, IReadOnlyDictionary<TKey, TValue> {
-        TValue this[TKey key] { get; set; }
+        new TValue this[TKey key] { get; set; }
-        ICollection<TKey> Keys { get; }
+        new ICollection<TKey> Keys { get; }
-        ICollection<TValue> Values { get; }
+        new ICollection<TValue> Values { get; }
-        bool ContainsKey(TKey key);
+        new bool ContainsKey(TKey key);
-        bool TryGetValue(TKey key, out TValue value);
+        new bool TryGetValue(TKey key, out TValue value);
+        TValue IReadOnlyDictionary<TKey, TValue>.this[TKey key] => this[key];
+        IEnumerable<TKey> IReadOnlyDictionary<TKey, TValue>.Keys => Keys;
+        IEnumerable<TValue> IReadOnlyDictionary<TKey, TValue>.Values => Values;
+        bool IReadOnlyDictionary<TKey, TValue>.ContainsKey(TKey key) => ContainsKey(key);
+        bool IReadOnlyDictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value) => TryGetValue(key, out value);
     }
-    public interface ISet<T> : ICollection<T> {
+    public interface ISet<T> : ICollection<T>, IReadOnlySet<T> {
-        bool IsProperSubsetOf(IEnumerable<T> other);
+        new bool IsProperSubsetOf(IEnumerable<T> other);
-        bool IsProperSupersetOf(IEnumerable<T> other);
+        new bool IsProperSupersetOf(IEnumerable<T> other);
-        bool IsSubsetOf(IEnumerable<T> other);
+        new bool IsSubsetOf(IEnumerable<T> other);
-        bool IsSupersetOf(IEnumerable<T> other);
+        new bool IsSupersetOf(IEnumerable<T> other);
-        bool Overlaps(IEnumerable<T> other);
+        new bool Overlaps(IEnumerable<T> other);
-        bool SetEquals(IEnumerable<T> other);
+        new bool SetEquals(IEnumerable<T> other);
// Adding this new member is required so that there's a most specific Contains method on ISet<T> since ICollection<T> and IReadOnlySet<T> define it too
+        new bool Contains(T value) => ((ICollection<T>)this).Contains(value); 
+        bool IReadOnlySet<T>.Contains(T value) => ((ICollection<T>)this).Contains(value);
+        bool IReadOnlySet<T>.IsProperSubsetOf(IEnumerable<T> other) => IsProperSubsetOf(other);
+        bool IReadOnlySet<T>.IsProperSupersetOf(IEnumerable<T> other) => IsProperSupersetOf(other);
+        bool IReadOnlySet<T>.IsSubsetOf(IEnumerable<T> other) => IsSubsetOf(other);
+        bool IReadOnlySet<T>.IsSupersetOf(IEnumerable<T> other) => IsSupersetOf(other);
+        bool IReadOnlySet<T>.Overlaps(IEnumerable<T> other) => Overlaps(other);
+        bool IReadOnlySet<T>.SetEquals(IEnumerable<T> other) => SetEquals(other);
+    }
 }

Binary Compatibility Test

I was able to test that this change doesn't break existing implementers with the following custom interfaces and by simply dropping the new interfaces dll to the publish folder without recompiling the consuming code, the IMyReadOnlyList<T> interface was automatically supported without breaking the code.

Original Interfaces DLL code

namespace InterfaceTest
{
    public interface IMyReadOnlyList<T>
    {
        int Count { get; }
        T this[int index] { get; }
    }

    public interface IMyList<T>
    {
        int Count { get; }
        T this[int index] { get; set; }
    }
}

New Interfaces DLL code

namespace InterfaceTest
{
    public interface IMyReadOnlyList<T>
    {
        int Count { get; }
        T this[int index] { get; }
    }

    public interface IMyList<T> : IMyReadOnlyList<T>
    {
        new int Count { get; }
        new T this[int index] { get; set; }
        int IMyReadOnlyList<T>.Count => Count;
        T IMyReadOnlyList<T>.this[int index] => this[index];
    }
}

Consuming Code

using System;
using System.Collections.Generic;

namespace InterfaceTest
{
    class Program
    {
        static void Main()
        {
            var myList = new MyList<int>();
            Console.WriteLine($"MyList<int>.Count: {myList.Count}");
            Console.WriteLine($"IMyList<int>.Count: {((IMyList<int>)myList).Count}");
            Console.WriteLine($"IMyReadOnlyList<int>.Count: {(myList as IMyReadOnlyList<int>)?.Count}");
            Console.WriteLine($"MyList<int>[1]: {myList[1]}");
            Console.WriteLine($"IMyList<int>[1]: {((IMyList<int>)myList)[1]}");
            Console.WriteLine($"IMyReadOnlyList<int>[1]: {(myList as IMyReadOnlyList<int>)?[1]}");
        }
    }

    public class MyList<T> : IMyList<T>
    {
        private readonly List<T> _list = new List<T> { default, default };

        public T this[int index] { get => _list[index]; set => _list[index] = value; }

        public int Count => _list.Count;
    }
}

Original Output

MyList<int>.Count: 2
IMyList<int>.Count: 2
IMyReadOnlyList<int>.Count:
MyList<int>[1]: 0
IMyList<int>[1]: 0
IMyReadOnlyList<int>[1]:

New Output

MyList<int>.Count: 2
IMyList<int>.Count: 2
IMyReadOnlyList<int>.Count: 2
MyList<int>[1]: 0
IMyList<int>[1]: 0
IMyReadOnlyList<int>[1]: 0

Moved from dotnet/runtime#16151

Updates

MichalStrehovsky commented 7 months ago

I have created a repo trying to replicate the shared diamond problem and it doesn't seem to exist:

https://github.com/terrajobst/ReadOnlyCollectionInterfaces

I suggest we put this work on early for .NET 9.0 and see whether anything shows up. So far, it looks promising.

I've submitted two PRs to the repo that show two different failure modes - can be useful for discussions around the extent of this breaking change. If someone has other ideas for how this could break, it would probably be good to collect everything in the form of PRs to the repo.

This could be used to e.g. scope out how many things would be broken by scanning NuGet for these patterns. Ultimately we'd learn the most by trying this very early in .NET 9 so that we have runway to roll back if needed.

This would be better to land at or before https://github.com/dotnet/csharplang/blob/main/proposals/ref-struct-interfaces.md language feature lands because at that point adding instance default interface methods to existing interfaces becomes universally source+binary breaking (search for "default interface methods" in that doc).

jhudsoncedaron commented 7 months ago

Higher up in the thread somebody posted how to patch the loader to get it to auto-resolve those diamonds with a hinting attribute. So you can in fact reduce the breakage points to zero.

bartonjs commented 7 months ago

Video

It sounds like we're ready to try the experiment. To everyone who is celebrating right now, remember: if it goes poorly we're going to back it out :smile:.

 namespace System.Collections.Generic {
-    public interface ICollection<T> : IEnumerable<T> {
+    public interface ICollection<T> : IReadOnlyCollection<T> {
-        int Count { get; }
+        new int Count { get; }
+        int IReadOnlyCollection<T>.Count => Count;
     }
-    public interface IList<T> : ICollection<T> {
+    public interface IList<T> : ICollection<T>, IReadOnlyList<T> {
-        T this[int index] { get; set; }
+        new T this[int index] { get; set; }
+        T IReadOnlyList<T>.this[int index] => this[index];
     }
-    public interface IDictionary<TKey, TValue> : ICollection<KeyValuePair<TKey, TValue>> {
+    public interface IDictionary<TKey, TValue> : ICollection<KeyValuePair<TKey, TValue>>, IReadOnlyDictionary<TKey, TValue> {
-        TValue this[TKey key] { get; set; }
+        new TValue this[TKey key] { get; set; }
-        ICollection<TKey> Keys { get; }
+        new ICollection<TKey> Keys { get; }
-        ICollection<TValue> Values { get; }
+        new ICollection<TValue> Values { get; }
-        bool ContainsKey(TKey key);
+        new bool ContainsKey(TKey key);
-        bool TryGetValue(TKey key, out TValue value);
+        new bool TryGetValue(TKey key, out TValue value);
+        TValue IReadOnlyDictionary<TKey, TValue>.this[TKey key] => this[key];
+        IEnumerable<TKey> IReadOnlyDictionary<TKey, TValue>.Keys => Keys;
+        IEnumerable<TValue> IReadOnlyDictionary<TKey, TValue>.Values => Values;
+        bool IReadOnlyDictionary<TKey, TValue>.ContainsKey(TKey key) => ContainsKey(key);
+        bool IReadOnlyDictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value) => TryGetValue(key, out value);
     }
-    public interface ISet<T> : ICollection<T> {
+    public interface ISet<T> : ICollection<T>, IReadOnlySet<T> {
-        bool IsProperSubsetOf(IEnumerable<T> other);
+        new bool IsProperSubsetOf(IEnumerable<T> other);
-        bool IsProperSupersetOf(IEnumerable<T> other);
+        new bool IsProperSupersetOf(IEnumerable<T> other);
-        bool IsSubsetOf(IEnumerable<T> other);
+        new bool IsSubsetOf(IEnumerable<T> other);
-        bool IsSupersetOf(IEnumerable<T> other);
+        new bool IsSupersetOf(IEnumerable<T> other);
-        bool Overlaps(IEnumerable<T> other);
+        new bool Overlaps(IEnumerable<T> other);
-        bool SetEquals(IEnumerable<T> other);
+        new bool SetEquals(IEnumerable<T> other);
// Adding this new member is required so that there's a most specific Contains method on ISet<T> since ICollection<T> and IReadOnlySet<T> define it too
+        new bool Contains(T value) => ((ICollection<T>)this).Contains(value); 
+        bool IReadOnlySet<T>.Contains(T value) => ((ICollection<T>)this).Contains(value);
+        bool IReadOnlySet<T>.IsProperSubsetOf(IEnumerable<T> other) => IsProperSubsetOf(other);
+        bool IReadOnlySet<T>.IsProperSupersetOf(IEnumerable<T> other) => IsProperSupersetOf(other);
+        bool IReadOnlySet<T>.IsSubsetOf(IEnumerable<T> other) => IsSubsetOf(other);
+        bool IReadOnlySet<T>.IsSupersetOf(IEnumerable<T> other) => IsSupersetOf(other);
+        bool IReadOnlySet<T>.Overlaps(IEnumerable<T> other) => Overlaps(other);
+        bool IReadOnlySet<T>.SetEquals(IEnumerable<T> other) => SetEquals(other);
+    }
 }
MichalStrehovsky commented 7 months ago

Higher up in the thread somebody posted how to patch the loader to get it to auto-resolve those diamonds with a hinting attribute. So you can in fact reduce the breakage points to zero.

That's not true: https://github.com/terrajobst/ReadOnlyCollectionInterfaces/pull/1 doesn't involve conflict in default implementations. There's no diamond and even no default method in user code. The proposed throwaway loader feature would not fix that.

(I'm calling the feature throwaway because it can only be used if there's no other code using the new attribute marking things as fallback default implementations: we can only use it for one release - once this "fallback implementation" feature ships, we'll need "fallback of a fallback" attribute to use it again because now one can have fallbacks in user code. The feature is the type loader equivalent of topmost window problem)

TylerBrinkley commented 7 months ago

I'm not saying I'm advocating for this hinting attribute but couldn't you specify a priority property on the attribute to solve that issue and then reserve the highest priority for framework usage only?

terrajobst commented 7 months ago

Thanks to @MichalStrehovsky to submit PRs that illustrate specific failure conditions. That helps a great deal!

However, I don't think the two failure modes that @MichalStrehovsky pointed out change my rationale on why I think we should consider this change:

  1. Binding to existing virtual members. This is illustrated by https://github.com/terrajobst/ReadOnlyCollectionInterfaces/pull/1.
  2. Ambiguous implementation when combined with user-defined DIMs. This is illustrated by https://github.com/terrajobst/ReadOnlyCollectionInterfaces/pull/2.

The second one feels very rare because DIMs in user-defined types in the larger library ecosystem feel rare. I don't have data on this yet, but I plan to crawl nuget.org and report back how many there are. It's not gonna be zero, but it's also not gonna rise to 10-30% either. I suspect the use is less than 1%.

The first one also feels rare: How often will people have signatures from members in the IReadOnlyXxx interfaces that are virtual but unrelated to the semantics of the collection type? It is of course possible to construct them, but it doesn't seem very likely.

This is the totality of all members defined by IReadOnlyXxx interfaces that would have that problem:

int Count { get; }
T this[int index] { get; }
TValue this[TKey key] { get; }
IEnumerable<TKey> Keys { get; }
IEnumerable<TValue> Values { get; }

bool ContainsKey(TKey key);
bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value);

I would think if a type extends any of the mutable collection interfaces and defines these members as virtual members than their semantics would likely match those of the underlying IReadOnlyXxx interfaces.

Now, of course I can't conclusively prove any of this. But I think my reasoning is plausible enough to give this experiment a try early in .NET 9 and see what (if anything) breaks as a result of this.

bartonjs commented 7 months ago

How often will people have signatures from members in the IReadOnlyXxx interfaces that are virtual but unrelated to the semantics of the collection type?

Presumably that's analyzable. At least in the linked case, since it's a) it has explicitly implemented ICollection.Count, and b) has a (different) public member named Count. In that case, IReadOnlyCollection.Count will be considered implicitly implemented, and not lifted from the DIM on ICollection.

terrajobst commented 7 months ago

@bartonjs

Presumably that's analyzable.

Maybe but I'll struggle with writing down what I would be looking for. There are many ways one could share the logic/semantics without necessarily binding to the same method across all v-table slots.

joshudson commented 7 months ago

Incidentally it's also fixable. Define a new attribute for default interface mehods that means "do not look for public methods to replace this with, but only methods declared to implement the interface.

I can tell that random public methods of the same name as interface methods don't necessarily impute themselves onto newly added interface methods because the opposite behavior exists in VB.NET; adding a (non-default) interface method results in a load error in VB.NET until the method is implemented in the class implementing the interface.

MichalStrehovsky commented 7 months ago

Maybe but I'll struggle with writing down what I would be looking for. There are many ways one could share the logic/semantics without necessarily binding to the same method across all v-table slots.

The low tech approach could be just about looking at the list of methods. This would have to be explicit interface method implementation and the C# compiler mangles them in a distinct way.

If there's a method with name System.Collections.Generic.ICollection\<.*\>.get_Count in the base hierarchy, and another method with name get_Count (without the prefix) that returns an int, the class is likely susceptible to this problem. One could restrict to just virtual method (if get_Count is not virtual, it would not be a binary breaking change, but it would still be a source breaking change because the C# compiler is going to "helpfully" mark the method virtual if it sees it could implement an interface).

Incidentally it's also fixable. Define a new attribute for default interface mehods that means "do not look for public methods to replace this with, but only methods declared to implement the interface.

If I understand it right, the fix would cause class Foo : ICollection<T> { public int Count { ... } } to not have IReadOnlyCollection.Count implemented by Foo, but by the default method on ICollection even in the future when we already shipped ICollection : IReadOnlyCollection. That sounds like a worse problem than the original problem. The default implementations are sub-optimal. They cause double interface call in spots that could just be one interface call. They also create an interface dispatch hotspot in the default implementation. Dynamic PGO is going to have a difficult time getting rid of them (because the callsite will likely be superpolymorphic). We want to run the convenience default implementations as little as possible.

eiriktsarpalis commented 6 months ago

If we want to run this experiment this year it would probably need to be merged before the end of January in time for Preview 1. Any volunteers willing to source an implementation?

joshudson commented 6 months ago

If nobody else steps up by Jan 2, I'll try it; but an earlier attempt would be better. Sorry, I cannot do it this month.

terrajobst commented 6 months ago

@joshudson

Sorry, I cannot do it this month.

No need to apologize for not being able to volunteer your time to an OSS project, especially over the holiday period :-)

TylerBrinkley commented 6 months ago

I'll try my hand at it with #95830.

eiriktsarpalis commented 2 months ago

Change got reverted in https://github.com/dotnet/runtime/pull/101644, reopening to keep track of next steps.

mikernet commented 2 months ago

😭

What are the next steps?

RenderMichael commented 2 months ago

With respect to C++/CLI (I’ve never used it but it sounds neat), this change is not a breaking change in any meaningful way, and is only breaking on C++/CLI due to a limitation in their type system.

I’m not sure if it’s considered a bug to be fixed or a new feature that requires some cycles to design and implement, but I do hope this can be resolved on C++/CLI’s side and the change go through again. It would be a colossal disappointment if this change gets denied.

huoyaoyuan commented 2 months ago

Previously DIM in static interface method were blocked by C++/CLI too and temporarily reverted until RTM. Hopefully C++/CLI can unblock on this too. I'm OK to see this change delayed to .NET 10, but unwilling to see it blocked forever.

joshudson commented 2 months ago

On considering the failure mode in C++/CLR; I am reasonably convinced it is not an ABI breaking change but only an API breaking change (that is, it only fails at compile time). That being said, the required source fix is ugly and something should be done.

On further analysis, this seems to be a bug in the C++/CLR compiler. The problem happened upon adding IList : IReadOnlyList and a type was found that implements IList but doesn't expose a public method called getCount itself.

If the type follows CLR rules, it doesn't have a public method called getCount; therefore the type must follow C++ rules. On synthesizing the C++ view of the combined type I got:

#include <iostream>

class IEnumerable {
};

class IReadOnlyList: public virtual IEnumerable {
public:
    int getCount() { return 3; }
};

class IList: public virtual IEnumerable {
public:
    int getCount() { return 3; }
};

class MyList: public virtual IList {};

int main()
{
    MyList l;
    std::cout << l.getCount() << "\n";
}

Which builds and runs; the C++/CLR compiler is behaving as though the change imputed the following change to the code: class MyList: public virtual IList, public virtual IReadOnlyList {}; where MyList only declares an implementation of IList

kasperk81 commented 2 months ago

it's about a simple static cast https://github.com/dotnet/wpf/pull/9055#issuecomment-2080391288

TylerBrinkley commented 1 month ago

@eiriktsarpalis Any next steps on this yet? Is the path to fix the incompatibilities in C++/CLI in a general solution or perhaps just target the API's where this change causes issues? Thanks.

KennethHoff commented 1 month ago

Is there intention to try to get this back in time for a .Net 9 Preview? Is it being delayed to .Net 10+, or is it dead?

The various discussions I've seen/heard seem to indicate that this is just a temporary issue when C++/CLI, but how temporary are we talking? I hope it's not as temporary as the lack of adoption for C++ modules 😅

terrajobst commented 1 month ago

It's not dead yet but it depends on whether or not we can control the failure cases. At this point .NET 9 is extremely unlikely.