dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.06k stars 4.69k forks source link

Please add interface IReadOnlySet and make HashSet, SortedSet implement it #2293

Closed dsaf closed 4 years ago

dsaf commented 9 years ago

Original

Since IReadOnlyList was added the parity between sets and lists has declined. It would be great to re-establish it.

Using it would be an implementation-agnostic way of saying: "here is this read-only collection where items are unique".

Clearly it's needed by many people:

SQL Server: https://msdn.microsoft.com/en-us/library/gg503096.aspx Roslyn: https://github.com/dotnet/roslyn/blob/master/src/Compilers/Core/Portable/InternalUtilities/IReadOnlySet.cs Some Guy In Comments: http://blogs.msdn.com/b/bclteam/archive/2013/03/06/update-to-immutable-collections.aspx

I found this discussion when working on a real world problem where I wanted to use the key collection of a dictionary for read only set operations. In order to support that case here's the API I propose.

Edit

Rationale

Proposed API

 namespace System.Collections.Generic {
+    public interface IReadOnlySet<out T> : IReadOnlyCollection<T>, IEnumerable, IEnumerable<T> {
+        bool Contains(T value);
+        bool IsProperSubsetOf(IEnumerable<T> other);
+        bool IsProperSupersetOf(IEnumerable<T> other);
+        bool IsSubsetOf(IEnumerable<T> other);
+        bool IsSupersetOf(IEnumerable<T> other);
+        bool Overlaps(IEnumerable<T> other);
+        bool SetEquals(IEnumerable<T> other);
+    }
-    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
-    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
+    public class ReadOnlySet<T> : ICollection<T>, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, IReadOnlySet<T>, ISet<T> {
+        public int Count { get; }
+        public ReadOnlySet(ISet<T> set);
+        public bool Contains(T value);
+        public bool IsProperSubsetOf(IEnumerable<T> other);
+        public bool IsProperSupersetOf(IEnumerable<T> other);
+        public bool IsSubsetOf(IEnumerable<T> other);
+        public bool IsSupersetOf(IEnumerable<T> other);
+        public bool Overlaps(IEnumerable<T> other);
+        public bool SetEquals(IEnumerable<T> other);
+    }
     public class Dictionary<TKey, TValue> {
-        public sealed class KeyCollection : ICollection, ICollection<TKey>, IEnumerable, IEnumerable<TKey>, IReadOnlyCollection<TKey> {
+        public sealed class KeyCollection : ICollection, ICollection<TKey>, IEnumerable, IEnumerable<TKey>, IReadOnlyCollection<TKey>, IReadOnlySet<TKey> {
+            public bool IsProperSubsetOf(IEnumerable<TKey> other);
+            public bool IsProperSupersetOf(IEnumerable<TKey> other);
+            public bool IsSubsetOf(IEnumerable<TKey> other);
+            public bool IsSupersetOf(IEnumerable<TKey> other);
+            public bool Overlaps(IEnumerable<TKey> other);
+            public bool SetEquals(IEnumerable<TKey> other);
         }
     }
 }

Open Questions

Updates

scalablecory commented 6 years ago

I don't think you're going to be able to put a comparer on IReadOnlySet.

SortedSet takes an IComparer, while HashSet takes an IEqualityComparer.

aaron-meyers commented 6 years ago

Good point, the comparer could be considered an implementation detail that doesn't belong on a general interface.

Get Outlook for iOShttps://aka.ms/o0ukef


From: Cory Nelson notifications@github.com Sent: Thursday, May 10, 2018 5:04:06 PM To: dotnet/corefx Cc: Aaron Meyers; Mention Subject: Re: [dotnet/corefx] Please add interface IReadOnlySet and make HashSet, SortedSet implement it (#1973)

I don't think you're going to be able to put a comparer on IReadOnlySet.

SortedSet takes an IComparer, while HashSet takes an IEqualityComparer.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fdotnet%2Fcorefx%2Fissues%2F1973%23issuecomment-388221165&data=02%7C01%7C%7C0ef6d84125be4c450fdc08d5b6d2b70a%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636615938478979295&sdata=PHkDQPiJBEIks8yNyIA7vKODM%2BkMFI4PRX4lUUBu%2Bi0%3D&reserved=0, or mute the threadhttps://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAMuQLu5JGLcqrpMyGWLygbCsaSQSXFgNks5txNV2gaJpZM4E9KK-&data=02%7C01%7C%7C0ef6d84125be4c450fdc08d5b6d2b70a%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636615938478979295&sdata=9pnuMULuDu9HWb7un%2FWYq6iYdjTKFsjN7nKiToaeHkk%3D&reserved=0.

scalablecory commented 6 years ago

There may be some value in adding IUnorderedSet and IOrderedSet interfaces, but I wouldn't want that to derail pushing IReadOnlySet through.

NickRedwood commented 6 years ago

I like @pgolebiowski 's suggestion but would make that interface even more basic.

public interface ISetCharacteristic<in T>
{
    bool Contains(T item);
}

This is then suitable for uncountable sets such as Intervals (real) as well. Between that and IReadOnlySet you could potentially fit in a few other interfaces such as ICountableSet (aka IEnumerableSet), IUnorderedSet and IOrderedSet.

But agree that just IReadOnlySet would be a large improvement over what's currently available.

pgolebiowski commented 6 years ago

+1 for @NickRedwood

It's been open for over 3 years now. Please make this happen by changing the approach. Introducing the change mentioned by @NickRedwood allows for all the other enhancements that you are discussing in this thread. All the approaches agree about this single trivial thing. So let's add this single trivial thing which is fundamental and then create another issue for potential enhancements that are all optional.

@TylerBrinkley @safern @karelz @terrajobst

TylerBrinkley commented 6 years ago

I don't really see that much benefit to an IReadOnlySet<T> interface with only a Contains method. While that method is useful it's already included in the ICollection<T> interface which also has an IsReadOnly property intended to indicate whether the modification methods of ICollection<T> are supported. Just implement ICollection<T> and make IsReadOnly return true. The only other functionality it would be expected to support would be a Count property, being enumerable, and a CopyTo method which doesn't seem too limiting.

NickRedwood commented 6 years ago

I would be very happy for the proposed IReadOnlySet<T> to be implemented based on IReadOnlyCollection<T> - it would be a large improvement on what's currently available. (@TylerBrinkley - assume you meant IReadOnlyCollection<T> rather than ICollection<T>.

I guess I err on the side of more basic interfaces, and if there was one for Set then it would be a single-method-interface with a Contains method. It has a good mathematical basis and is very clean. However when retrofitting to existing classes I accept that retrofitting less is better, and if there was to be only one interface retrofitted then IReadOnlySet<T> based on IReadOnlyCollection<T> should be it.

jnm2 commented 6 years ago

IReadOnlyCollection<> doesn't have .Contains because that would prevent it from being covariant.

NickRedwood commented 6 years ago

Good point and I've corrected my earlier post to be contravariant. IReadOnlySet<T> would need to be invariant unless the approach posted early in the thread is used, and I'm unsure about that one.

MatthewLymer commented 6 years ago

@TylerBrinkley personally I'm not a fan of the IsReadOnly property, I would much rather have this information known at compile-time instead of run-time. If you were referring to the IReadOnlyCollection<T> interface then I still think it would be useful for the caller or callee to know that the lookup will be fast instead of a potential iteration though a collection, though I'm not certain if a "set" implies that.

TylerBrinkley commented 6 years ago

Having just a Contains method doesn't define what a Set should do, just what a Collection should do. I mean Contains is not even a member of ISet but of ICollection which ISet inherits from. The other members of ISet should be required to define what a Set should do. I understand most people probably use sets exclusively for it's Contains method but there's so much more to sets than just that.

binki commented 6 years ago

@TylerBrinkley But is it even possible to define IReadOnlyCollection<T>.Contains(T) at this point? I assumed not. That’s why the only option is to introduce IReadOnlySet<T> and, when introduced, ensure that IReadOnlySet<T>.Contains(T) is declared on it.

@jnm2 says:

IReadOnlyCollection<> doesn't have .Contains because that would prevent it from being covariant.

This seems to be the biggest issue to me right now. It would be odd for IReadOnlySet<T> to be invariant when IReadOnlyCollection<T> is covariant. It seems like many existing IReadOnly* interfaces were carefully deisgn to be out T with the writable interfaces being necessarily invariant. Most of us would want IReadOnlySet<T> to at least declare a Contains(T) method and yet that would prevent it from being covariant.

If ICollection really is the right place for Contains(T) to be defined, is that possible? Maybe IReadOnlyProbableCollection<T>.Contains(T) which would be invariant and implemented by Collection<T> and HashSet<T>?

I think @NickRedwood’s suggestion of ISetCharacteristic<T> seems cleaner maybe. That even has the benefit of allowing you not to implement Count which could be useful.

pgolebiowski commented 6 years ago

personally I'm not a fan of the IsReadOnly property, I would much rather have this information known at compile-time instead of run-time.

That man speaks well, pour him more wine.

TylerBrinkley commented 6 years ago

@binki I agree there needs to be a Contains method on IReadOnlySet<T> since IReadOnlyCollection<T> did not include one, however I also think all the other non-mutating ISet<T> methods should be included.

Besides the Dictionary.KeyCollection use case I mentioned above what other use cases can you come up with for adding this interface?

pgolebiowski commented 6 years ago

OK, looks like the API design is:

public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    bool Contains(T item);
}

It is the only common part of the design that everyone seems to be able to agree to.

I SAY THAT NOW WE END THE DESIGN HERE. If you want to extend it in the future, do it in a different issue. This is the most important functionality that should be in prod.

Who is for, who is against?

nawfalhasan commented 6 years ago

I like @ashmind 's suggestion. It would be awesome if Dictionary.Keys can implement this interface as it is technically IReadOnlySet<TKey>.

pgolebiowski commented 5 years ago

Could you please mark this as api-ready-for-review? I've just found myself needing this interface again and it is still not there.

fr0 commented 5 years ago

Why isn't this implemented yet?

karelz commented 5 years ago

[EDIT] Apologies - my original reply was mix up with another API. I don't have strong opinion about this one, text edited. Sorry for confusion!

Why isn't this implemented yet?

The API didn't get attention by area owners yet - @safern. I am not sure how high it is on Collections priority list. The upvote number is fairly high. The truth is that we are now locking down for .NET Core 3.0, so it will have to wait for next release at minimum as it is not critical for 3.0.

natalie-o-perret commented 5 years ago

I'm surprised too that this is not provided out of the box.

adamt06 commented 5 years ago

Immutable data structures can be a massive help towards improving code maintainability, but they are hard to use without an interface in the standard libraries that specifies semantics and allows different implementations to work together.

Without this, each library meaning to use an immutable set as a parameter/return value needs to resort to either using IEnumerable which is less efficient and semantically incorrect or using their own interfaces/classes in their signatures which is incompatible with anyone else's.

I'm curious if there are any workarounds for this that are efficient in terms of Contains lookups and avoid specifying the concrete type of the parameter/return value. The best I could think of off the top of my head would be to use IEnumerable in the signature and pass ReadOnlyDictionary.Keys, but that seems a bit nasty, especially in a library. Since Enumerable.Contains in Linq will use the collection implementation of Contains, this should be efficient while compatible, but does not communicate the intent which may lead to using less performant implementations.

scalablecory commented 5 years ago

@adamt06 You're looking for ISet<T>.

adamt06 commented 5 years ago

@scalablecory You are right, with an immutable implementation, and there is one: ImmutableHashSet<T>

generik0 commented 5 years ago

Does anyone know/understand why ICollection does not extend IReadOnlyCollection ?? I thought this was a little related to this thread. Instead of me opening a new thread. Maybe I am just misunderstanding something.

Another thought, but completely off topic, And why doesn't ReaOnlyCollection have:

        bool Contains(T item);
        void CopyTo(T[] array, int arrayIndex);
jnm2 commented 5 years ago

Something having to do with breaking changes, but I don't remember the details.

ReadOnlyCollection does have those things: https://docs.microsoft.com/en-us/dotnet/api/system.collections.objectmodel.readonlycollection-1.contains https://docs.microsoft.com/en-us/dotnet/api/system.collections.objectmodel.readonlycollection-1.copyto

jnm2 commented 5 years ago

Oh, I see you meant IReadOnlyCollection doesn't have them. It's because otherwise the interface could not be covariant.

MatthewLymer commented 5 years ago

I'm not sure, but it's probably something to do with explicit interface implementation causing problems with backwards compatibility. I can see why updating the existing interfaces to inherit from other interfaces from being a problem, but I don't see why creating a new interface IReadOnlySet<T> and having HashSet implement it would be an issue.

generik0 commented 5 years ago

OK. But is still also do not see ICollection should not be:

ICollection<T> : IReadOnlyCollection<out T>

Why should a read/write collection not also be an extension of a readonly collection?

binki commented 5 years ago

@generik0

OK. But is still also do not see ICollection should not be:

ICollection<out T> : IReadOnlyCollection<out T>

Why should a read/write collection not also be an extension of a readonly collection?

First, please note that it is impossible to declare ICollection<out T> because ICollection<T> has a member .Add(T item).

Second, note that ICollection<T> shows up in .net-2.0 and IReadOnlyCollection<out T> shows up in .net-4.5. If you compile code as implementing ICollection<T> and then ICollection<T> is changed to implement a new interface, any existing compiled binaries will break at runtime because anything only implementing ICollection<T> would not have its IReadOnlyCollection<T> slots filled in (possibly automatically) by the compiler. This is the compatibility issue that prevented ICollection<T> from implementing IReadOnlyCollection<T>.

I don’t think this is clearly addressed by this SO answer, but there is an SO for it: https://stackoverflow.com/a/14944400 .

generik0 commented 5 years ago

@binki "ICollection" fixed to ICollection, thanks for pointing out my typo.. DotNetStandard i net461. And this is where the change should be. netstandard 2+

I will repeat... Why should a read/write collection not also be an extension of a readonly collection?

And why should one need to cast a collection to e.g. "ToArray()" just to expose less if it?

And ofcause you can add new interfaces in higher versions without breaking things. ICollection already has teh IReadOnlyCollection methods implemented. Noting should break,... :-/

jnm2 commented 5 years ago

@generik0 Looks like it wouldn't break source compatibility which is what you're thinking of [wasn't thinking; it would], but it would break binary compatibility which works with IL tables, not C#.

generik0 commented 5 years ago

Ok @jnm2 thanks. I will stop my off topic rant now, just think it is bad architecture. Thanks everyone for listening / explaining:-)

binki commented 5 years ago

@jnm2

@generik0 Looks like it wouldn't break source compatibility which is what you're thinking of, but it would break binary compatibility which works with IL tables, not C#.

To nitpick (sorry), it would also break source compatibility if your source explicitly implemented ICollection<T> from prior to .net-4.5. The class would start failing to compile due to a failure to explicitly implement IReadOnlyCollection<T>.Count. But the binary compatibility is a bigger deal because it would prevent you from targeting a wide range of .net versions (you’d have to have different binaries for running on ≥.net-4.5 than <.net-2 regardless and you’d have to target both instead of just targeting the older framework).

TylerBrinkley commented 5 years ago

I'm wondering if with the addition of default interface implementation support in C#8 if ICollection<T> could implement IReadOnlyCollection<T> for .NET Core 3.0+.

generik0 commented 5 years ago

Maybe an extension is needed, then it is atleat the same collection. i.e. if you need to get an ienumerable or icollection to a reaadonlycollection ... Any thoughts?

[edited by @jnm2 comment]

public static class CollectionExtensions
    {
        public static IReadOnlyCollection<T> CastAsReadOnlyCollection<T>(this IEnumerable<T> collection)
        {
            if((collection is IReadOnlyCollection<T> readOnlyCollection ))
            {
                return readOnlyCollection ;
            }
            throw new ArgumentException($"{collection.GetType()} not supported as readonly collection");
        }
    }
jnm2 commented 5 years ago

@generik0 Why not use enumerable as IReadOnlyCollection<T> ?? throw ... or (IReadOnlyCollection<T>)enumerable instead of calling that method?

generik0 commented 5 years ago

@jnm2 omg. You are right. Sometimes you don’t see the clear picture and overcomplicate things!!!

I would still call the extension, just to get the simplicity ;-) Thanks mate....

oliverjanik commented 4 years ago

What's the status here? I'd really like this interface in standard lib and have HashSet<> implement it.

NickRedwood commented 4 years ago

It seems like our best bet may be to wait a while longer for the Shapes proposal to be (hopefully) implemented. Then you'd be able to build whatever group of shapes to represent Set types that you wanted, and provide implementations so that existing sets like HashSet<T> conform to them.

A community library to do exactly this could then emerge, covering all the various types of sets (intensional (predicates) and extensional (listed), ordered, partially ordered, unordered, countable, countably infinite, uncountable infinite, possibly mathematic domains too like Rational, Natural numbers etc) as different shapes, with all the union, intersection, cardinality methods defined for and between these sets.

oliverjanik commented 4 years ago

That's sounds to me like 5 years down the road. Why should a simple change that can be implemented in a day wait for some 1000x larger not-yet-specified feature that might not even happen?

NickRedwood commented 4 years ago

That's sounds to me like 5 years down the road. Why should a simple change that can be implemented in a day wait for some 1000x larger not-yet-specified feature that might not even happen?

I'm just responding to the lack of progress on IReadOnlySet<T> - this issue has already been open 4 years after all.

CodeAngry commented 4 years ago

The Microsoft way: The most simple and useful things take decades. It's mind blowing. 5 years and counting.

What's even more funny is that they have it in here https://docs.microsoft.com/en-us/dotnet/api/microsoft.sqlserver.management.sdk.sfc.ireadonlyset-1

danmoseley commented 4 years ago

@terrajobst thoughts?

terrajobst commented 4 years ago
 namespace System.Collections.Generic {
+    public interface IReadOnlySet<out T> : IReadOnlyCollection<T>, IEnumerable, IEnumerable<T> {
+        bool Contains(T value);
+        bool IsProperSubsetOf(IEnumerable<T> other);
+        bool IsProperSupersetOf(IEnumerable<T> other);
+        bool IsSubsetOf(IEnumerable<T> other);
+        bool IsSupersetOf(IEnumerable<T> other);
+        bool Overlaps(IEnumerable<T> other);
+        bool SetEquals(IEnumerable<T> other);
+    }
-    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
-    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
 }
MatthewLymer commented 4 years ago

Do it, I dare you!

svick commented 4 years ago

@terrajobst

We can't make ISet<T> extend IReadOnlySet<T> (for the same reason that we couldn't do for the other mutable interfaces).

Is this still true even with default interface methods? Does that mean https://github.com/dotnet/corefx/issues/41409 should be closed?

terrajobst commented 4 years ago

@terrajobst

We can't make ISet<T> extend IReadOnlySet<T> (for the same reason that we couldn't do for the other mutable interfaces).

Is this still true even with default interface methods? Does that mean dotnet/corefx#41409 should be closed?

We discussed this. We used to think that that DIMs would work, but when we walked the solution we concluded that it would result commonly in a shard diamond which would result in an ambiguous match. However, this was recently challenged so I think I have to write it down and make sure it's actually working or not working.

Jlalond commented 4 years ago

@terrajobst / @danmosemsft Has Anyone been assigned to this?

And, @terrajobst to clarify the work we want to achieve is:

We should also implement IReadOnlySet<T> on ImmutableSortedSet<T> and ImmutableHashSet<T> (and their builders)
We should scan for other implementations of ISet<T>.

As well as implementing the above interfaces on HashSet, SortedSet.

The scanning being just look for anything and bring it up if it looks questionable.

If this is still up for grabs I'd be interested

danmoseley commented 4 years ago

@Jlalond nope, assigned to you. Thanks for the offer.

Jlalond commented 4 years ago

@danmosemsft @terrajobst Just an update. I'm working on this, added the interface to private core Lib, just stumbling my way through having collections.generic and immutable pick up on this.

Last question if you know off your head Dan, do I need to make any changes to Mono for this? I'm not insightful on where corefx ends and mono starts. So if you know it could save me from some independent research