dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Please add interface IReadOnlySet and make HashSet, SortedSet implement it #2293

Closed dsaf closed 4 years ago

dsaf commented 9 years ago

Original

Since IReadOnlyList was added the parity between sets and lists has declined. It would be great to re-establish it.

Using it would be an implementation-agnostic way of saying: "here is this read-only collection where items are unique".

Clearly it's needed by many people:

SQL Server: https://msdn.microsoft.com/en-us/library/gg503096.aspx
Roslyn: https://github.com/dotnet/roslyn/blob/master/src/Compilers/Core/Portable/InternalUtilities/IReadOnlySet.cs
Some Guy In Comments: http://blogs.msdn.com/b/bclteam/archive/2013/03/06/update-to-immutable-collections.aspx

I found this discussion when working on a real world problem where I wanted to use the key collection of a dictionary for read only set operations. In order to support that case here's the API I propose.

Edit

Rationale

Proposed API

 namespace System.Collections.Generic {
+    public interface IReadOnlySet<out T> : IReadOnlyCollection<T>, IEnumerable, IEnumerable<T> {
+        bool Contains(T value);
+        bool IsProperSubsetOf(IEnumerable<T> other);
+        bool IsProperSupersetOf(IEnumerable<T> other);
+        bool IsSubsetOf(IEnumerable<T> other);
+        bool IsSupersetOf(IEnumerable<T> other);
+        bool Overlaps(IEnumerable<T> other);
+        bool SetEquals(IEnumerable<T> other);
+    }
-    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
-    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
+    public class ReadOnlySet<T> : ICollection<T>, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, IReadOnlySet<T>, ISet<T> {
+        public int Count { get; }
+        public ReadOnlySet(ISet<T> set);
+        public bool Contains(T value);
+        public bool IsProperSubsetOf(IEnumerable<T> other);
+        public bool IsProperSupersetOf(IEnumerable<T> other);
+        public bool IsSubsetOf(IEnumerable<T> other);
+        public bool IsSupersetOf(IEnumerable<T> other);
+        public bool Overlaps(IEnumerable<T> other);
+        public bool SetEquals(IEnumerable<T> other);
+    }
     public class Dictionary<TKey, TValue> {
-        public sealed class KeyCollection : ICollection, ICollection<TKey>, IEnumerable, IEnumerable<TKey>, IReadOnlyCollection<TKey> {
+        public sealed class KeyCollection : ICollection, ICollection<TKey>, IEnumerable, IEnumerable<TKey>, IReadOnlyCollection<TKey>, IReadOnlySet<TKey> {
+            public bool IsProperSubsetOf(IEnumerable<TKey> other);
+            public bool IsProperSupersetOf(IEnumerable<TKey> other);
+            public bool IsSubsetOf(IEnumerable<TKey> other);
+            public bool IsSupersetOf(IEnumerable<TKey> other);
+            public bool Overlaps(IEnumerable<TKey> other);
+            public bool SetEquals(IEnumerable<TKey> other);
         }
     }
 }

Open Questions

Updates

whoisj commented 9 years ago

Wouldn't it be nice if we just had some language construct to make things immutable? Then we would not have to have these magic interfaces.

HaloFour commented 9 years ago

@whoisj Which language? The CLR has dozens of them.

Even as a potential language feature it would require a metadata representation. For this case a marker interface (or behavioral interface) is as good as any. Trying to convey the immutability of a collection type through a new metadata entry doesn't seem appropriate since the CLR shouldn't be making assumptions as to how an arbitrary class functions internally (and the CLR has no concept of collection classes at all aside from arrays).

dsaf commented 9 years ago

@whoisj I think this is at least considered for one of the future C# versions. But that decision doesn't affect the need for symmetrical interfaces across all collections. Furthermore I can imagine scenarios where a readonly list of mutable items could be useful, e.g. in games that care about both code quality and performance.

Also immutable collections are already available:

https://msdn.microsoft.com/en-us/library/system.collections.immutable(v=vs.111).aspx

To achieve a fully immutable collection we just need a way of defining an immutable T and then use it to declare an Immutable...<T> collection.

whoisj commented 9 years ago

@HaloFour we've been down this road before :smirk: but I still believe the CLR needs a way to say "here's a handle, read from it but all actions that cause any kind of write through it will fail; oh and this immutability is contagious so any handle reached from the immutable handle is also immutable (including this)".

@dsaf absolutely! In another issue I proposed that we have a writable anti-term for readonly to enable the use of a readonly collection of writable elements. Something along the lines of readonly Bag<writable Element>.

I suggested that any reference marked with a & be treated as immutable by the compiler. I still feel it need only be a compile-time check, and not necessarily enforced by the CLR itself, as I'm mostly seeking compile-time verification of logic and not run-time guarantees. This would cover any reference that developers wanted to be immutable, but on a per-call basis.

HaloFour commented 9 years ago

@whoisj Perhaps, but that's pretty tangential and it turns this request from something dsaf could branch/PR this afternoon into something that involves effort across three different teams.

You're also treating this as a compiler concern. At this point there isn't a compiler involved (beyond the JIT compiler) and only the verifier can attempt to prevent "improper" code from executing. Even the existing runtime mechanisms of immutability, initonly fields, can be easily defeated if verification is skipped (or via reflection).

I do agree that it would be nice if the C# language and compiler can have better support for "pure" methods. The attribute PureAttribute already exists but it's used sporadically and there really isn't any language support for it. And even if C# did support it through compiler errors (pure can only call pure, etc.), it's very easily defeated by using a different language. But these methods have to announce themselves and enforce themselves as once compiled to IL all bets are basically off and none of the compilers can bend how an existing assembly executes.

whoisj commented 9 years ago

@HaloFour fair.

Assuming we don't have a general way to support "pure" or "const" references, then I suppose the proposed interface is the best alternative.

yanggujun commented 9 years ago

If you need it now, my Commons library (Commons.Collections https://github.com/yanggujun/commonsfornet/tree/master/src/Commons.Collections/Set ) has read-only set support. Admins, please delete this post if it is considered an advert... My suggestion is to look around for some open source implementation.

dsaf commented 9 years ago

@yanggujun Thank you for the suggestion, that seems like a nice library, but I will roll my own to avoid extra dependencies.

"My suggestion is to look around for some open source implementation."

This is a work-around; fundamental interfaces like IReadOnlySet should really be part of the .NET Framework itself.

ashmind commented 9 years ago

Does this need a speclet to become "ready for api review"?

GiottoVerducci commented 8 years ago

And while we're at it, consider naming it something other than "ReadOnly" (see this interesting post: http://stackoverflow.com/questions/15262981/why-does-listt-implement-ireadonlylistt-in-net-4-5). "Readable" seems fine.

dsaf commented 8 years ago

@GiottoVerducci No. I would prefer to keep a consistent naming pattern even if it is imperfect. You are free to raise a separate issue to rename existing interfaces though.

ashmind commented 8 years ago

Proposed API design:

public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    bool Contains(T item);
    bool IsSubsetOf(IEnumerable<T> other);
    bool IsSupersetOf(IEnumerable<T> other);
    bool IsProperSupersetOf(IEnumerable<T> other);
    bool IsProperSubsetOf(IEnumerable<T> other);
    bool Overlaps(IEnumerable<T> other);
    bool SetEquals(IEnumerable<T> other);
}

This is based on ISet<> API (except mutation methods obviously).

It's a pity Comparer does not fit in this, but then neither ISet<> nor IReadOnlyDictionary<> expose comparers, so it's too late to fix now.

binki commented 8 years ago
    bool Contains(T item);

Shouldn’t this be in IReadOnlyCollection<T> instead since ICollection<T> has Contains(T item)?

adamt06 commented 8 years ago

The immutable collections package has been unlisted from nuget while still beta. I think this is a pretty common use case and should be handled in standard libs.

drewnoakes commented 8 years ago

Is there more work to do on the API here, as the tag suggests? I'm happy to spend some time on this if it'd be helpful and someone can point out what's needed.

The API @ashmind proposed looks great.

Can ISet<T> be made to extend IReadOnlySet<T>? This didn't happen for IList<T>/IReadOnlyList<T>.

If not, then I suppose the other changes to consider are adding IReadOnlySet<T> to the interface list for all ISet<T> implementations in corefx including HashSet<T>, SortedSet<T> and their immutable counterparts in System.Collections.Immutable.

HaloFour commented 8 years ago

I have to agree with @GiottoVerducci . Using a name like IReadOnlySet<T> doesn't declare the contract's capabilities, it declares the contract's limitations. To then use that same contract combined with another that contradicts those limitations is confusing. I believe that the contract name should describe a positive assertion pertaining to what the implementer supports. A name like IReadableSet<T> isn't great, admittedly, but it at least better describes what the implementer does.

drewnoakes commented 8 years ago

@HaloFour I agree in principle, but we have the same situation now with IReadOnlyList<T>. Maintaining consistency trumps the increase in precision here, IMHO.

HaloFour commented 8 years ago

@drewnoakes

I understand, and consistency is important. I think that also answers why ISet<T> shouldn't extend IReadOnlySet<T>, though.

binki commented 8 years ago

"Maintaining consistency trumps the increase in precision here, IMHO."

"I think that also answers why ISet<T> shouldn't extend IReadOnlySet<T>, though."

I think you’re missing the point. That’s the reason that IList<T>, ICollection<T>, IDictionary<TKey, TValue> should, in addition to ISet<T>, also be fixed to implement read-only view interfaces. Otherwise everyone has to continue to be confused when working around the unintuitive design of the BCL.

HaloFour commented 8 years ago

@binki

I don't disagree. What I don't like about that is having a contract that stipulates read-only behavior being extended by a contract that stipulates read-write behavior. The naming is wrong and the composition is wrong. But here we are. I'd love to vote to change both, but I doubt such is on the table.

binki commented 8 years ago

@HaloFour

When you get an interface into something, it’s a view into something. The view itself is read-only. Assuming you wrote type-safe code and won’t go around downcasting, if you receive something that is read-only, it is, for all intents and purposes, read-only. That’s no guarantee that the data won’t change. It’s just like opening a file read-only. A file opened read-only can be mutated by another process. Or like read-only access to pages on a website where an administrator would have a read-write view into the data and can change it out from under you.

I’m not sure why read-only is considered the wrong term here. Read-only does not imply immutable. There’s a whole nuget package/different API (where adding/removing generates a new object and the current instance is guaranteed to never mutate—thus being immutable) for that if that’s what you require.
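
To make the view-versus-immutable distinction concrete, here is a minimal sketch (variable names are illustrative, not from the thread). It relies only on the fact that HashSet<T> implements IReadOnlyCollection<T>: the read-only view cannot mutate the set, but it does observe changes made through the owning reference.

```csharp
using System;
using System.Collections.Generic;

// The owner keeps the mutable reference.
var owner = new HashSet<int> { 1, 2, 3 };

// A consumer receives the same object through a read-only interface; no copy is made.
IReadOnlyCollection<int> view = owner;
int before = view.Count;

// The owner mutates the set; the view offers no Add, but it sees the change.
owner.Add(4);
int after = view.Count;

Console.WriteLine($"{before} -> {after}");
```

An immutable collection, by contrast, would have returned a new instance from Add and left the original untouched.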

drewnoakes commented 8 years ago

I was thinking something similar. "Read only" in .NET is a pretty weak guarantee for fields too. Given a do-over, I'm sure all this would make more sense. For now it's worth being pragmatic.

So if a method accepts an IReadOnlySomething<T> you can, in general, assume that it won't modify it. There's no guarantee the receiving method won't downcast the reference, and there's no guarantee that the interface's implementation won't internally modify itself when accessed either.

In C++, const_cast weakens the guarantees of const too, which is a shame (esp nowadays with the mutable modifier) but in practice it doesn't take away from how useful const is as a feature. You just have to know what you're dealing with.

@binki makes a good distinction. Immutable in the name implies a hard guarantee of stability over time for all involved.

Does anyone have an authoritative source as to why IList<T> doesn't extend IReadOnlyList<T>?

HaloFour commented 8 years ago

@binki

An interface isn't a view, it's a contract. That contract declares the capabilities of the implementer. If the implementer doesn't actually implement those capabilities I would consider that a contract violation. That List<T> class claims that it "is-a" IReadOnlyList<T>, but it's not. It lacks that capability.

There are multiple schools of thought on this subject. I clearly belong to the school where interface inheritance more strictly follows "is-a" relationships between types. I certainly support a more granular approach to composition with interfaces and think that List<T> and its kin could probably benefit from implementing some 3-4 additional interfaces (read, write, append, etc.) But I certainly think that the name of an interface should describe what a type can do, not what it can't do. Negative capability assertions don't make much sense for contracts.

@drewnoakes

"For now it's worth being pragmatic."

I agree. We are where we are. If IList<T> were to be changed to extend IReadOnlyList<T> then it makes sense for ISet<T> to be changed to extend IReadOnlySet<T>, etc.

Is it being too redundant to also push for IReadableX<T>, IWritableX<T>, etc. interfaces to live alongside IReadOnlyX<T>?

binki commented 8 years ago

"Does anyone have an authoritative source as to why IList<T> doesn't extend IReadOnlyList<T>?"

Apparently it would be an ABI-breaking change when loading assemblies that were compiled against older .NET Framework versions. When source code relies on implicit interface implementation, most compilers automatically generate the explicit interface implementations. So if you compiled your class implementing IList<T> against a BCL in which IList<T> doesn’t extend IReadOnlyList<T>, the compiler won’t have generated the IReadOnlyList<T> implementations. That is, if I’m reading this right: http://stackoverflow.com/a/35940240/429091

jnm2 commented 8 years ago

@HaloFour Since List<> and HashSet<> implement ICollection<> and IReadOnlyCollection<>, we've already embraced a path where IReadOnly refers to access and not capability. Based on that, having IAnything extend IReadOnlyAnything makes perfect sense. I agree that IReadable is better than IReadOnly but at this point everyone understands IReadOnly to mean IReadable and uses it as such. In fact, I'm perpetuating that intentionally in my own codebase because having two ways of thinking about things is more cognitive load than anything in my opinion.

We're stuck with the name, but the concept behind it is powerful enough that I truly wish it was possible for all interfaces to extend IReadOnly going forward, just like we do with concrete classes.

jnm2 commented 8 years ago

@ashmind I think it's perfect that none of the methods take comparers. In sets and dictionaries, comparers aren't something you can easily swap out because they determine the structure of the entire object. Also, it wouldn't make sense to pass a comparer to a CaseInsensitiveStringCollection or any collection that implies a certain comparison.

(In the case of a weird collection that does implement Contains(T, IEqualityComparer<T>) more efficiently than the extension method that's already available, it would probably be a one-off class method. It's hard to imagine Contains(T, IEqualityComparer<T>) being common enough to end up in a specialized interface, but there's nothing stopping even that from happening.)

ashmind commented 8 years ago

@jnm2

"I think it's perfect that none of the methods take comparers."

Just to clarify, I wanted to say that it should expose comparer, not take one. Since every Set or Dictionary must have some equality algorithm, this could have been exposed on the interface. But I don't remember my use case for that now -- something like creating a set using the same comparer as in an externally provided one.

aaron-meyers commented 7 years ago

While this discussion brings up lots of interesting points, it seems to be straying far from the simple and obvious suggestion that started this thread. And that is discouraging because I would really like to see this issue addressed.

As the OP said, the failure to maintain parity among collection types when IReadOnlyList was added without IReadOnlySet is unfortunate and many people have implemented their own versions of IReadOnlySet interface as workarounds (my own team has a similar workaround). Those workaround interfaces are not ideal because the corefx classes can't implement them. This is the key reason for providing this in the framework: if I have a HashSet I would like to be able to use it as an IReadOnlySet without copying or wrapping the object I already have. For performance at least this is often desirable.

The name of the interface should clearly be IReadOnlySet. Consistency trumps any concerns with the IReadOnlyXXX names. That ship has sailed.

None of the existing interfaces (IReadOnlyCollection) can be changed. The back-compat requirements for .NET don't allow changes like that. It is unfortunate that Comparers aren't exposed in the existing IReadOnlyXXX interfaces (I've run into this as well) but again the ship has sailed.

The only question that seems to remain from a practical standpoint is between these two potential definitions of the interface.

Previously proposed by @ashmind :

public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    bool Contains(T item);
    bool IsSubsetOf(IEnumerable<T> other);
    bool IsSupersetOf(IEnumerable<T> other);
    bool IsProperSupersetOf(IEnumerable<T> other);
    bool IsProperSubsetOf(IEnumerable<T> other);
    bool Overlaps(IEnumerable<T> other);
    bool SetEquals(IEnumerable<T> other);
}

Minimal proposal:

public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    bool Contains(T item);
}

Personally I prefer this minimal proposal since the other methods can be derived; ideally there would be a standard implementation of those as extension methods over the IReadOnlySet interface so the implementors of IReadOnlySet don't need to provide them. I also feel this minimal proposal is more in line with the other minimal IReadOnlyXXX interfaces.
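
As a rough sketch of how some set operations could be derived from Contains alone: ICollection<T> is used here purely as a stand-in for the minimal interface (it already exposes Contains), and the helper names IsSupersetOfVia and OverlapsVia are hypothetical, not part of any proposal in this thread. The example applies them to a dictionary's key collection.

```csharp
using System;
using System.Collections.Generic;

// Stand-in: ICollection<T> already exposes Contains, which is all these
// derived operations need from the receiving set.
bool IsSupersetOfVia<T>(ICollection<T> set, IEnumerable<T> other)
{
    // The set is a superset if every element of other is contained in it.
    foreach (T item in other)
    {
        if (!set.Contains(item)) return false;
    }
    return true;
}

bool OverlapsVia<T>(ICollection<T> set, IEnumerable<T> other)
{
    // The sets overlap if at least one element of other is contained in the set.
    foreach (T item in other)
    {
        if (set.Contains(item)) return true;
    }
    return false;
}

var ages = new Dictionary<string, int> { ["ann"] = 31, ["bob"] = 27 };

bool superset = IsSupersetOfVia(ages.Keys, new[] { "ann" });
bool overlaps = OverlapsVia(ages.Keys, new[] { "zed", "bob" });
bool disjoint = !OverlapsVia(ages.Keys, new[] { "zed" });

Console.WriteLine($"{superset} {overlaps} {disjoint}");
```

Operations in the other direction, such as IsSubsetOf, additionally have to count distinct matches while enumerating the other collection, so they are less trivial to derive this way.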

jnm2 commented 7 years ago

@aaron-meyers The only concern I would have is that IsSubsetOf and friends cannot be derived from Contains in an efficient manner. When it's two hash tables, for instance, relying on Contains forces the implementation to use nested loops rather than hash matching.

drewnoakes commented 7 years ago

Perhaps a new interface, IComparableSet<T> could contain the set operations.

We already have extension methods on IEnumerable<T> for a few set operations.

aaron-meyers commented 7 years ago

@jnm2 The implementation of these methods used by HashSet only requires Contains and enumerating the other collection (which IReadOnlySet would get by inheriting IReadOnlyCollection). It does require though knowing that the other set uses the same comparer. Perhaps it is worth adding the Comparer property to IReadOnlySet so that these operations can be implemented efficiently in the extension methods. It is unfortunate that IReadOnlyDictionary doesn't expose the KeyComparer as well but it may be worth adding this to IReadOnlySet even though it isn't entirely consistent. There are good reasons that it should have been included on IReadOnlyDictionary in the first place, as covered here.

The modified proposal would be:

public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    IEqualityComparer<T> Comparer { get; }
    bool Contains(T item);
}
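
A small sketch of why the comparer matters for these operations, using the existing HashSet<T> (variable names are illustrative): the result of IsSubsetOf depends on the comparer of the set the method is called on, so two sets with different comparers can disagree about the same relationship.

```csharp
using System;
using System.Collections.Generic;

var ignoreCase = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a", "b" };
var ordinal = new HashSet<string> { "A", "B" };

// Uses ignoreCase's comparer: "A" matches "a" and "B" matches "b".
bool viaIgnoreCase = ignoreCase.IsSubsetOf(ordinal);

// Uses ordinal's default comparer: "a" and "b" are not found among "A", "B".
bool viaOrdinal = ordinal.IsSubsetOf(ignoreCase);

Console.WriteLine($"{viaIgnoreCase} {viaOrdinal}");
```

A generic extension method that only sees Contains has no way to detect this mismatch unless the comparer is exposed somewhere.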
aaron-meyers commented 7 years ago

Alternately, the Comparer could be on a separate interface and the extension method implementations of the set operations would only use the efficient route if both objects implement the interface and have the same comparers. The same approach could be applied for IReadOnlyDictionary (in fact, perhaps they just use the same interface). Something like ISetComparable. Or drawing from @drewnoakes there could be an IComparableSet but instead of defining the set operators, it just defines the comparer:

public interface IComparableSet<T> : IReadOnlySet<T> {    
    IEqualityComparer<T> Comparer { get; }
}

In this case IReadOnlySet goes back to just defining Contains:

public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    bool Contains(T item);
}
dmitriyse commented 7 years ago
public interface IReadOnlySet<T> : IReadOnlyCollection<T> {    
    bool Contains(T item);
}
public interface IReadOnlySetEx<T> : IReadOnlySet<T> {    
    bool IsSubsetOf(IEnumerable<T> other);
    bool IsSupersetOf(IEnumerable<T> other);
    bool IsProperSupersetOf(IEnumerable<T> other);
    bool IsProperSubsetOf(IEnumerable<T> other);
    bool Overlaps(IEnumerable<T> other);
    bool SetEquals(IEnumerable<T> other);
    IEqualityComparer<T> Comparer { get; }
}
public class HashSet<T>: IReadOnlySetEx<T>, ISet<T>
{
   // Improved implementation here.
}

public static class CollectionExtensions
{
    public static IEqualityComparer<T> GetComparer<T>(this IReadOnlySet<T> @set)
    {
        var setEx = @set as IReadOnlySetEx<T>;
        if (setEx == null)
        {
            throw new ArgumentException("set should implement IReadOnlySetEx<T> for this method.");
        }
        return setEx.Comparer;
    }

    public static bool IsSubsetOf<T>(this IReadOnlySet<T> @set, IEnumerable<T> other)
    {
        var setEx = @set as IReadOnlySetEx<T>;
        if (setEx != null)
        {
            return setEx.IsSubsetOf(other);
        }
        // Non-optimal fallback (one possible sketch; assumes default equality for T):
        // copy other into a HashSet, then check every element of the set against it.
        var otherSet = new HashSet<T>(other);
        foreach (var item in @set)
        {
            if (!otherSet.Contains(item))
            {
                return false;
            }
        }
        return true;
    }

    // The same approach for dictionary (IReadOnlyDictionaryEx<TKey, TValue> is hypothetical).
    public static IEqualityComparer<TKey> GetKeyComparer<TKey, TValue>(this IReadOnlyDictionary<TKey, TValue> dictionary)
    {
        var dictionaryEx = dictionary as IReadOnlyDictionaryEx<TKey, TValue>;
        if (dictionaryEx == null)
        {
            throw new ArgumentException("dictionary should implement IReadOnlyDictionaryEx<TKey, TValue> for this method.");
        }
        return dictionaryEx.KeyComparer;
    }
}

We can use this approach. Usage would look like this:

IReadOnlySet<string> mySet = new HashSet<string>();
bool test = mySet.IsSubsetOf(new []{"some", "strings", "set"}); // Extension method
var comparer = mySet.GetComparer(); // Extension method

Many requirements are satisfied and IReadOnlySet stays minimal. GetComparer is now a method rather than a property, but that is a good trade-off.

dmitriyse commented 7 years ago
    /// <summary>
    /// Readable set abstraction. Provides a fast Contains method and also signals that the collection's items are unique by some criterion.
    /// </summary>
    /// <remarks>
    /// Proposal for this abstraction is discussed here https://github.com/dotnet/corefx/issues/1973.
    /// </remarks>
    /// <typeparam name="T">The type of elements in the set.</typeparam>
    public interface IReadOnlySet<out T> : IReadOnlyCollection<T>
    {
        /// <summary>
        /// Determines whether a <see cref="T:System.Collections.Generic.HashSet`1"/> object contains the specified
        /// element.
        /// </summary>
        /// <typeparam name="TItem">The type of the provided item. This trick preserves covariance of T and avoids boxing.</typeparam>
        /// <returns>
        /// true if the <see cref="T:System.Collections.Generic.HashSet`1"/> object contains the specified element;
        /// otherwise, false.
        /// </returns>
        /// <param name="item">The element to locate in the <see cref="T:System.Collections.Generic.HashSet`1"/> object.</param>
        bool Contains<TItem>(TItem item);
    }

namespace System.Collections.Generic
{
    /// <summary>
    /// Provides the base interface for the abstraction of sets. <br/>
    /// This is a full-featured read-only interface, but without covariance. See the covariant version
    /// <see cref="IReadOnlySet{T}"/>.
    /// </summary>
    /// <typeparam name="T">The type of elements in the set.</typeparam>
    public interface IReadableSet<T> : IReadOnlySet<T>
    {
        /// <summary>
        /// Gets the <see cref="Generic.IEqualityComparer{T}"/> object that is used to determine equality for the values
        /// in the set.
        /// </summary>
        /// <returns>
        /// The <see cref="Generic.IEqualityComparer{T}"/> object that is used to determine equality for the values in the
        /// set.
        /// </returns>
        IEqualityComparer<T> Comparer { get; }

        /// <summary>
        /// Determines whether a <see cref="T:System.Collections.Generic.HashSet`1"/> object contains the specified
        /// element.
        /// </summary>
        /// <returns>
        /// true if the <see cref="T:System.Collections.Generic.HashSet`1"/> object contains the specified element;
        /// otherwise, false.
        /// </returns>
        /// <param name="item">The element to locate in the <see cref="T:System.Collections.Generic.HashSet`1"/> object.</param>
        bool Contains(T item);

        /// <summary>
        /// Determines whether the current set is a proper (strict) subset of a specified collection.
        /// </summary>
        /// <returns><see langword="true"/> if the current set is a proper subset of <paramref name="other"/>; otherwise, false.</returns>
        /// <param name="other">The collection to compare to the current set.</param>
        /// <exception cref="ArgumentNullException">
        /// <paramref name="other"/> is null.
        /// </exception>
        bool IsProperSubsetOf(IEnumerable<T> other);

        /// <summary>Determines whether the current set is a proper (strict) superset of a specified collection.</summary>
        /// <returns>true if the current set is a proper superset of <paramref name="other"/>; otherwise, false.</returns>
        /// <param name="other">The collection to compare to the current set. </param>
        /// <exception cref="ArgumentNullException">
        /// <paramref name="other"/> is null.
        /// </exception>
        bool IsProperSupersetOf(IEnumerable<T> other);

        /// <summary>Determines whether a set is a subset of a specified collection.</summary>
        /// <returns>true if the current set is a subset of <paramref name="other"/>; otherwise, false.</returns>
        /// <param name="other">The collection to compare to the current set.</param>
        /// <exception cref="ArgumentNullException">
        /// <paramref name="other"/> is null.
        /// </exception>
        bool IsSubsetOf(IEnumerable<T> other);

        /// <summary>Determines whether the current set is a superset of a specified collection.</summary>
        /// <returns>true if the current set is a superset of <paramref name="other"/>; otherwise, false.</returns>
        /// <param name="other">The collection to compare to the current set.</param>
        /// <exception cref="ArgumentNullException">
        /// <paramref name="other"/> is null.
        /// </exception>
        bool IsSupersetOf(IEnumerable<T> other);

        /// <summary>Determines whether the current set overlaps with the specified collection.</summary>
        /// <returns>true if the current set and <paramref name="other"/> share at least one common element; otherwise, false.</returns>
        /// <param name="other">The collection to compare to the current set.</param>
        /// <exception cref="ArgumentNullException">
        /// <paramref name="other"/> is null.
        /// </exception>
        bool Overlaps(IEnumerable<T> other);

        /// <summary>Determines whether the current set and the specified collection contain the same elements.</summary>
        /// <returns>true if the current set is equal to <paramref name="other"/>; otherwise, false.</returns>
        /// <param name="other">The collection to compare to the current set.</param>
        /// <exception cref="ArgumentNullException">
        /// <paramref name="other"/> is null.
        /// </exception>
        bool SetEquals(IEnumerable<T> other);
    }
}

I just published these contracts, with injection helpers, to NuGet (https://www.nuget.org/packages/ClrCoder.Collections.ReadOnlySet). Feel free to use the package; issues can be submitted at https://github.com/dmitriyse/ClrCoder/issues and, if it becomes reasonably popular (probably after a few rounds of refinement), we can suggest this improvement to the CoreFX team.

safern commented 7 years ago

@terrajobst is it okay for compat for existing classes to implement new interfaces?

scalablecory commented 7 years ago

@safern there is precedent in List<T> getting IReadOnly added.

IvanPizhenko commented 7 years ago

So is this planned for one of the next .NET Framework releases?

mishra14 commented 7 years ago

When is this work going to land? Any timelines?

petriashev commented 6 years ago

https://www.nuget.org/packages/System.Collections.Immutable/ Full set of immutable collections )

pgolebiowski commented 6 years ago

OK. This has been here for far too long. I need it very much and would like to propose a way for how to approach this then.

Instead of exposing all of these:

IEqualityComparer<T> Comparer { get; }
bool IsProperSubsetOf(IEnumerable<T> other);
bool IsProperSupersetOf(IEnumerable<T> other);
bool IsSubsetOf(IEnumerable<T> other);
bool IsSupersetOf(IEnumerable<T> other);
bool Overlaps(IEnumerable<T> other);
bool SetEquals(IEnumerable<T> other);

and forcing the customers to implement these members, let's just add what is most crucial to have:

public interface IReadOnlySet<out T> : IReadOnlyCollection<T>
{
    bool Contains<T>(T item);
}

and nothing more. More API can always be added in the future, if there is such a need. But it cannot be removed, thus this proposal.

We would then add this interface to the list of interfaces extended by ISet<T>.

From the documentation

ISet<T>: This interface provides methods for implementing sets, which are collections that have unique elements and specific operations. The HashSet and SortedSet collections implement this interface.

From the code

The ISet<T> interface already has our bool Contains<T>(T item); method defined via ICollection<T>. It also has int Count { get; } via ICollection<T>.

So it would be:

public interface ISet<T> : ICollection<T>, IEnumerable<T>, IEnumerable, IReadOnlySet<T>

Trivial change, little to discuss, huge benefit.

Please let's make it happen. Let me know if such a pull request would be accepted and merged. I can create it then.

@karelz @terrajobst @safern @ianhays

TylerBrinkley commented 6 years ago

I found this discussion when working on a real world problem where I wanted to use the key collection of a dictionary for read only set operations. In order to support that case here's the API I propose.

Rationale

Proposed API

 namespace System.Collections.Generic {
+    public interface IReadOnlySet<out T> : IReadOnlyCollection<T>, IEnumerable, IEnumerable<T> {
+        bool Contains(T value);
+        bool IsProperSubsetOf(IEnumerable<T> other);
+        bool IsProperSupersetOf(IEnumerable<T> other);
+        bool IsSubsetOf(IEnumerable<T> other);
+        bool IsSupersetOf(IEnumerable<T> other);
+        bool Overlaps(IEnumerable<T> other);
+        bool SetEquals(IEnumerable<T> other);
+    }
-    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class HashSet<T> : ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
-    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T> {
+    public class SortedSet<T> : ICollection, ICollection<T>, IDeserializationCallback, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, ISerializable, ISet<T>, IReadOnlySet<T> {
     }
+    public class ReadOnlySet<T> : ICollection<T>, IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, IReadOnlySet<T>, ISet<T> {
+        public int Count { get; }
+        public ReadOnlySet(ISet<T> set);
+        public bool Contains(T value);
+        public bool IsProperSubsetOf(IEnumerable<T> other);
+        public bool IsProperSupersetOf(IEnumerable<T> other);
+        public bool IsSubsetOf(IEnumerable<T> other);
+        public bool IsSupersetOf(IEnumerable<T> other);
+        public bool Overlaps(IEnumerable<T> other);
+        public bool SetEquals(IEnumerable<T> other);
+    }
     public class Dictionary<TKey, TValue> {
-        public sealed class KeyCollection : ICollection, ICollection<TKey>, IEnumerable, IEnumerable<TKey>, IReadOnlyCollection<TKey> {
+        public sealed class KeyCollection : ICollection, ICollection<TKey>, IEnumerable, IEnumerable<TKey>, IReadOnlyCollection<TKey>, IReadOnlySet<TKey> {
+            public bool IsProperSubsetOf(IEnumerable<TKey> other);
+            public bool IsProperSupersetOf(IEnumerable<TKey> other);
+            public bool IsSubsetOf(IEnumerable<TKey> other);
+            public bool IsSupersetOf(IEnumerable<TKey> other);
+            public bool Overlaps(IEnumerable<TKey> other);
+            public bool SetEquals(IEnumerable<TKey> other);
         }
     }
 }
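For the motivating case (read-only set operations over dictionary keys), a rough sketch of what can be done today with a wrapper follows. `KeySetView<TKey, TValue>` is a hypothetical helper written for this illustration. It implements only a few of the proposed members and assumes the sequences passed in contain no null keys.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: exposes a dictionary's keys with a few set-style queries.
// Membership goes through ContainsKey, so the dictionary's own comparer is used.
public sealed class KeySetView<TKey, TValue> : IReadOnlyCollection<TKey>
{
    private readonly Dictionary<TKey, TValue> _dictionary;

    public KeySetView(Dictionary<TKey, TValue> dictionary)
    {
        _dictionary = dictionary ?? throw new ArgumentNullException(nameof(dictionary));
    }

    public int Count => _dictionary.Count;

    public bool Contains(TKey key) => _dictionary.ContainsKey(key);

    // True when every element of 'other' is a key of the dictionary.
    public bool IsSupersetOf(IEnumerable<TKey> other) => other.All(_dictionary.ContainsKey);

    // True when at least one element of 'other' is a key of the dictionary.
    public bool Overlaps(IEnumerable<TKey> other) => other.Any(_dictionary.ContainsKey);

    public IEnumerator<TKey> GetEnumerator() => _dictionary.Keys.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

The wrapper is a view, not a copy: later changes to the dictionary are visible through it.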

Open Questions

Updates

safern commented 6 years ago

@TylerBrinkley thanks for adding an API Proposal in the thread. Would you mind adding rationale and use cases, so that I can mark it ready for review and let the API experts decide?

dmitriyse commented 6 years ago

@TylerBrinkley do not forget to include an EqualityComparer property in the IReadOnlySet interface. They are currently missing from dictionaries and sets in CoreFX, which is an issue.

TylerBrinkley commented 6 years ago

@dmitriyse what use would a getter-only IEqualityComparer<T> property be? What would you do with it?

dmitriyse commented 6 years ago

Dictionaries and Sets should report their EqualityComparers to allow correct collection cloning. Unfortunately I forgot where this issue was discussed.

TylerBrinkley commented 6 years ago

If you're doing cloning wouldn't you be working with a concrete type? Why would the interface need to support the IEqualityComparer<T>?

dmitriyse commented 6 years ago

For example, if you have a set of strings with a case-insensitive equality comparer, you will receive an exception when creating a new HashSet without specifying the correct EqualityComparer. There are cases where you can't know which EqualityComparer the given set uses.
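A small sketch of the comparer problem, using only `HashSet<T>` and `StringComparer` from the framework. `ComparerLossDemo` is a hypothetical name. In this particular sketch the effect is a silent change in membership results rather than an exception.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerLossDemo
{
    // Returns membership results for "ALPHA" in the original set, in a copy made
    // without the comparer, and in a copy made with the original comparer.
    public static (bool Original, bool NaiveCopy, bool FaithfulCopy) Run()
    {
        var source = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Alpha" };

        // Code that only sees the elements cannot tell which comparer produced them.
        IEnumerable<string> elements = source;

        // Copy with the default (case-sensitive) comparer.
        var naiveCopy = new HashSet<string>(elements);

        // Copy that carries the comparer across; only possible because the
        // concrete HashSet<T> exposes its Comparer property.
        var faithfulCopy = new HashSet<string>(elements, source.Comparer);

        return (source.Contains("ALPHA"),
                naiveCopy.Contains("ALPHA"),
                faithfulCopy.Contains("ALPHA"));
    }
}
```

`Run()` yields `(true, false, true)`: the naive copy no longer treats "ALPHA" and "Alpha" as the same element.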

aaron-meyers commented 6 years ago

It's not just cloning. I think the far more common scenario is comparing two sets -- I need to know that they both use the same comparer in order to implement an optimal comparison using Contains. I think there's an example in this thread.

That said, I'd rather have IReadOnlySet with just the Contains method than not at all. It would be nice to be able to implement Set comparison generically but not as common as just needing a read-only reference to a Set.
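As a sketch of that comparison scenario: when both sets are known to use the same comparer, equality can be decided with a count check plus Contains lookups. `SetComparison.FastSetEquals` is a hypothetical helper. It takes concrete `HashSet<T>` arguments precisely because that is where the comparer is visible.

```csharp
using System;
using System.Collections.Generic;

public static class SetComparison
{
    public static bool FastSetEquals<T>(HashSet<T> left, HashSet<T> right)
    {
        if (left == null) throw new ArgumentNullException(nameof(left));
        if (right == null) throw new ArgumentNullException(nameof(right));

        // Different comparers: the shortcut is not valid, so defer to the
        // built-in SetEquals, which applies left's notion of equality.
        if (!object.Equals(left.Comparer, right.Comparer))
        {
            return left.SetEquals(right);
        }

        // Same comparer: equal counts plus "every left item is in right"
        // implies the two sets hold the same elements.
        if (left.Count != right.Count)
        {
            return false;
        }

        foreach (T item in left)
        {
            if (!right.Contains(item))
            {
                return false;
            }
        }

        return true;
    }
}
```

Without access to the comparer, a generic implementation written against an interface would have to take the slower, conservative path every time.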

jnm2 commented 6 years ago

I agree—the only way you can tell what types of duplicates you might find in the set (case sensitive, case insensitive, etc) is by exposing the comparer.

TylerBrinkley commented 6 years ago

I'm beginning to think that my proposal would not be accepted as adding all those methods to the Dictionary<TKey, TValue>.KeyCollection would have a significant code size cost. See this discussion regarding adding new API to commonly instantiated generic types.