Open jaredpar opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-collections See info in area-owners.md if you want to be subscribed.
| Author: | jaredpar |
|---|---|
| Assignees: | - |
| Labels: | `area-System.Collections` |
| Milestone: | - |
Certainly seems like a bug. The semantics of this comparer are suspect to begin with (it's always using `EqualityComparer<T>.Default`), but if the goal is to determine whether a set created from one using the default comparer is equal to a set created from the other using the default comparer, then it makes sense that this path doesn't just check the count of each collection; it's feasible you could have two collections that produce the same set using the default comparer but that have different counts, e.g.
using System.Diagnostics.CodeAnalysis;

// Uses the default comparer, i.e. Person.Equals (first name only).
var set1 = new HashSet<Person>()
{
    new Person("Some", "One"),
};

// Uses a custom comparer that distinguishes both names.
var set2 = new HashSet<Person>(new BothNamesComparer())
{
    new Person("Some", "One"),
    new Person("Some", "Other"),
};

Console.WriteLine(set1.Count);                      // 1
Console.WriteLine(set2.Count);                      // 2
// Copying into a default-comparer set collapses people sharing a first name.
Console.WriteLine(new HashSet<Person>(set1).Count); // 1
Console.WriteLine(new HashSet<Person>(set2).Count); // 1

class Person
{
    public string FirstName, LastName;

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Default equality looks at the first name only.
    public override bool Equals(object? obj) => obj is Person p && p.FirstName == FirstName;
    public override int GetHashCode() => FirstName.GetHashCode();
}

class BothNamesComparer : IEqualityComparer<Person>
{
    public bool Equals(Person? x, Person? y) => x?.FirstName == y?.FirstName && x?.LastName == y?.LastName;
    public int GetHashCode([DisallowNull] Person obj) => HashCode.Combine(obj.FirstName, obj.LastName);
}
But the way it compensates for that is, as you say, asymmetrical. It's iterating through one set validating that each item exists (under the default comparer) in the other set, but it's not doing the inverse, which means the second set could contain additional items not in the first set.
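To make the asymmetry concrete, here is a minimal sketch of that kind of one-directional check. It is not the actual `HashSetEqualityComparer<T>` source: `OneWayEquals` and `TwoWayEquals` are hypothetical helpers written for illustration, and strings stand in for `Person` to keep the snippet short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not the framework code): every item of `second` must
// have a match in `first` under the default comparer. Nothing checks the
// other direction, so `first` may hold extra items.
static bool OneWayEquals<T>(HashSet<T> first, HashSet<T> second)
{
    EqualityComparer<T> comparer = EqualityComparer<T>.Default;
    return second.All(s => first.Any(f => comparer.Equals(f, s)));
}

// Hypothetical symmetric variant: run the same check in both directions.
static bool TwoWayEquals<T>(HashSet<T> first, HashSet<T> second) =>
    OneWayEquals(first, second) && OneWayEquals(second, first);

// Two sets with different comparers and different counts.
var small = new HashSet<string> { "a" };
var large = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a", "b" };

bool largeThenSmall = OneWayEquals(large, small); // true: "a" is found in large
bool smallThenLarge = OneWayEquals(small, large); // false: "b" is missing from small
bool symmetric = TwoWayEquals(large, small);      // false in either argument order

Console.WriteLine($"{largeThenSmall} {smallThenLarge} {symmetric}");
```

Swapping the arguments changes the answer of the one-way check, which is the behavior being reported. Checking both directions is one possible way to restore symmetry, though it would change results for existing callers.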
Related to #18931. There were attempts made in the past to improve the implementation but the risk of breaking folks depending on the current behavior probably outweighs any potential benefits.
To be clear: I don't have a pressing need for this to be fixed. It came up as part of a code review, the behavior just seemed off to me, and I wanted to make sure it was tracked.
My 2 cents: if the risk of breaking change is high then I'd agree with the sentiment around obsoleting the behavior.
Description
The implementation of `HashSetEqualityComparer<T>.Equals` does not consider the `Count` of collections when they have different comparers. That means it's possible for two sets to compare equal when they have a different count of items. That leads to the implementation not being symmetric.
Reproduction Steps
Sharplab demonstration
Expected behavior
The expected behavior is to print
Actual behavior
This code prints:
Regression?
No
Known Workarounds
No response
Configuration
net6.0
Other information
No response