Closed Mr0N closed 3 days ago
Tagging subscribers to this area: @dotnet/area-system-linq See info in area-owners.md if you want to be subscribed.
nit: `Except` (along with `Union` and `Intersect`) are terms that come from set math, so any method wouldn't/shouldn't be named this way.
For the rare cases where something like this is needed, most people can probably use the following extension method:
public static IEnumerable<TSource> WhereNotIn<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second)
{
var set = new HashSet<TSource>(second);
return first.Where(e => !set.Contains(e));
}
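For comparison, here is a minimal, self-contained sketch of how that extension method behaves next to the built-in `Except`. The class name `WhereNotInDemo` and the sample data are assumptions made for illustration; only `WhereNotIn` itself comes from the comment above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereNotInDemo
{
    // Same logic as the snippet above: filter `first`, keeping its duplicates.
    public static IEnumerable<TSource> WhereNotIn<TSource>(
        this IEnumerable<TSource> first, IEnumerable<TSource> second)
    {
        var set = new HashSet<TSource>(second);
        return first.Where(e => !set.Contains(e));
    }

    public static void Main()
    {
        int[] first = { 2, 3, 4, 5, 2, 5 };
        int[] second = { 3 };

        // Except has set semantics: the result is de-duplicated.
        Console.WriteLine(string.Join(",", first.Except(second)));     // 2,4,5

        // WhereNotIn is a plain filter: repeated elements of `first` survive.
        Console.WriteLine(string.Join(",", first.WhereNotIn(second))); // 2,4,5,2,5
    }
}
```

Note that `Where` is lazy but the `HashSet` is built eagerly when `WhereNotIn` is called, so `second` is enumerated exactly once.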
Duplicate of #107338, https://github.com/dotnet/runtime/issues/75066, and https://github.com/dotnet/runtime/issues/102112.
@eiriktsarpalis - This isn't complaining about the existing behavior, so not a strict duplicate?
I’m not suggesting changing the logic of the existing method, but rather adding an overload where you can choose whether to remove duplicates. For example, it could be implemented like this:
public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, bool removeDuplicate = true)
Adding boolean flags that fundamentally change the semantics of a function isn't considered good practice, and is definitely not something we've done before in LINQ. For better or for worse `Intersect` and `Except` reflect the semantics of SQL, and it would be more appropriate to consider separate method groups for that functionality. @Clockwork-Muse's use of `WhereIn` and `WhereNotIn` seems more appropriate in that regard, but I think these would need to be filed as a distinct proposal.
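As a rough sketch of what a separate `WhereIn` method could look like (the counterpart to `Intersect` that keeps duplicates), assuming a hypothetical signature that takes an explicit `IEqualityComparer<TSource>`. Nothing here is an existing or approved LINQ API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereInSketch
{
    // Hypothetical: keeps every element of `first` that also occurs in `second`,
    // without de-duplicating the result.
    public static IEnumerable<TSource> WhereIn<TSource>(
        this IEnumerable<TSource> first,
        IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer)
    {
        var set = new HashSet<TSource>(second, comparer);
        return first.Where(e => set.Contains(e));
    }

    public static void Main()
    {
        string[] first = { "a", "B", "a", "c" };
        string[] second = { "A", "b" };

        var kept = first.WhereIn(second, StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(string.Join(",", kept)); // a,B,a
    }
}
```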
> Adding boolean flags that fundamentally change the semantics of a function isn't considered good practice, and is definitely not something we've done before in LINQ. For better or for worse `Intersect` and `Except` reflect the semantics of SQL, and it would be more appropriate to consider separate method groups for that functionality. @Clockwork-Muse's use of `WhereIn` and `WhereNotIn` seems more appropriate in that regard, but I think these would need to be filed as a distinct proposal.
Well, yes, the SQL method is more understandable, but for C#, it's not immediately clear how to use this method. In SQL, you can use a LEFT JOIN to quickly find the difference between two tables, but in C#, it's quite difficult to achieve, especially in terms of performance.
It would be great if there were some method that could work between two collections and leverage all possible optimizations to speed up the process, similar to what SQL offers.
Without losing extra data
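For reference, the SQL "LEFT JOIN ... WHERE right side IS NULL" pattern can be approximated in LINQ to Objects with `GroupJoin`. This is a minimal sketch, not a tuned implementation; the class and method names (`AntiJoinSketch`, `LeftAntiJoin`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AntiJoinSketch
{
    // Hypothetical helper: returns the elements of `first` that have no match
    // in `second`, keeping duplicates and the original order of `first`.
    public static IEnumerable<T> LeftAntiJoin<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        return first
            .GroupJoin(second, a => a, b => b,
                       (a, matches) => new { Item = a, HasMatch = matches.Any() })
            .Where(x => !x.HasMatch)   // the "IS NULL" half of the SQL pattern
            .Select(x => x.Item);
    }

    public static void Main()
    {
        int[] first = { 2, 3, 4, 5, 2, 5 };
        int[] second = { 3, 4 };

        Console.WriteLine(string.Join(",", LeftAntiJoin(first, second))); // 2,5,2,5
    }
}
```

`GroupJoin` builds a hash-based lookup over `second` internally, so the cost is comparable to the `HashSet` approach shown earlier in the thread.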
> but in C#, it's quite difficult to achieve, especially in terms of performance

> It would be great if there were some method that could work between two collections and leverage all possible optimizations to speed up the process, similar to what SQL offers.
This is mostly because RDBMSs put a lot more work into their dynamic optimizers than would be reasonable for C#. If you're dealing with a large enough dataset to actually affect program performance, stick it into an actual database (especially because chances are you're going to want to do more things with it).
> In SQL, you can use a LEFT JOIN to quickly find the difference between two tables
... I really wish (the iSeries version of) DB2's `EXCEPTION JOIN` would be added to the standard....
It would be good to implement such an algorithm in C#, and then use it to separate one collection from another. It seems like a complex algorithm at first glance.
https://en.wikipedia.org/wiki/Sort-merge_join
The Merge Join is an algorithm used for efficiently executing a JOIN operation, especially when both tables are already sorted by the column(s) involved in the join. Its key advantage lies in its linear traversal of rows, making it highly performant for large, sorted datasets.
Table A:

ID | Name
-- | --
1 | John
2 | Alice
3 | Bob

This behavior is crucial for accurate join results but may increase the size of the output significantly.
`=`). For conditions like `>`, `<`, or `!=`, other join algorithms are needed.
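To make the merge idea concrete, here is a simplified sketch of a merge-style difference in C#. It is not a full sort-merge join: it assumes both inputs are already sorted ascending under `Comparer<T>.Default`, and it only answers "which elements of the first sequence have no equal element in the second". The name `SortedWhereNotIn` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedMergeSketch
{
    // Assumes both inputs are sorted ascending by Comparer<T>.Default.
    // Walks both sequences once, with no hash set or extra buffering.
    public static IEnumerable<T> SortedWhereNotIn<T>(
        IEnumerable<T> sortedFirst, IEnumerable<T> sortedSecond)
    {
        var comparer = Comparer<T>.Default;
        using var right = sortedSecond.GetEnumerator();
        bool hasRight = right.MoveNext();

        foreach (var item in sortedFirst)
        {
            // Advance the right side until it is no longer behind `item`.
            while (hasRight && comparer.Compare(right.Current, item) < 0)
                hasRight = right.MoveNext();

            // Emit `item` only when the right side has no equal element.
            if (!hasRight || comparer.Compare(right.Current, item) != 0)
                yield return item;
        }
    }

    public static void Main()
    {
        int[] first = { 1, 1, 2, 3, 3, 5 };
        int[] second = { 2, 4, 5 };

        Console.WriteLine(string.Join(",", SortedWhereNotIn(first, second))); // 1,1,3,3
    }
}
```

If the inputs are not already sorted, the sorting cost usually outweighs the benefit, and the hash-based filter is the simpler choice.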
Background and motivation
I propose adding an overload to the `Except` and `ExceptBy` methods that removes the elements of the second sequence from the first without also removing duplicates from the result.
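The key-based case can be sketched the same way as the filter-based extension method suggested in the comments. The method name `WhereNotInBy` is hypothetical; its parameter order mirrors `ExceptBy` (source, keys to exclude, key selector), and the sample data is invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereNotInBySketch
{
    // Hypothetical: drops elements whose key appears in `second`,
    // but keeps repeated elements of `first`.
    public static IEnumerable<TSource> WhereNotInBy<TSource, TKey>(
        this IEnumerable<TSource> first,
        IEnumerable<TKey> second,
        Func<TSource, TKey> keySelector)
    {
        var keys = new HashSet<TKey>(second);
        return first.Where(e => !keys.Contains(keySelector(e)));
    }

    public static void Main()
    {
        var people = new[]
        {
            (Id: 1, Name: "John"),
            (Id: 2, Name: "Alice"),
            (Id: 1, Name: "John"),
        };

        var kept = people.WhereNotInBy(new[] { 2 }, p => p.Id);
        Console.WriteLine(string.Join(",", kept.Select(p => p.Name))); // John,John
    }
}
```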
API Proposal
API Usage
Array:2,3,4,5
Array:2,3,4,5
Array:2,3,4,5,2,5
Alternative Designs
No response
Risks
No response