dotnet / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp

Seq.except xml doc description / tooltip info can be improved #14251

Open abelbraaksma opened 2 years ago

abelbraaksma commented 2 years ago

For one reason or another, I always thought that Seq.except takes a sequence and removes all elements from it that are in the itemsToExclude sequence. The tooltip says this:

Returns a new sequence with the distinct elements of the second sequence which do not appear in the first sequence, using generic hash and equality comparisons to compare values.

This isn't wrong, but it is fairly easy to overlook the "with the distinct elements of the second sequence" part. The function actually performs a set-difference operation: it makes the sequence distinct (i.e., drops every element that compares equal, under generic hash and equality comparison, to an earlier element), and it also removes the elements that appear in the itemsToExclude sequence.

> [1;2;2;3;1] |> Seq.except [2];;
val it: seq<int> = seq [1; 3]

> [1;2;2;3;1] |> Seq.except Seq.empty;;
val it: seq<int> = seq [1; 2; 3]
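
One way to see why the duplicates disappear is to model the operation with a single hash set that is seeded with the excluded items and then also records every element already yielded. The sketch below is only an illustration of the observed behavior; exceptSketch is a hypothetical name, and the code is not claimed to match the FSharp.Core source.

```fsharp
open System.Collections.Generic

// Hypothetical model of the observed behavior, not the FSharp.Core implementation.
let exceptSketch<'T when 'T: equality> (itemsToExclude: seq<'T>) (source: seq<'T>) : seq<'T> =
    seq {
        // Seed the set with everything that must be excluded.
        let seen = HashSet<'T>(itemsToExclude, HashIdentity.Structural)
        for x in source do
            // Add returns false when x is excluded or was already yielded,
            // so each surviving value is emitted at most once.
            if seen.Add x then
                yield x
    }

printfn "%A" (exceptSketch [2] [1; 2; 2; 3; 1] |> Seq.toList)       // [1; 3]
printfn "%A" (exceptSketch Seq.empty [1; 2; 2; 3; 1] |> Seq.toList) // [1; 2; 3]
```

Because one set serves both purposes, "excluded" and "already seen" are indistinguishable, which is exactly why the result is distinct.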

Since sequences are not sets, this was a bit surprising to me when I naively started implementing the same operation in TaskSeq. Precisely because a sequence is not a set, a name like exceptDistinct would seem to make more sense (also, a plain "remove all elements that are in sequence X from sequence Y" operation does not seem to exist for seq).
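
For comparison, the duplicate-preserving "remove all elements that are in sequence X from sequence Y" operation could be sketched as follows. The name exceptKeepDuplicates is hypothetical and is not part of FSharp.Core; the sketch assumes structural equality is the desired comparison.

```fsharp
open System.Collections.Generic

// Hypothetical helper: drops excluded items but keeps duplicates and order.
let exceptKeepDuplicates<'T when 'T: equality> (itemsToExclude: seq<'T>) (source: seq<'T>) : seq<'T> =
    seq {
        // The set only holds the excluded items; it is never extended.
        let excluded = HashSet<'T>(itemsToExclude, HashIdentity.Structural)
        for x in source do
            if not (excluded.Contains x) then
                yield x
    }

printfn "%A" (exceptKeepDuplicates [2] [1; 2; 2; 3; 1] |> Seq.toList) // [1; 3; 1]
```

The only difference from a set-except is that the hash set is read but never extended while iterating, so repeated values in the source survive.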

Anyway, I'm not suggesting a behavior change or a new function. Here's an attempt at better wording; other suggestions are welcome before I open a PR for it.

Suggestion:

Does a set-except operation on both sequences, returning a new sequence based on 
<paramref name="source" /> after removing duplicates and any element that also appears 
in <paramref name="itemsToExclude" />, using generic hash and equality comparisons to 
compare values.

We should probably also update the example given in the docs to make this particular behavior clear.

brianrourkeboll commented 2 months ago

For what it's worth, System.Linq.Enumerable.Except behaves the same way, and its summary says simply:

Produces the set difference of two sequences.

The remarks section adds:

The set difference of two sets is defined as the members of the first set that don't appear in the second set.

This method returns those elements in first that don't appear in second. It doesn't return those elements in second that don't appear in first. Only unique elements are returned.
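
A small script along these lines can be used to check the two functions side by side. The sample values are reused from the issue description; the binding names are illustrative only.

```fsharp
open System.Linq

// Annotated as seq<int> so the LINQ overload resolves without extra coercions.
let source: seq<int> = Seq.ofList [1; 2; 2; 3; 1]
let excluded: seq<int> = Seq.ofList [2]

// LINQ takes the source first and the items to exclude second.
let viaLinq = Enumerable.Except(source, excluded) |> Seq.toList

// Seq.except takes the items to exclude first, which suits pipelining.
let viaSeq = source |> Seq.except excluded |> Seq.toList

printfn "%A" viaLinq
printfn "%A" viaSeq
```

Note the reversed argument order: the "first" sequence in the LINQ remarks corresponds to the source parameter of Seq.except, which is its second argument.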