matthewcrews commented 2 years ago

Add overload for creating Set/Map that includes an IComparer<'T>

I propose we add an overload for the creation of Set<'T> and Map<'K, 'V> which takes an IComparer<'T> to override the default comparison behavior of these two collections. Where I believe this would add the most value is in the case of structs that currently get boxed during comparison leading to performance degradation.

The existing way of approaching this problem in F# is to use the built-in comparison functionality in the F# language.

You can see from the benchmarks in this repo. The Struct and CustomStruct versions are allocating more memory than the default Record.

Method	Mean	Error	StdDev	Gen 0	Allocated
Record	2.464 us	0.0479 us	0.0774 us	0.4044	3 KB
StructRecord	2.508 us	0.0500 us	0.0748 us	0.7248	6 KB
CustomStructRecord	3.798 us	0.0758 us	0.0811 us	0.8926	7 KB

If this proposal does not address the GC/performance problem then I suggest we discard it since it is the main concern.

Pros and Cons

The advantages of making this adjustment to F# are that we reduce the GC associated with boxing structs unnecessarily.

The disadvantages of making this adjustment to F# are added complexity of the F# Core library and therefore more code to maintain.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): M

Related suggestions: (put links to related suggestions here)

Affidavit (please submit!)

Please tick this by placing a cross in the box:

[x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
[x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
[x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

[x] This is not a breaking change to the F# language design
[x] I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the :+1: emoji on this issue. These counts are used to generally order the suggestions by engagement.

charlesroddie commented 2 years ago

F# has had a lot of trouble doing equality and comparison cleanly. PRs have come and gone including https://github.com/dotnet/fsharp/pull/5112, https://github.com/dotnet/fsharp/pull/6175.

If https://github.com/dotnet/fsharp/pull/6175 gets resurrected it may be the best approach to the performance issue.

Fixing the type issue (moving to generic IComparable<T> instead of IComparable) is this suggestion: https://github.com/fsharp/fslang-suggestions/issues/816 .

This suggestion here is a workaround that uses an IComparer<'T>, not because a different comparison logic is needed, but to avoid the poor performance of F# comparison when creating sets. However it is too small. We would need similar workarounds for all uses of comparison in F#. Better to address the problem entirely and then no workarounds are needed.

matthewcrews commented 2 years ago

I'm completely fine with closing this one and resurrecting another. I created this per Don's request on Twitter.

If something could be done, it would be greatly appreciated.

It looks like dotnet/fsharp#6175 is the best approach for this problem. I would be immensely grateful if it was brought back.

cartermp commented 2 years ago

That one is the best overall approach, yeah. Maybe @tihan could chime in on it a bit, but my understanding is that it is very tricky work with a high risk of breaking changes, which is why it was put on ice

chkn commented 2 years ago

This suggestion here is a workaround that uses an IComparer<'T>, not because a different comparison logic is needed, but to avoid the poor performance of F# comparison when creating sets.

Well what if different comparison logic is needed? I had this case recently when trying to use System.Type as a key in a Map. Furthermore, this approach seems much simpler and safer than https://github.com/dotnet/fsharp/pull/6175, but also doesn't preclude that change from landing one day.

Tarmil commented 2 years ago

Here's a problem though: how do you handle, for example, Set.union s1 s2 when s1 and s2 use different comparers?

charlesroddie commented 2 years ago

@cartermp my understanding is that it is very tricky work with a high risk of breaking changes

I am very enthusiastic about the ability of flags to allow F# to move forwards with less concern for breaking changes. Reflection-free string functions will live behind a flag or attribute, and in https://github.com/dotnet/fsharp/pull/6175 the dangerous stuff lives behind --optimize:equality.

We may need an RFC to specify exactly what flags we need. Something like --equalitycomparison [classic|optimized|generic], where classic preserves existing behaviour, optimized (--optimize:equality as defined in the PR) and generic uses IEquatable<'T>.Equals('T) always.

BashkaMen commented 1 year ago

Here's a problem though: how do you handle, for example, Set.union s1 s2 when s1 and s2 use different comparers?

we get a new type Set<Record, CustomRecordComparer> != Set<Record>

Tarmil commented 1 year ago

@BashkaMen We'd have to duplicate the entire Set module though.

fsharp / fslang-suggestions