Allow records/DUs to have different visibility constructors to their fields

TobyShaw commented 4 years ago

I propose we opt-in to restricting construction of records/DUs to a given scope (via private/internal).

When you encounter the problem of trying to maintain an invariant of a type (which cannot be expressed in the type system) you often resort to writing smart/safe constructors.

In F#, for records and DUs, this sadly results in hiding the entire internals of the type. This is because the constructor cannot be separated from the rest of the internals in terms of visibility. Given how desirable it is to use records and DUs in F# (because of the syntactic sugar afforded to them), it is sad we can't use them as intended.

The best we have currently is to re-expose the fields through members as shown:

type SafeFilePath =
    internal
        {
            DirectoryName_ : string
            FileName_ : string
        }
    static member Create (path : string) : Result<SafeFilePath, string> =
         failwith "Insert parsing/error checking logic here".

    member this.DirectoryName  : int = this.DirectoryName_
    member this.FileName : string = this.FileName_

But this approach has multiple downsides:

You lose the ability to pattern match on the record.
You lose the ability to derive an anonymous record from your record.
You have boilerplate in the definition of the type.
Consumers now have a method call to retrieve a field from the record, instead of a field load. This has performance implications.

Similarly with Discriminated Unions:

type MyDU =
    internal
    | X_ of int
    | Y_ of string

[<AutoOpen>]
module MyDU =
    let (|X|Y|) = function | X_ i -> X i | Y_ i -> Y i

This is more usable from the perspective of a consumer, but it also has downsides:

Active patterns compile down to allocating Choice objects, and are therefore much less performant than regular pattern matching.
It's boilerplate.

I propose we could replace this with:

type MyRecord internal () =
    {
        X : int
        Y : string
    }

or

type MyRecord internal new =
    {
        X : int
        Y : string
    }

or

type MyRecord =
    {
        X : int
        Y : string
    }
    internal new

The syntax for DUs would be identical, and would apply the visibility to all cases.

Pros and Cons

The advantages of making this adjustment to F# are:

Less boilerplate
Better performance
Nicer syntactic sugar (for records mainly)

The disadvantages of making this adjustment to F# are:

More syntax

Extra information

Estimated cost (XS, S, M, L, XL, XXL): S

Affidavit (please submit!)

Please tick this by placing a cross in the box:

[x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
[x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
[x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

[x] This is not a breaking change to the F# language design
[x] I or my company would be willing to help implement and/or test this

Happypig375 commented 4 years ago

Duplicate of: https://github.com/fsharp/fslang-suggestions/issues/161 (Discussion) https://github.com/fsharp/fslang-suggestions/issues/205 (Discussion)

Smaug123 commented 4 years ago

Very much in favour of this, on first read-through. Your final suggestion seems nicest to me (internal new after the record syntax).

TobyShaw commented 4 years ago

Duplicate of:

161 (Discussion)

205 (Discussion)

These only covered DU constructors, rather than records as well (which I make the case of having more more benefits from this change). One might argue that since this changed was declined for DUs, I should remove them from this proposal (and stick only with records), but I would say that the consistency between the two is important.

Happypig375 commented 4 years ago

From #205's discussion:

Comment by Don Syme on 2/5/2016 8:49:00 AM

Thanks for the suggestion. Enforcing invariants on union and record types is a serious issue. The recommended way is either to use a class type, or to hide the representation. This is an interesting midpoint, which is to intercept the construction methods for the union type. It's more difficult to design a corresponding way to do this for a record type. However, in the balance this feels too adhoc to embrace in the F# language design: would we do the same for Is? for accessors? Would we make union types purely pattern-based? Just doing the New case seems like scratching the surface and opening a box with more inside it. I will mark this as declined, though it's definitely an interesting idea and something I had not thought of doing.

TobyShaw commented 4 years ago

While Don acknowledges the utility of doing it for records, he does not argue strongly against it. Additionally, this is from 4 years ago, and doesn't seem like the strongest rejection - "I will mark this as declined, though it's definitely an interesting idea".

I would love to hear your own opinions on the proposal, though.

Happypig375 commented 4 years ago

My own opinion is to support this, though this only has little chance of not being rejected.

Tarmil commented 4 years ago

I like this. I think the third proposed syntax, with internal new listed like a member, is the best. Putting constructor information between the type name and the equal sign should remain an indication that this is a class rather than a union or record, IMO.

charlesroddie commented 4 years ago

This suggestion is a more polished version of Allow private constructors on DU cases which adds records, and this is indeed consistent. So this thread is preferable.

Smart constructors for DU cases is more complex and speculative, requesting that alternative constructors be required by the compiler to replace private ones. I think we can discount that one.

But there is another active and open thread https://github.com/fsharp/fslang-suggestions/issues/810 and there is too much overlap to keep both alive. Since that one is first it may be best to rebase and merge this suggestion onto that one.

kspeakman commented 4 years ago

My alternative strategy is to create a shareable validation function for the type. And always validate it before executing logical workflows using it. That way anyone can construct a value of that type. And anyone can also validate the data to prevent getting an error.

It can also allow you to have temporarily bad data. For example, user is filling out a form and has not yet entered a required field. I can still keep a "bad" unvalidated version of the data to receive their input, but also see the validated version of it results in errors. Which I can use to block submission and display error/required notices until they finish.

I usually need such validation functions anyway for user input, and incoming API requests (because wire formats have limited typing capabilities). So why not?

FYI this also is how some lisps implement optional typing.

Tarmil commented 4 years ago

@kspeakman I still prefer to have a type for which I can be certain that if I have a value of this type, then it is correct. But I hear you about the need for an unvalidated version to interact with forms (or other external systems like databases or serialization). With this suggestion, you could do something like this:

type UserData =
    { name: string
      age: int }

type [<Struct>] User =
    { data: UserData }
    private new

    static member validate (data: UserData) : User option =
        if String.IsNullOrWhiteSpace data.name then None
        elif age < 0 then None
        else Some { data = data }

Forms would use UserData, which is unvalidated. The rest of the code would use User, which can only be constructed via validate but still gives read access to fields as u.data.name. Or if the validation is more complex, User can have concrete fields (like name, age) instead of just storing a UserData.

kspeakman commented 4 years ago

My response was for anyone arriving here looking to solve the problem with what we have today. And really, you can't be as certain with a private constructor because it can be invoked by reflection (by parsers or determined devs). Whereas if my function calls a validation function and then only proceed on its Ok branch, that is a bit harder to bypass. And I have increased my confidence level in the data in the same place that I want to use the data.

But obviously different approaches are preferred by different people. So until this issue's request gets implemented, that is the most straight-forward alternative I have tried.

Happypig375 commented 4 years ago

@kspeakman With reflection, you automatically lose type safety. You can even construct decimals with invalid format, yet you don't see any framework functions validating the decimals they receive. :wink:

kspeakman commented 4 years ago

For example, Newtonsoft will happily construct a record with all null/default values. Even though this is not idiomatic F# (or even valid as declared F#), it is allowed by CLR and generated using reflection.

Happypig375 commented 4 years ago

@kspeakman That behaviour is opt-in, by using [<CLIMutable>]. By using this attribute, you are already acknowledging the potential loss of type safety. This does not apply to regular record types.

kspeakman commented 4 years ago

CLIMutable doesn't matter (and I don't use it). Newtonsoft will find and use the record constructor of "regular record types". And in absence of values in the JSON it will use default ones. Which is null for non-struct/non-primitive types. But that's just one common example of a library using reflection to find and invoke constructors.

Probably your co-workers won't go to the trouble to use reflection to invoke a constructor (would be considered "hacky" code at best, malicious at worst... but yes you could get a typed value out of it). Either way a separate validation function that you call before using the data would catch them both.

dsyme commented 4 years ago

As a note, this was actually in F# 1.x (for both union cases and record fields) but we ripped it out as the implementation was buggy.

IIRC the problems were

incompleteness checking was buggy for pattern matching over union cases - we were consdiering the check complete if all the visible union cases were being matched
likewise, record construction syntax was being allowed with an incomplete set of fields (only the ones actually visible)
likewise { .. with ... } was buggy in some dubious way
you have to be careful to note that generated equality and comparison are w.r.t. internal fields

I think that was all. There may even be relics of this left in the codebase e.g. parser.

Luke Hoban also argued against the feature on design grounds I think - or maybe it was me arguing against - I'm not sure. There's a feeling that the "keep it simple" approach for records and unions is just really effective in F# for landing the features in a sensible place while keeping transition to/from object-programming code possible.

That said I did think the feature reasonable in F# 1.x, we just decided to cut it since it was buggy. So it doesn't quite fall victim to the "it's been considred before" rule but it is a factor we should consider.

Tarmil commented 4 years ago

Presumably these problems are related to having some union cases, or some record fields, private? If we simply allow making construction private regardless of case/fields, then these problems shouldn't arise. Pattern matching would still be complete, with would only be allowed where you have visibility of the constructor, and so on.

charlesroddie commented 4 years ago

@Tarmil is right about with, and that you would not have some record fields private rather the whole constructor. Having some DU cases private wouldn't be a problem.

The only remaining problem from @dsyme 's post is

incompleteness checking was buggy for pattern matching over union cases

That would be the work needed to implement this suggestion: union cases and record fields should be public for the purpose of pattern matching even when the constructors are private.

voronoipotato commented 2 years ago

for anyone curious there is a fssnip that accomplishes the "all private cases, but able to pattern match' by iceypoi

http://www.fssnip.net/ma/title/Alternate-version-of-Private-DU-constructor

voronoipotato commented 12 hours ago

I came here looking for this after reading https://norikitech.com/posts/functional-affirmations/ only to find I was the last person to comment T_T.

fsharp / fslang-suggestions