fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features
345 stars 21 forks source link

Add support for higher-rank types #567

Closed kurtschelfthout closed 2 years ago

kurtschelfthout commented 7 years ago

Add support for higher-rank types

We propose to add support for higher-rank types, originally suggested by @polytypic, see https://twitter.com/VesaKarvonen/status/846239603627569152. This suggestion draws heavily on comments made in that thread by Vesa and @kbattocchi .

The existing way of approaching this problem in F# is to manually create an interface type with a single method to represent an argument with a type like (forall 'a. 'a -> 'a) and anonymously implement the interface on usage, for example http://stackoverflow.com/questions/7213599/generic-higher-order-function/7221589#7221589

Alternatively, that SO question also shows some SRTP/operator overloading magic that can be used to address the problem.

Pros and Cons

The advantages of making this adjustment to F# are that

The disadvantages of making this adjustment to F# are that

Extra information

Estimated cost (XS, S, M, L, XL, XXL): Err. L?

Related suggestions: NOT to be confused with https://github.com/fsharp/fslang-suggestions/issues/175 this is not about higher kind but higher rank (non-prenex) polymorphism.

Affadavit (must be submitted)

Please tick this by placing a cross in the box:

Please tick all that apply:

kbattocchi commented 7 years ago

If this were to be implemented, it would be great to see support for existential types in addition to universal ones, since faithfully representing them using the encoding exists 'x.T<'x>forall 'z.(forall 'x.T<'x>->'z)->'z requires an even greater amount of boilerplate. This would require extra work, so I don't think it's necessary to do it all at once, but it would be good to plan ahead if possible.

dsyme commented 7 years ago

the workaround is very verbose and hard to read. A relatively straightforward representation exists in .NET. Tedious and repetitive code generation is what a compiler is for ;)

I'm not sure that the workaround is that much harder to read than the type-annotated version. And it requires no extra knowledge on the part of the programmer

let g f (x,y) = (f x, f y)

We'll need more substantial examples than this.

Also note the code may be much slower than passing in two functions. That's a major reason for not adding this feature.

kurtschelfthout commented 7 years ago

I'm not sure that the workaround is that much harder to read than the type-annotated version. And it requires no extra knowledge on the part of the programmer

I think the type annotated version would look something like this:

let g (f<'a>:'a->'a) (x:'b,y:'c)  : 'b * 'c = (f x, f y)

let pair = g id (1, '2')

I haven't thought this through much but allowing type arguments on an argument expresses that the 'a is not a type argument of g. And this looks like very similar verbosity to what we kind of have to do with top-level functions already (as a "best practice" for documentation etc).

Compared to:

type PassThrough =
    abstract Invoke<'a> : 'a -> 'a

let g (f:PassThrough) (x,y) = (f.Invoke x, f.Invoke y)

let pair = g { new PassThrough with member __.Invoke(x) = x } (1, '2')

Note the call site (the thing you write over and over again) is the most verbose. I claim it's pretty hard to see here that this is just the id function. (And we can't use existing functions like id directly anyway).

We'll need more substantial examples than this.

For sure, this is the simplest possible example where it comes up.

Also note the code may be much slower than passing in two functions.

Can you explain why? Seems like it's very comparable to passing in a closure and invoking it: one instantiation of the closure type vs one instantiation of the interface implementation + one method call for Invoke vs one method call for Invoke.

dsyme commented 7 years ago

Can you explain why? Seems like it's very comparable to passing in a closure and invoking it: one instantiation of the closure type vs one instantiation of the interface implementation + one method call for Invoke vs one method call for Invoke.

Dispatch to a generic virtual method is generally slower - the original implementation used a hash table lookup - but the implementation details are subtle and it is possible that CLR "fast path" optimizations are good enough. Measure and see though (using the class encoding...)

kbattocchi commented 7 years ago

@dsyme I'm not sure I understand why the performance matters - if someone needs higher rank types then they're going to pay that penalty whether they implement them by hand or if the compiler uses some more pleasant sugar to do it for them. And while it's not a constant occurrence, people do need higher rank types at times, the .NET platform already supports them (in the form of types with generic methods), and implementing them by hand is as awkward as explicitly implementing FSharpFunc<_,_> instead of using lambdas would be.

dsyme commented 7 years ago

@kbattocchi Two examples. For this:

let g (f<'a>:'a->'a) (x:'b,y:'c) : 'b * 'c = (f x, f y)

we have an alternative today, which is to pass in "f" twice, e.g.

let g f1 f2 x y = (f1 x, f2 y)

and then duplicate "f" at the callsite. That doesn't work in general of course. But I do know its performance doesn't use any generic virtual dispatch.

Likewise it matters because people will inevitably use the feature when not needed, e.g. where "f" is only instantiated at one type. So they pay a performance cost for nothing.

kbattocchi commented 7 years ago

@dsyme I doubt that there would be too many people using the feature unnecessarily because type inference still wouldn't infer higher rank types (since there's no single most general type to infer), so it would require the user to opt-in by writing an annotation.

Also, your alternative allows users to pass in any two functions, even if they are not actually different instantiations of a single higher-rank function, which makes it easier for a caller to use the function incorrectly. (Technically since .NET allows runtime type tests of the generic parameter it's always possible to build a higher-rank function with the same behavior, but it would be very unnatural).

Finally, there is some overhead to using a generic interface, but I can still make a tens of millions of calls a second.

kurtschelfthout commented 7 years ago

Few more substantial examples: http://stackoverflow.com/a/24717685 (for limiting the scope of side effects) http://stackoverflow.com/a/34623859 (a connection with visitor pattern is drawn here) http://stackoverflow.com/a/41124608 (Transforming a generic record type by applying a generic function - like sprintf - to several fields in a record type that are of different type)

And here's a post about encoding existential types, tangentially related: https://stackoverflow.com/questions/16284680/expressing-existential-types-in-f

kurtschelfthout commented 7 years ago

I wonder if it would be a better idea if instead of allowing rank-n types, it would instead be allowed to convert functions to single-method interface types, a bit reminiscent of function to delegate conversion.

The simple example would then become:

[<CallableAsFunction>] //not sure this is necessary, strictly speaking
type PassThrough =
    abstract Invoke<'a> : 'a -> 'a

let g (f:PassThrough) (x:'b,y:'c)  : 'b * 'c = (f x, f y)   //under the hood: f.Invoke x

let pair = g id (1, '2') // under the hood: anonymous interface implementation

Pros:

Cons:

robkuz commented 7 years ago

Hey @kurtschelfthout,

great writeup. I'd like to add my perspective to that

On Readability and Comprehension

the workaround is very verbose and hard to read. A relatively straightforward representation exists in .NET. Tedious and repetitive code generation is what a compiler is for ;)

I'm not sure that the workaround is that much harder to read than the type-annotated version. And it requires no extra knowledge on the part of the programme

I would be pretty surprised if any developer would think that a definition like
let g (f<'a>:'a->'a) (x:'b,y:'c) : 'b * 'c = (f x, f y)
is harder to read than the encoding via a type and a member method. Maybe it is for very, very new people out of C#-land.

As for the knowledge part: It took me almost a year before I really grokked that I can (or rather must) express such a function as a wrapped invokable type. For me this was totally counter-intuitive. Somehow like Java before Java 8 and Lambdas.

Even in Haskell I was totally stunned that I have to enable a language extension to make this happen. I think most novices would simply expect that this works. At least that was my expectation. And if this request is ever implemented please don't ever call it HigherRankTypes etc. - its just a polymorphic function as a param. Everybody gets that!

Performance

I'm not sure I understand why the performance matters - if someone needs higher rank types then they're going to pay that penalty whether they implement them by hand or if the compiler uses some more pleasant sugar to do it for them.

I'd like to add: Why nor let the developer decide if there is any performance issue at all? Whoever uses it should first measure if the code under review has any performance issues and then if his or her time is not better spend optimizing some other part of the system. If there is a to high performance penalty to pay then there is always the possibility to defunctionalize (or monomorphize) the code to make it faster. Please leave such decisions to the developer.

And while it's not a constant occurrence, people do need higher rank types at times,

I must admit I use it relatively often - probably one of the aspects that I try to write my code pretty generic and then polymorphic function params come pretty handy.

Alternative

    [<CallableAsFunction>] //not sure this is necessary, strictly speaking
    type PassThrough =
        abstract Invoke<'a> : 'a -> 'a

    let g (f:PassThrough) (x:'b,y:'c)  : 'b * 'c = (f x, f y)   //under the hood: f.Invoke x

    let pair = g id (1, '2') // under the hood: anonymous interface implementation

Concerning the alternative to the first proposal let g (f<'a>:'a->'a) (x:'b,y:'c) : 'b * 'c = (f x, f y) - I strongly suggest against it. Yeah it does make the call site easier but still one must understand why. If that would need to be why not somehow abstract the new type away and simply use type alias (if this was possible)

    [<CallableAsFunction>] 
    type PassThrough<'a> = 'a -> 'a

    let g (f:PassThrough) (x:'b,y:'c)  : 'b * 'c = (f x, f y)   //under the hood: f.Invoke x

    let pair = g id (1, '2') // under the hood: anonymous interface implementation

or why not adorn a function itself

    let g ([<PolymorphicFnParam>] f:'a -> 'a) (x:'b,y:'c)  : 'b * 'c = (f x, f y) 

    let pair = g id (1, '2') // under the hood: anonymous interface implementation

or we could extend this feature request ("Allow for Structural Checking of Generic Parameters within Type Constraints")[https://github.com/fsharp/fslang-suggestions/issues/566] and express it like this

    let g<'f, 'a where 'f: isPolymorphicFun<'a -> 'a> (f:'f) (x:'b,y:'c)  : 'b * 'c = (f x, f y) 

    let pair = g id (1, '2') // under the hood: anonymous interface implementation

I don't know if this last is a good idea thou.

Btw. I'd be willing to help testing or to fund this request

kurtschelfthout commented 7 years ago

If that would need to be why not somehow abstract the new type away and simply use type alias (if this was possible)

Because you really need two places to put parameters (and maybe constraints on): the interface and the interface method. Take the other possibility mentioned in the very simple example:

    type ConvertAll<'b> =
        abstract Invoke<'a> : 'a -> 'b

    let g (f:ConvertAll<_>) (x,y) = (f x, f y)

    let pair = g (fun _ -> 1) ("test", true)

Also I'd expect in quite a few cases that the interface Invoke has some type constraints (e.g. subtype) so that it can do more than either be the identity or constant function. In other words my original syntax suggestion was a bit naive. For example it's not clear how you could use it in type abbreviations, as you can't name the arguments there (I think):

    type MyAbbrev<'b> = (f<'a>:'a -> 'a) -> 'b

Generally you should be able to write a type like this (in pseudo notation):

    forall a. a -> (forall b when b :> IMyInterface. b -> (forall c when c : equality. c -> a)))

everywhere you can currently write types (signature files, type annotations, type abbreviations).

Naming the interface type neatly removes that problem without introducing any additional syntax.

The PolymorphicFnParam attribute suffers from the same problem - which parameter is not universally qualified on the outer level? How do you specify constraints? Etc.

robkuz commented 7 years ago

Thanks for the explanation.

What would happen if we'd assume that alle parameters of the given function are polymorphic and have a universal quantification so that your

forall a. a -> (forall b when b :> IMyInterface. b -> (forall c when c : equality. c -> a)))

could be expressed like this

[<UniversalQuantification>]
let g<'a,'b when 'b :> IMyInterface and 'c when 'c : equality>(f: 'a -> 'b -> 'c ) = ...

would that be a problem?

kurtschelfthout commented 7 years ago

@robkuz

would that be a problem?

Yes. You need to allow the programmer to be in control of how each type parameter is quantified. It's not either all on the outermost level (the current behaviour) or each subsequent parameter in a new scope (your suggestion).

For example, how would you write the ConvertAll example:

forall 'b 'c 'd. (forall 'a. 'a ->'b) -> 'c * 'd -> 'b * 'b
robkuz commented 7 years ago

@kurtschelfthout I am not sure I agree (or understand for that matter ;-))

Let' say we have your converter function (and for a moment disregard any other constraints)

[<Polyporphic>] 
let convert<'a, 'b, 'c, 'd> (x: 'a) (y: 'b) : 'c * 'd = ...

which in then should expand to something like this val convert : forall a b c d. 'a -> 'b -> ('c * 'd)

and lets assume that we had a call site like this

let runConvert ( [<Polyporphic>] (f : 'a -> 'b -> 'c * 'd) (a : 'a) (b : 'b) : 'c * 'd = ...

and lets further assume that the implementation of runConvert would assume a monomorphic 'b and 'd would that be of any problem? Sure in theory yes as we run a fully polymorphic function where only a partially polymorhic function would suffice but in practice?

kurtschelfthout commented 7 years ago

Just to add another use case in TypeShape - the slides on it have a lengthy interlude on rank-2 and existential types: https://eiriktsarpalis.github.io/typeshape/#/18. This plays an important role in how the library works.

For example, instead of writing:

    | Shape.FSharpList s ->
        s.Accept {
            new IFSharpListVisitor<'T -> string> with
                member __.Visit<'a> () =
                    let tp = mkPrinter<'a>()
                    wrap(fun ts -> ts |> List.map tp |> String.concat "; " |> sprintf "[%s]")
        }

Users could write perhaps:

    | Shape.FSharpList s ->
        s.Accept (fun<'a> () -> 
                    let tp = mkPrinter<'a>()
                    wrap(fun ts -> ts |> List.map tp |> String.concat "; " |> sprintf "[%s]")

If you know about existential "packs" they are relatively common. E.g. here is an IGen interface in FsCheck, to pack the generic Gen<'a> type for use in reflection. Just below is a similar kind of interface for Arbitrary<'a>.

dsyme commented 7 years ago

@kurtschelfthout Thanks.

I'm still not entirely convinced that the explicit encoding is a bad thing though - at least it shouts out to me "here's an existential pack/unpack!" (though you have to know the encoding to be able to read it in those terms)

kurtschelfthout commented 7 years ago

(though you have to know the encoding to be able to read it in those terms)

Therein lies the rub, I think. I for one can't understand how the existential encoding works without writing it in a shorter, typed form and then deriving the interface-based encoding from that. I would have trouble recognising what it is when encoded via multiple generic interfaces. Half the slides of TypeShape talk about how/why this encoding is useful and necessary, so I don't think I'm an outlier.

The simple rank 2 encoding with anonymous interface implementation is more easily understandable of course. But then not many language features have acceptable cost/benefit when applied to the simplest possible example.

The main benefit I see with a feature like this is that intent of the code and documentation (of say, a library like TypeShape) is significantly improved. I do agree the feature is kind of niche - expected in F# at this point, most of the "obvious" features are there. On the other hand it doesn't get particularly in the way either, you can practically ignore it until you encounter it in a library signature somewhere.

robkuz commented 6 years ago

This might be relavant for the discussion. Polymorphism in OCaml 4.0.6

Kazark commented 5 years ago

If you have a GADT query or request/response DSL, rank-2 types can be very helpful in reducing meaningless code duplication and boilerplate. As a follow-on to the Haskell GADT example I gave on issue #179, here is an example of how you would use that in another part of your DSL:

    data Step
      = forall a. DoCheck (Check a) (a -> [Step])
      | NotifyDone AlreadyCompleted
      | RunAction Action
      | FailWith StepError

Again, this is in Haskell, rather than inventing a syntax for F#. This particular thing I have not figured out an encoding for in F# that would work for me, but since the ways that I've been able to encode GADTs have been unsatisfying anyway, I use the workaround I describe in my comment on that issue.

Here is, then, how you would use this in your interpreter or runtime for your DSL:

runCheck :: Check a -> IO a
runCheck (FindExe exe) = findExecutable (show exe)
runCheck (ChocoList chocoPath) = do
  raw <- runProcReturnOutput chocoPath ["list", "-lo"]
  return $ parseChocoInstalled raw
runCheck (GitConfigCheck gitPath cmd) = do
  raw <- runProcReturnOutput gitPath ["config", "--get", (gitCfgCmdToCliArg cmd)]
  if raw == ""
  then return Nothing
  else return (Just raw)

As you can see, your overall type signature is generic, but each particular case you match on has the type refined to something more specific.

Swoorup commented 4 years ago

A use-case for higher rank types I just stumbled across. I have the following:

open System
type ValidationResult<'r> = Result<'r, string list>
type SinglePropValidationResult<'property> = Result<'property, string>
type PropValidatorFns<'property> = ('property -> SinglePropValidationResult<'property>) list
type PropMap<'object, 'property> = 'object -> 'property
type ObjValidator<'object, 'property> = (PropMap<'object, 'property> * PropValidatorFns<'property>) list
module Validation =
  let stateIsValid (state: State) =
    match Constants.StatesOfAustralia.Contains(state.Value) with
    | true -> Ok state
    | false -> Error "Given state is not a known US state."
  let pipeObjectThroughValidation
    (validators: ObjValidator<'object, forall. 'property>)
    (input: 'object): ValidationResult<'object> =
    let errorList =
      validators
      |> List.collect
           (fun (mapper, validatorList) ->
              let property = mapper input
              validatorList |> List.map (fun v -> v property))
      |> List.choose (function
           | Error msg -> Some msg
           | Ok _ -> None)
    if not errorList.IsEmpty then
      Error errorList
    else
      Ok input

The pipeObjectThroughValidation which takes an object as the input, and list of pair of mapping function (to map each property of the input object) and PropValidatorFns (to invoke validation on each of the properties).

Here, the property could be of different type in an object (records, tuples, or anything). And would like to use it as

let person = { firstname = "jake"; age = 1000 } 
pipeObjectThroughValidation [ (_.firstname, [validateName; validateSthElse]); (_.age, validateAgeRequireMents)]
gusty commented 3 years ago

Using the interface trick prevents our function to use F# constraints.

Take the following example:

type Convert<'s> =
    abstract Invoke<'t,'s> : 't -> 's

let g (f: Convert<_>) (x: string, y: float) = (f.Invoke x, f.Invoke y)

this works if we pass a function like string

g { new Convert<_> with member __.Invoke(x) = string x } ("2", 1.)

but if we pass something like int it breaks

g { new Convert<_> with member __.Invoke(x) = int x } ("2", 1.)
// ~vs4DEC.fsx(6,51): error FS0001: The declared type parameter '?' cannot be used here since the type parameter cannot be resolved at compile time

And we can't add the static constraint required for int to the interface, however if we could it would type check. Let's add another constraint that it's allowed.

type Convert<'s> =
    abstract Invoke<'t,'s> : 't -> 's when 't : null

Now g stops compiling because float doesn't support : null, changing g to use something like float [] type checks.

let g (f: Convert<_>) (x: string, y: float []) = (f.Invoke x, f.Invoke y)

and the constraint disappears as it is resolved at compile time.

If we use the trick of the static Invoker we don't have that limitation but it generates lot of constraints

type ToString = ToString with
    static member ($) (ToString, x) = string x

type ToInt = ToInt with
    static member inline ($) (ToInt, x) = int x

let inline g f (x: string, y: float) = (f $ x, f $ y)

g ToString ("2", 1.)
g ToInt ("2", 1.)

because it delays overload resolution, forcing our g function to be declared inline, but with a proper mechanism this shouldn't be needed.

So, none of the 2 existing workarounds are good enough. We do need to add this functionality, there's no other way.

dsyme commented 2 years ago

Higher-rank types of this kind won't be added to F#. The encoding using a generic interface method is adequate for nearly all use-cases, and superior from some perspectives (e.g. can be more explicit and readable).

gusty commented 2 years ago

If the decision is to keep the existing encodings at least if we had #1085 we could turn this:

let g (f:PassThrough) (x,y) = (f.Invoke x, f.Invoke y)

into this

let g (f:PassThrough) (x,y) = (f x, f y)

And also

let inline g f (x: string, y: float) = (f $ x, f $ y)

into

let inline g f (x: string, y: float) = (f x, f y)

If the invokable as static member is also implemented.