Erased type-tagged anonymous union types

alfonsogarciacaro commented 7 years ago

Modified Proposal (by @dsyme)

This is a suggestion to add adhoc structural type-tagged union types where

type syntax (A|B) or Typed(A|B) or Typed<A,B>
each type of (A|B|C|...) is distinct and non-overlapping w.r.t. runtime type tests
such types are erased to object
introducing such a union value would need to be either explicit (e.g. using some new operator like Typed) or type-directed or both

See https://github.com/fsharp/fslang-suggestions/issues/538 for original proposal. e.g. this would allow some or all of these:

let generateValue1 () : Typed(int | string) = if Monday then 2 else "4"

let generateValue2 () = if Monday then Typed 2 else Typed "4"

let eliminateValue (x : Typed(int | string)) = ...
    match x with 
    | :? int as i ->  ...
    | :? string as s -> ...

let eliminateValue2 x = ...
    match x with 
    | Typed(i : int) ->  ...
    | Typed(s : string) -> ...

type Allowed =  Typed(int | string)

There are plenty of questions about such a design (e.g. can you eliminate "some but not all" of the cases in a match? Is column-polymorphism supported?). However putting those aside, such a construct already has utility in the context of Fable, since it corresponds pretty closely to Typescript unions and how JS treats values. It also has lots of applications in F# programming, especially if the use of Typed can be inferred in many cases.

Now, this construct automatically gives a structural union message type, e.g.

type MsgA = MsgA of int * int
let update1 () = Typed (MsgA (3,4))

type MsgB = MsgB of int * int
let update2 () = Typed (MsgB (3,4))

val update1 : unit -> Typed MsgA  // actually : unit -> Typed (MsgA | ..), i.e. column-generic on use
val update2 : unit -> Typed MsgB // actually : unit -> Typed (MsgB | ..), i.e. column-generic on use

and a combination of update1 and update2 would give

let update = combine update1 update2 

val update : unit -> Typed (MsgA | MsgB)

As noted in the comments, some notion of column-generics would likely be needed, at least introduced implicitly at use-sites.

Original Proposal (@alfonsogarciacaro)

I propose we add erased union types as an F# first citizen. The erased union types already exist in Fable to emulate Typescript (non-labeled) union types:

http://fable.io/docs/interacting.html#Erase-attribute

Note that Fable allows you to define your custom erased union types, but this is because it's painful to type a generic one like U2.Case1. If the compiler omits the need to prefix the argument, this wouldn't be necessary and using a generic type can be the easiest solution.

The F# compiler could convert the following code:

// The name ErasedUnion is tentative
// The compiler should check the generic args are different
let foo(arg: ErasedUnion<string, int>) =
    match arg with
    | ErasedUnion.Case1 s -> s.Length
    | ErasedUnion.Case2 i -> i

// No need to instantiate ErasedUnion, but the compiler checks the type
foo "hola"
foo 5
// This doesn't compile
foo 5.

Into something like:

let foo(arg: obj) =
   match arg with
   | :? string as s -> s.Length
   | :? int as i -> i
   | _ -> invalidArg "arg" "Unexpected type"

Pros: It will make the Fable bindings generated from Typescript declaration files much more pleasant to work with.
Cons: It's a feature that seems to be exclusively dedicated to interact with a dynamic language like JS.
Estimated cost (XS, S, M, L, XL, XXL): S

Alternatives

For Fable it's been suggested to generate overloads in the type bindings instead of using erased union types:

interface IFoo {
    foo(arg: string | number): void;
}

type IFoo =
    abstract foo: arg: string -> unit
    abstract foo: arg: number -> unit

However, this has some problems:

It can quickly explode when you have several erased union arguments
Due to type inference the F# compiler many times doesn't know which overload to use
Cannot be used in properties
Doesn't let you use erased unions yourself.

Affidavit

Please tick this by placing a cross in the box:

[x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
[x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
[x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

[x] This is not a breaking change to the F# language design
[x] I would be willing to help implement and/or test this
[ ] I or my company would be willing to help crowdfund F# Software Foundation members to work on this

AviAvni commented 7 years ago

This sounds like it needs to be implemented with CompilationRepresentationAttribute, like the way Option.None is represented as null, so that you can define your own erased union.

Horusiath commented 7 years ago

@alfonsogarciacaro erased unions are not only a thing for dynamic language transpilation. They are also useful in message-based systems, i.e. when you want to describe protocols as a closed set of messages (in the case of F# those could be discriminated unions). In that case a behavior that wants to satisfy more than one protocol must have some way to define a union of those, which so far is possible only as a lowest common denominator (usually the obj type).
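A minimal sketch of the situation described above, as it looks in F# today. The protocol types `GreeterMsg` and `CounterMsg` and the `handle` function are hypothetical names invented for illustration; they do not come from Akka or any other library.

```fsharp
// Two closed protocols, each an ordinary discriminated union (hypothetical names).
type GreeterMsg =
    | Hello of string
    | Bye

type CounterMsg =
    | Increment of int
    | Reset

// A behavior serving both protocols can only accept obj today,
// so nothing stops a caller from sending an unrelated value.
let handle (msg: obj) : string =
    match msg with
    | :? GreeterMsg as g ->
        match g with
        | Hello name -> "hello " + name
        | Bye -> "bye"
    | :? CounterMsg as c ->
        match c with
        | Increment n -> sprintf "increment by %d" n
        | Reset -> "reset"
    | _ -> "unhandled"
```

With an erased union such as (GreeterMsg | CounterMsg) as the parameter type, the final catch-all case would presumably become unnecessary, and a call like `handle (box 42)` would be rejected at compile time instead of falling through at runtime.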

dsyme commented 7 years ago

There are some other interesting reasons for this compiler feature. One is that we frequently hit situations in the F# compiler where a union type incurs an entire extra level in allocations, e.g.

type NameResolutionItem = 
    | Value of ValRef
    | UnionCase of UnionCaseRef
    | Entity of EntityRef
    | ...

The needs for this type are relatively "low perf" (cost of discrimination doesn't really matter - multiple type switches are ok) but the type gets many, many long-lived allocations when the F# compiler is hosted in the IDE. One could make the type a struct wrapping an obj reference manually, but simply adding an annotation to represent this as an erased union type and discriminate by type switching would be a much less intrusive code change. (Note using a struct union would not work well as the struct would still have a discrimination tag integer, and would have one field for each union case - struct unions are by no means perfect representations for multi-case types as things stand at the moment)
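For illustration, the manual "struct wrapping an obj reference" alternative might look roughly like the sketch below. `ValRef` and `EntityRef` here are simplified stand-ins, not the compiler's real types, and the member names `Payload`, `OfValue` and `OfEntity` are assumptions made for this example.

```fsharp
// Simplified stand-ins for the compiler's reference types (hypothetical).
type ValRef(name: string) =
    member this.Name = name

type EntityRef(name: string) =
    member this.Name = name

// One-field struct: no tag integer, no per-case field, no extra heap object.
[<Struct>]
type NameResolutionItem(payload: obj) =
    member this.Payload = payload
    static member OfValue (v: ValRef) = NameResolutionItem(box v)
    static member OfEntity (e: EntityRef) = NameResolutionItem(box e)

// Discriminate by runtime type test instead of by a stored tag.
let (|Value|Entity|) (item: NameResolutionItem) =
    match item.Payload with
    | :? ValRef as v -> Value v
    | :? EntityRef as e -> Entity e
    | other -> failwithf "unexpected payload: %A" other

let describe (item: NameResolutionItem) =
    match item with
    | Value v -> "value " + v.Name
    | Entity e -> "entity " + e.Name
```

This keeps call sites pattern-matching as before, but every construction site has to go through the helper members, which is the kind of intrusive change an annotation-based erased representation would avoid.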

Estimated cost (XS, S, M, L, XL, XXL): S

:) There's not really any such thing as "S" for language features :) I'd say "M" or "L".

... CompilationRepresentationAttribute ...

yes that would seem natural

robkuz commented 7 years ago

Will there be multiple ErasedUnion s under this proposal? Like ErasedUnion3, ErasedUnion4 etc.?

AviAvni commented 7 years ago

@robkuz if the implementation will be with CompilationRepresentationAttribute then you can create your own erased union

[<CompilationRepresentationAttribute(CompilationRepresentationFlags.ErasedUnion)>]
type DU<'a, 'b> = A of 'a | B of 'b

alfonsogarciacaro commented 7 years ago

@AviAvni @dsyme Please note that if this just enables a CompilationRepresentationFlags.ErasedUnion on customly defined unions and doesn't allow implicit conversions when passing arguments (in the example above writing foo "hola" instead of foo (ErasedUnion.Case1 "hola")), there won't be much benefit for Fable, as this is basically the same situation as we have now.

ijsgaus commented 7 years ago

This is almost the same as an "or" operator on types: T1 or T2. In the future it could be realized by a special attribute on a function, and be fully erased from the compiled code. But how to save metadata?

dsyme commented 7 years ago

But how to save metadata?

I think the intent is that the types would be erased (like other F# information). The metadata would only be available at compile-time through the extra blob of F#-specific metadata that F# uses

cartermp commented 7 years ago

I think that given the reasoning above (both for FABLE and the use case @Horusiath mentioned), this would be a good addition. 👍

Richiban commented 7 years ago

Is it very important that the type is erased?

Perhaps it's a slightly separate proposal, but I would love to have ad-hoc type unions in the form:

let print (item : string | int) = 
    match item with
    |  s : string -> printfn "We have a string: %s" s
    |  i : int ->   printfn "We have an int: %i" i

Which would essentially compile down to the same IL as:

let print (item : Choice<string, int>) = 
    match item with
    | Choice1Of2 s -> printfn "We have a string: %s" s
    | Choice2Of2 i -> printfn "We have an int: %i" i

and, more importantly, at the callsite:

print "Hello world"

instead of:

print (Choice1Of2 "Hello world")

alfonsogarciacaro commented 7 years ago

In Fable we've finally managed to remove the erased union case name by using the so-called erased/implicit cast operator !^. Check this and this. So now it's possible to do:

let foo(arg: U2<string, int>) =
    match arg with
    | U2.Case1 s -> s.Length
    | U2.Case2 i -> i

// No need to write foo(U2.Case1 "hola")
foo !^"hola"
foo !^5
// The argument is still type checked. This doesn't compile
foo !^5.
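A rough sketch of how such an operator can be built from statically resolved type parameters. This is not Fable's actual source: the union here is an ordinary, non-erased type so the sketch runs on .NET, and the member name `Inject` is made up for the example.

```fsharp
// Hypothetical two-case union; in Fable the equivalent type is erased at runtime.
type U2<'a, 'b> =
    | Case1 of 'a
    | Case2 of 'b
    // One injection overload per case; `Inject` is an invented member name.
    static member Inject (x: 'a) : U2<'a, 'b> = Case1 x
    static member Inject (x: 'b) : U2<'a, 'b> = Case2 x

// The operator asks the compiler to find an Inject overload, on either type,
// that maps the argument type to the expected result type.
let inline (!^) (x: ^t1) : ^t2 =
    ((^t1 or ^t2) : (static member Inject : ^t1 -> ^t2) x)

let foo (arg: U2<string, int>) =
    match arg with
    | Case1 s -> s.Length
    | Case2 i -> i

let fromString = foo (!^ "hola")
let fromInt = foo (!^ 5)
```

Because the overload is picked from the expected parameter type, a float argument finds no matching `Inject` and is rejected by the type checker, which matches the behaviour described above.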

ovatsus commented 6 years ago

TypeScript also supports string literals in these union types, i.e, in addition to type T1 = number | string, it also supports type T1 = number | "string1" | "string2". Would be nice to also support that.

Or alternatively, if string enums were supported like in TypeScript, we could achieve the same effect that way:

    enum Colors { Red = "RED", Green = "GREEN", Blue = "BLUE" }
    type T = number | Colors

alfonsogarciacaro commented 6 years ago

As a reference, Fable already supports string enums :smile:

cloudRoutine commented 6 years ago

@Richiban is this what you're looking for? - Polymorphic Variants

dsyme commented 6 years ago

@Richiban @alfonsogarciacaro I hijacked this suggestion to convert this to a suggestion for erased ad-hoc type unions of the kind suggested by @Richiban

(Not sure what the callsite would be though @Richiban - perhaps what you say).

ijsgaus commented 6 years ago

Can we make these types not erased? Why not introduce a base implementation as Typed<'t1, 't2, ...> and make it part of FSharp.Core?

Richiban commented 6 years ago

@ijsgaus But if it's not erased then it's no different from Choice<'a, 'b>

wallymathieu commented 6 years ago

This seems like a really sweet suggestion! I imagine it could help the performance of a lot of library code.

voronoipotato commented 5 years ago

Would this help this problem?

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose | Cardinal | Mallard
let x  = Goose 7

This code fails. Goose in the Bird DU shadows Goose as a type and turns it into an Atom. This shadowing happens silently and at least to me is surprising.

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose of Goose | Cardinal of Cardinal | Mallard of Mallard
let x  = Goose 7

The type shadowing here still means I can't move forward, because there's no way to make a Goose....

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose' of Goose | Cardinal' of Cardinal | Mallard' of Mallard
let x  = Goose' (Goose 7)

This works. This kind of situation happens where someone created a single case DU, and it gets consumed by someone who can't muck with the original DU for fear of breaking existing code.

BillHally commented 5 years ago

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose of Goose | Cardinal of Cardinal | Mallard of Mallard
let x  = Goose 7

The type shadowing here still means I can't move forward, because there's no way to make a Goose....

You can make an instance of the Goose type by specifying the type as well as the case:

let x = Goose.Goose 7 // This works

When wrapped in a module, you can still access everything, but there is weirdness:

module Birds =
    type Goose = Goose of int
    type Cardinal = Cardinal of int
    type Mallard = Mallard of int
    type Bird = Goose of Goose | Cardinal of Cardinal | Mallard of Mallard

open Birds

If you specify Birds.Goose this gives you the Goose case of the type Goose, and you can't specify it as Birds.Goose.Goose i.e. using the format Module.Type.Case:

let gooseA = Goose.Goose 7 // Type.Case
let gooseB = Birds.Goose 7 // Module.Case
//let gooseC = Birds.Goose.Goose 7 // Module.Type.Case // <-- Doesn't work

Conversely, you must use that form if you wish to fully specify the Goose case of the Bird type i.e. you must specify it as Birds.Bird.Goose:

let gooseBird1 = Goose gooseA // Case
//let gooseBird2 = Birds.Goose gooseA // Module.Case // <-- Doesn't work
let gooseBird2 = Birds.Bird.Goose gooseA // Module.Type.Case

realvictorprm commented 5 years ago

Would this help this problem?

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose | Cardinal | Mallard
let x  = Goose 7

This code fails. Goose in the Bird DU shadows Goose as a type and turns it into an Atom. This shadowing happens silently and at least to me is surprising.

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose of Goose | Cardinal of Cardinal | Mallard of Mallard
let x  = Goose 7

The type shadowing here still means I can't move forward, because there's no way to make a Goose....

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose' of Goose | Cardinal' of Cardinal | Mallard' of Mallard
let x  = Goose' (Goose 7)

This works. This kind of situation happens where someone created a single case DU, and it gets consumed by someone who can't muck with the original DU for fear of breaking existing code.

This feature would be particularly useful for reusing existing cases without having to nest them.

abelbraaksma commented 5 years ago

@dsyme, although the discussion in this topic has been dormant for a while, I think this is still a very valuable addition to the language.

Perhaps we could move this forward and mark it approved in principle, so that we can start working it out into an RFC? I'd be willing to put in the effort for that.

cartermp commented 4 years ago

Some additional motivation: https://www.reddit.com/r/programming/comments/e8wfsr/discriminated_unions_in_c_an_unexceptional_love/fagb0hv/

chkn commented 4 years ago

such types are erased to object

This makes sense if there are value types in the set of types, but if there are reference types only, it makes more sense to me to erase to the most-derived common base class. For instance, given the following:

type A() = class end
type B() = inherit A()
type C() = inherit B()
type D() = inherit A()

I'd expect (B | C) to be exactly equivalent to a declaration of type B, and (B | C | D) to erase to A (but not include A). However, in this case, I'd also expect all the members of A to be available on this value without needing to cast or pattern match.

introducing such a union value would need to be either explicit (e.g. using some new operator like Typed) or type-directed or both

This Typed operator feels a little cumbersome to me. Out of curiosity, why not allow these new anonymous union types to be inferred directly where possible? To adapt the original example:

let generateValue0 () = if Monday then 2 else "4"

This function would have the inferred type unit -> (int | string)

This code would have been illegal before, so I don't think this would be a breaking change to the language. I could see the argument being made that this would make debugging harder if you had actually intended both branches of the if to return the same type, but you can always add a type annotation to enforce that

Going back to @voronoipotato's example, this would be amazing to be able to do:

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int

// a type abbreviation for an erased anonymous union
// (note the parentheses to disambiguate from creating a new nominal union type)
type Bird = (Goose | Cardinal | Mallard) 

// inferred to have type (Goose | Cardinal | Mallard) -> string
let talk = function
| Goose _ -> "honk"
| Cardinal _ -> "tweet"
| Mallard _ -> "quack"

Another note: in the CIL metadata, we might be able to enforce type safety and still expose something reasonable to other languages by emitting something more or less equivalent to this pseudo C#:

// Overloads for other .NET languages, ignored by F#
// We'd only need to emit these for public symbols
public static string talk (Goose arg) => talk ((object)arg);
public static string talk (Cardinal arg) => talk ((object)arg);
public static string talk (Mallard arg) => talk ((object)arg);

// This is the one F# would call
// I believe adding the `modreq` to the argument type would prevent this overload
//  from being used by other .NET languages
public static string talk (
    [ErasedUnion (new[] { typeof (Goose), typeof (Cardinal), typeof (Mallard) })]
    object modreq([FSharp.Core]ErasedUnion) arg)
{
    // actual implementation here
}

Swoorup commented 4 years ago

Was going to add a suggestion but found this thread.

Currently working with Akka and Akkling. It would be great if we are able to restrict the type of Object we consume using purely types.

Another use case is when working with actor frameworks: you would want two parent actors in a hierarchy to be able to handle messages of the same type, but still be able to handle other messages of non-intersecting types when the child actor sends them.

module Protocol = 
  type Person = ...
  type Animal = ...
  type AllMessages = (Person | Animal | string | int)

module Parent1 = 
  type Msg = (Person | string) 

module Parent2 = 
  type Msg = (Person | Animal)

module Child = 
  open Protocol
  type Msg = (string | int) 
  type ParentMsg = Person

  let child (mailbox: Actor<Msg, ParentMsg>) (msg: Msg) = 
    // can send a message of `Person` without being aware of the sender/parent possible types 
    // parent IActorRef could either be Parent1 or Parent2
    mailbox.Sender().Tell(new Person ())

As F#ers, we prefer strong typing and encoding as much domain information as possible into the type system. However, there is no escape if we need to use the massive ecosystem out there written in C#. For that, we could have a type-level _ (ignore) pattern. We could define a union such as

type Msg = (Person | string | _ )

Here, when we consume this type, we expect Person or string but could also be anything we don't expect or want to handle.

This is erased and also enforces pattern matching. It would give first-class support for F# working with most C# libraries that use such a pattern, with minimal changes.

type Msg = (Person | string | _ ) 

let handleMsg: Msg -> unit = function
  (*  Compiler forces you to match on Person and string
  and will warn if _ isn't handled explicitly  *)
  | Person { name = name } -> printfn "Name: %s" name 
  | str -> printfn "Received string: %s" str 
  | _ -> unhandled ()

let object: System.Object = somecsharpFuncReturnObject ()
object |> handleMsg // implicitly converted to Msg and all Msg types are subset of System.Object

Handling special Akka.NET messages would be as simple as defining the message type for the handling actor, without resorting to System.Object

type RootActorMsg = (Akka.Actor.ReceiveTimeout | string | _ )

abelbraaksma commented 4 years ago

I think this draft PR is an attempt at addressing this: https://github.com/dotnet/fsharp/pull/8927

Horusiath commented 4 years ago

@abelbraaksma I don't think this is really the feature covered here. Type unions are a powerful tool in the hands of a programmer thanks to the laws they offer (at least in languages implementing them like Scala 3, Ceylon or TypeScript):

They are associative → (A | B) | C is the same as A | (B | C)
They are commutative → A | B is the same as B | A
They are idempotent → A | A and A are the same.

If we extend that over the concept of subtyping:

If we have classes Animal and Cat :> Animal → Cat | Animal is the same as Animal.
Given that F# has a top type (which is not quite true, but let's say that obj could work in the actual implementation) → A | obj and obj are the same.
Given that F# would have a bottom type, i.e. a subtype of every other type (which is not true) → A | Nothing would always be A.

The PR which you linked doesn't quite fit any of these laws, and it doesn't even fit the title of this issue (as it's syntactic sugar over Choice, which holds a type tag and is not erased).
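To make the point about Choice concrete, here is a small sketch in current F# showing that the commutativity law does not hold for it. The function names `describe` and `swap` are invented for the example.

```fsharp
// Choice<string, int> and Choice<int, string> are different types,
// so "A | B is the same as B | A" does not hold for Choice.
let describe (c: Choice<string, int>) =
    match c with
    | Choice1Of2 s -> "string " + s
    | Choice2Of2 i -> sprintf "int %d" i

// An explicit conversion is needed to reorder the cases.
let swap (c: Choice<'a, 'b>) : Choice<'b, 'a> =
    match c with
    | Choice1Of2 a -> Choice2Of2 a
    | Choice2Of2 b -> Choice1Of2 b

let flipped : Choice<int, string> = Choice1Of2 7

// describe flipped   // rejected by the type checker: wrong case order
let text = describe (swap flipped)
```

An erased (int | string) value would carry no positional tag, so the order in which the alternatives are written could not matter in the same way.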

abelbraaksma commented 4 years ago

@Horusiath, thanks, that was very insightful. I wasn't aware of these approaches not overlapping, but now I know I should probably experiment a bit more with languages that do have this feature to understand it better.

jonathan-markland commented 4 years ago

let generateValue0 () = if Monday then 2 else "4"
This function would have the inferred type unit -> (int | string) This code would have been illegal before, so I don't think this would be a breaking change to the language.

For me, the above inference suggestion is a deal-breaker. It turns a common error with 'if' into a nest of knock-on issues as you try and work out why you're suddenly passing an "(A | B)" where a "B" is required.

I like to think of language feature design from two perspectives: The initial coding, and then the re-visitation under the stress of refactoring. For the latter, F# has excellent provision through the interplay of all sorts of little features, and is a large part of the reason why I'm here.

Typescript shares much with C++, namely, incrementally covering over an existing language that itself hadn't started in a great place. I'd be hesitant to import anything compromising from the dynamic/imperative space.

2c/ I could support adding a shorthand type-declaration syntax that de-sugars to a DU (like the example @dsyme posted earlier). Issues would be: To support generating the case constructor names, could such a DU only ever consist of named types? And what would the prefix for the case-constructors be?

Swoorup commented 4 years ago

let generateValue0 () = if Monday then 2 else "4"
This function would have the inferred type unit -> (int | string) This code would have been illegal before, so I don't think this would be a breaking change to the language.
For me, the above inference suggestion is a deal-breaker. It turns a common error with 'if' into a nest of knock-on issues as you try and work out why you're suddenly passing an "(A | B)" where a "B" is required.

You could potentially enforce specifying type signatures for union types. However, I feel the same could be said for the simple types that currently exist: you accidentally return a string instead of an int and you have issues in other places. You would ideally restrict the types you consume at a boundary point (perhaps at the module level or project level).

jonathan-markland commented 4 years ago

I take your point @Swoorup about returning simple types, however my concern was more about this compromising a useful feature of how "if" currently works, than a concern about the many times I've returned something odd accidentally (which mostly happens when refactoring).

It's difficult to accidentally yield the exact-same wrong type from both the if and the else branches, but easy to accidentally yield different ones.

Following your path, would we generalise this, and union the result types across all the arms of a match if they happened to differ? I certainly hope not! Maybe a new "union-match" keyword could allow it...

jonathan-markland commented 4 years ago

Since I can't sleep, and I'm up for a compromise, please consider:

For any given usage of "if" what is the probability of needing the inference feature proposed by @chkn ?
Would the F# committee find it acceptable to have something like a "union" keyword, to prefix "if" (and I think also "match") that would activate union inference behaviour? Otherwise, I propose the behaviour of "if" and "match" would remain as it currently is where type mismatches on the branches are reported directly at source (my preference).

My thoughts:

I am going to guess it's low, otherwise people would have been screaming for this already, but this doesn't mean we shouldn't have it either.
This would have the advantage that the programmer is, on a case-by-case basis, consenting to the new behaviour whilst also clearly indicating intent in the code.

It also means that when we only want a traditional "if", we won't see mistakes leading to unwanted union types, which sail off into the sunset and wash up as a type mismatch somewhere far away. Just because we live with this kind of thing happening for function return type inference, doesn't mean we should exacerbate that problem.

Furthermore, if we were to have a "union" keyword, I'd have it only in front of the "if", and let it distribute across all the "elif"s that there may be. Keep it super-simple. If the programmer wants anything more elaborate, he can nest "if"s or "union if"s.

As an aside, I also thought about this:

What is type "(unit | 'a)" if it isn't a competitor to the option type? And is this a problem?

cartermp commented 4 years ago

The question of what let res = if x then "hello" else 12 does is something that needs careful consideration. I don't think it's a big issue for refactoring, since you're likely going to have a type error somewhere else (unless you just use obj everywhere, in which case I say good luck to you!)

I do wonder if this is something that beginners will struggle with or not. Are they going to know what to do with res in this situation? I'd say there is certainly precedent in F# for writing code or refactoring things to get a different type and having to update how it is used elsewhere. So this doesn't really violate that.

deyanp commented 4 years ago

Is there going to be a real change in beginners or any developer handling of if/match really?

I guess one has the following options:
1) go back and change some of the branches to return the same type as the other branches - what we currently do
2) go back and change all branches to return a named DU (e.g. Result) - what we currently do
3) match on the anonymous DU implicitly returned - NEW

IMHO it is completely logical to add option 3), and then have developers (incl. beginners) choose one of them ...

Tarmil commented 4 years ago

To me the problem is that this has the potential to move type errors to a place where they're much harder to understand. Right now, if then and else return different types, the error is reported right there as "then and else should have the same type". If this now returns a value of type eg "string|int", then maybe this value is going to be passed around and used elsewhere, and you'll get an error like "expected string but got string|int" in a different location, and that can be much harder to investigate. Even as an expert I can see this making my job harder sometimes.

deyanp commented 4 years ago

I think this is a general F# problem whenever you do not explicitly set the type. In order to "localize" your error you must explicitly define the type of the value, e.g.:

let s:string = if true then 1 else "x"

will fix the error to the same line, whereas

let s = if true then 1 else "x"

will let it propagate elsewhere, but as I said, this is a general F# type inference vs explicit types issue ...

Swoorup commented 4 years ago

@cartermp @Tarmil @jonathan-markland

Worth noting that Scala 3 uses Any (any value objects) for multiple type returns instead of a union type. However, once we explicitly state the union type signature, it gives the union type as expected.

So I think perhaps we should only limit union types to expressions when type signature is specified. That could simplify implementation and be prone to less abuse as well.

voronoipotato commented 4 years ago

Any is a dangerous solution to this problem as it can propagate outward until everything is an "any". TypeScript often struggles with this. I do like the Int | String anonymous DU though.

cartermp commented 4 years ago

We would not introduce an Any type I think. An excellent description of an issue it can cause is given here: https://latkin.org/blog/2017/05/02/when-the-scala-compiler-doesnt-help/

Now recall the “helpful” compiler behavior mentioned in Example 1. Instead of yelling at me for putting a Symbol in the middle of my Char list, the compiler generalizes and assumes what I really wanted from the start was a List[Any], as Any is the first common ancestor of Char and Symbol.

And comparing a List[Char] with a List[Any] also raises no objections, because in theory it’s not impossible for it to work.

Not a value judgement, but this just isn't really how we do things in F# land.

Shmew commented 4 years ago

So I think perhaps we should only limit union types to expressions when type signature is specified. That could simplify implementation and be prone to less abuse as well.

I quite like this idea, the original example would then look like this:

// error - mismatched if branch
let generateValue0 () = if Monday then 2 else "4"

// valid
let generateValue0 () : Int | String = if Monday then 2 else "4"

No unexpected behavior, and is totally opt-in which solves @jonathan-markland's point.

There should probably be a way to explicitly return an erased union for cases where the type signature could be very large if this direction was taken though.

Swoorup commented 4 years ago

There should probably be a way to explicitly return an erased union for cases where the type signature could be very large if this direction was taken though.

Type aliasing

type LargeUnionComingThrough = (int|string|Decimal....)
let generateValue0 () : LargeUnionComingThrough = if Monday then 2 elif Tuesday then "4" ...

?

Would be nice to be able to alias anywhere (inside functions too, not just modules/namespaces)

chkn commented 4 years ago

I tend to agree with @deyanp's comment. This is basically a general issue with type inference. Type annotations can always be added when it's important that an expression have a particular type. Additionally, I often rely on tooling to inspect the inferred types of various expressions. In this case, the proposed anonymous union type is safer and more expressive than the Scala auto-generalization mentioned in the blog post.

So I think perhaps we should only limit union types to expressions when type signature is specified.

I think this is probably about the same difference as adding a Typed or similar operator--additional boilerplate. Ultimately, we need to weigh all these factors, but personally I think that adding syntactic overhead diminishes the value of this feature.

jonathan-markland commented 4 years ago

New syntax proposal II, as my attempt at a weigh-up thus far.

Only trigger type-union behaviour with new keywords: "then union", as shown below.
This would fit into the grammar where "then" currently lives.

// Inferred type is "(int | string)":
let value = if Monday then union 2 else "4"

// Inferred type is "(int | string | float)":
let value = if Monday then union 2 elif Wednesday then "4" else 3.14   

// Inferred type for this could even be "(int | unit)" - if anyone wants this, notwithstanding what I said about this being similar to an option type.
let value = if Monday then union 2    

// constrained type example:
let value : (int | string) = if Monday then union 2 else "4"   

// type alias example:
type MyIntAndString = (int | string)
let value : MyIntAndString = if Monday then union 2 else "4"

Pros

Permit a type-inference solution because to do otherwise just isn't F#
You can still constrain the type if you want, but you don't have to.
You can still alias the type if you want, but you don't have to.
We infer a precise type always, there is no "any" type needed.
As beautiful a syntax as I can get(!), with only a tiny amount of syntactic overhead, for a case that I don't (yet) believe will be a common need.
Retain existing semantics and checks for existing syntax.
Allows a great new feature that people want, done in a way I hope won't surprise old-timers or newcomers when they see it for the first time.

Cons

One extra word to type when you want this feature.
What else?

... and then we go on to do something similar for "match" ?
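
For comparison, here is a minimal sketch of how the same value can be written in today's F#, assuming the proposed union is erased to object as the issue description suggests. The names generateValue and describe are hypothetical, and obj stands in for the proposed (int | string); the explicit box calls and the wildcard case are exactly what "then union" would aim to remove.

```fsharp
// Today's F#: the union is spelled `obj` and each branch is boxed by hand.
// `generateValue` and `describe` are illustrative names, not part of any proposal.
let generateValue (monday: bool) : obj =
    if monday then box 2 else box "4"

let describe (x: obj) : string =
    match x with
    | :? int as i -> sprintf "int %d" i
    | :? string as s -> sprintf "string %s" s
    | _ -> "unexpected" // needed today; a checked (int | string) would make this case redundant

printfn "%s" (describe (generateValue true))  // prints "int 2"
printfn "%s" (describe (generateValue false)) // prints "string 4"
```

The compiler cannot check that only int and string ever flow into describe here, which is the gap a precise inferred union type would close.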

realvictorprm commented 4 years ago

Maybe another solution, in between explicitly stating types and relying on type inference, would be to introduce union as a type that can be used with _ to state that the cases of the union should be inferred.

So e.g.:

let value: union _ = if Monday then 2 else "4"

which would be inferred as (int | string)
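
There is some precedent for this shape of annotation: F# already accepts _ inside a type annotation and infers the missing part. A small sketch in current F# (the value names numbers and lookup are just for illustration; the union keyword itself remains hypothetical):

```fsharp
// Existing F#: `_` asks the compiler to infer part of an annotation.
let numbers : list<_> = [ 1; 2; 3 ]                    // inferred as list<int>
let lookup : Map<string, _> = Map.ofList [ "a", 1.5 ]  // inferred as Map<string, float>

printfn "%d" (List.sum numbers) // prints 6
printfn "%b" (lookup.ContainsKey "a")
```

A union _ annotation would extend the same idea: the annotation opts in to the feature, while the cases are still inferred.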

jonathan-markland commented 4 years ago

@Swoorup - Your syntax is OK by me, but I'm not in favour of requiring anyone to manually write a type annotation in order to activate a feature. I don't think anything else in F# works like that.

Tarmil commented 4 years ago

@Swoorup - Your syntax is OK by me, but I'm not in favour of requiring anyone to manually write a type annotation in order to activate a feature. I don't think anything else in F# works like that.

Method calls work kind of like that. The type of the receiver must be known.

// This doesn't work:
let f x = x.Trim()

// but this does:
let f (x: string) = x.Trim()

jonathan-markland commented 4 years ago

Hah! I rarely use method calls, but you're right - I forgot that.

It would be nice if we could get this feature specified without requiring a manual type annotation or aliasing, and without losing the "all types are the same" check that "if" and "match" currently do. Pushing those if/match error reports elsewhere is unacceptable to me: it would affect everyone, while the justification for this feature seems to be an occasional need in statically typed programming, plus usage by those who happen to be interfacing with dynamic languages.

kerams commented 4 years ago

@jonathan-markland

One extra word to type when you want this feature.

What if there are nested ifs or matches? Would only the innermost ones require then union? I favor the required type annotation because it isn't much more verbose than a new keyword, even though your approach is conceptually more in line with how anonymous records work.

in a way I hope won't surprise old-timers or newcomers when they see it for the first time.

I'll wager everyone not familiar with the feature would interpret then union 2 as a function call (I mean just look at the markdown snippet you posted where union is not colored). Plus I wouldn't be surprised if there's existing code exactly like this out there somewhere, and it would break since this addition is not backwards compatible.
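
To make the compatibility concern concrete, here is a sketch of code that is valid today and would change meaning if union became a keyword after then. The function named union is a hypothetical user definition:

```fsharp
// Valid in current F#: `union` is an ordinary user-defined function here.
let union x = x + 1

// Today this parses as a call to `union` with the argument 2.
let value = if true then union 2 else 4

printfn "%d" value // prints 3
```

Under the proposed syntax the same line would presumably be read as a union-typed expression, or be rejected, rather than as a function application.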

voronoipotato commented 4 years ago

The error could suggest explicitly declaring the type, which would resolve that.

Swoorup commented 4 years ago

I tend to agree with @deyanp's comment. This is basically a general issue with type inference. Type annotations can always be added when it's important that an expression have a particular type. Additionally, I often rely on tooling to inspect the inferred types of various expressions. In this case, the proposed anonymous union type is safer and more expressive than the Scala auto-generalization mentioned in the blog post.

So I think perhaps we should only allow union types in expressions where a type signature is specified.

I think this is probably about the same difference as adding a Typed or similar operator--additional boilerplate. Ultimately, we need to weigh all these factors, but personally I think that adding syntactic overhead diminishes the value of this feature.

If the feature behaves just like Choice, I don't think it is even useful. I am more interested in @Horusiath's comment on the properties these types bring, which outweigh the syntactic restriction imho. To quote:

They are associative → (A | B) | C is the same as A | (B | C)

They are commutative → A | B is the same as B | A

They are idempotent → A | A and A are the same.
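
A quick way to see why these properties matter is to check them against Choice in current F#, which is positional and satisfies none of them. The second half models a union as a set of type names, which is only an illustrative assumption about how an erased union could be normalised, not a description of any implementation:

```fsharp
// Choice is positional: reordering, regrouping or repeating cases gives a different type.
let choiceAB = typeof<Choice<int, string>>
let choiceBA = typeof<Choice<string, int>>
let nestedLeft = typeof<Choice<Choice<int, string>, float>>
let nestedRight = typeof<Choice<int, Choice<string, float>>>
let choiceAA = typeof<Choice<int, int>>

printfn "%b" (choiceAB = choiceBA)      // prints false: not commutative
printfn "%b" (nestedLeft = nestedRight) // prints false: not associative
printfn "%b" (choiceAA = typeof<int>)   // prints false: not idempotent

// Illustrative model only: treat a union as the set of its case names.
let tInt = set [ "int" ]
let tString = set [ "string" ]
let tFloat = set [ "float" ]

let assoc =
    Set.union (Set.union tInt tString) tFloat = Set.union tInt (Set.union tString tFloat)
let comm = Set.union tInt tString = Set.union tString tInt
let idem = Set.union tInt tInt = tInt

printfn "%b %b %b" assoc comm idem // prints true true true
```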

Tbh introducing a new keyword to lessen the burden of resolving compiler type issues seems more like a workaround. Perhaps if the error messages were useful this wouldn't be necessary?
