dotnet / csharplang

The official repo for the design of the C# programming language
10.95k stars 999 forks source link

Champion "Discriminated Unions" #113

Open gafter opened 7 years ago

gafter commented 7 years ago

See

Design meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-08-31.md#discriminated-unions https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions

orthoxerox commented 4 years ago

A bucket list of my feature requests for DUs:

  1. exhaustiveness checking
  2. deconstruction via property or deconstructor pattern
  3. common fields/properties
  4. support for both class-based and struct-based DUs
  5. multi-level DUs (with flattened representation)
  6. non-generic options inside generic DUs
jnm2 commented 4 years ago

@ielcoro I really want that too. That's not a tagged union though. See https://github.com/dotnet/csharplang/issues/399.

hez2010 commented 4 years ago

It's really useful for WebApi, you no longer need message and code properties for all your models but simply:

public (ErrorResultModel | MyResultModel) GetXXX()
{
    if (......) return new ErrorResultModel { code = 401, message = "unauth" };
    ....
    return new MyResultModel { result = "xxxx" };
} 

And also if you want to append properties in the other model to your model, you don't have to create a new class or return an anonymous object to combine them, but simply:

public (MyModel1 & MyModel2) GetXXXBoth()
{
    return new (MyModel1 & MyModel2) { ...... };
}

Furthermore, typescript has discriminated unions, so it will make it easier to proceed data between front-end and back-end.

HaloFour commented 4 years ago

@SamPruden @hez2010

I believe that you are both referring to the functionality as described in this proposal: #399

Discriminated unions are a bit different. They represent a closed hierarchy of types, but they are their own type hierarchy and you'd refer to them by that base interface or abstract class.

SamPruden commented 4 years ago

I believe that you are both referring to the functionality as described in this proposal: #399

Discriminated unions are a bit different. They represent a closed hierarchy of types, but they are their own type hierarchy and you'd refer to them by that base interface or abstract class.

You're right, sorry. I had both tabs open and am apparently not a very careful github submitter. I'll move it over. Sorry for the clutter!

gafter commented 4 years ago

Prioritization of runtime support for efficient switch on a discriminated union is being tracked internally at https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1009897 and the work is being tracked at https://github.com/dotnet/coreclr/issues/23241

KorneiDontsov commented 4 years ago

Hello! I already use discriminated unions in my work with this pattern:

abstract class Shape
{
    private Shape() { }

    public sealed class Square: Shape
    {
        public decimal A { get; }

        public Square(decimal a) => A = a;
    }

    public sealed class Rectangle: Shape
    {
        public decimal A { get; }

        public decimal B { get; }

        public Rectangle(decimal a, decimal b)
        {
            A = a;
            B = b;
        }
    }
}

Shape can be inherited only by nested types because it has only private constructor. Each derived type is sealed so Shape has no chance to have more derived types then Square and Rectangle.

Shape can be switched in c# 8 this way:

decimal GetArea(Shape shape) =>
    shape switch
    {
        Shape.Square square => square.A * square.A,
        Shape.Rectangle rect => rect.A * rect.B
    };

If programmer think that Shape.Square is less elegant than just Square it can write using static Shape; to achieve this.

I propose to implement just a syntax sugar that hides this pattern from the programmers. It might look like this:

enum class Shape
{
    case class Square
    {
        public decimal A { get; }

        public Square(decimal a) => A = a;
    }

    case class Rectangle
    {
        public decimal A { get; }

        public decimal B { get; }

        public Rectangle(decimal a, decimal b)
        {
            A = a;
            B = b;
        }
    }
}

It's compatible with primary constructors:

enum class Shape
{
    case class Square(decimal A)
    case class Rectangle(decimal A, decimal B)
}

No extra effort needed to implement nested discriminated unions (I give an example from my work):

enum class GameMission
{
    // Score of quest is calculated based on time at which player took the quest.
    case enum class Quest
    {
        case DealDamage(ulong DamageCount)
        case KillEnemies(ulong KillCount)
    }

    // Score of achievement is calculated based on entire player statistics.
    case enum class Achievement
    {
        case DealDamage(ulong DamageCount)
        case KillEnemies(ulong KillCount)
    }

    // But both quests and achievements have a common logic of rewards, extra rewards;
    // both are presented on the mission screen.
}
gdar91 commented 4 years ago

@KorneiDontsov I also use this approach at my work. I'd say it's a poor man's DU. The thing is, even if you put all the possible subclasses in your switch cases, you still get a warning, saying it's not exhaustive. The compiler still doesn't know that the hierarchy is closed.

KorneiDontsov commented 4 years ago

@gdar91 I agreed. But my point is to get the pattern as a basis and extend it. Compiler should understand that the hierarchy is closed where enum class instance is switched. The point is that enum class extends class - it saves all class functionality (members, generic arguments, virtual functions) and adds a functionality to declare cases. For example:

enum class Shape
{
    public abstract decimal Area { get; }

    case class Square(decimal A)
    {
        public override decimal Area => A * A;
    }

    case class Rectangle(decimal A, decimal B)
    {
        public override decimal Area => A * B;
    }
}

I would not like to get DU in C# as something completely new and "pure functional" because in my opinion it will be "poor man's DU" too.

dadhi commented 4 years ago

The compiler still doesn't know that the hierarchy is closed.

It is not hard to write Roslyn analyzer to warn on this and help to insert the missing cases.

gafter commented 4 years ago

It is not hard to write Roslyn analyzer to warn on this

Oh really? What strategy would you use, and how would it handle the following:

class Boolean
{
    private Boolean() {}
    public sealed class True : Boolean { }
    public sealed class False : Boolean { }
    static int M( (Boolean, Boolean) tuple )
    {
        return tuple switch
        {
            (True, False) => 1,
            (False, True) => 2,
            (True, True) => 3,
            // warning, missing (False, False)
        };
    }
}

And given the following program, what would the analyzer do to prevent the compiler from warning that the switch is not complete?

class Boolean
{
    private Boolean() {}
    public sealed class True : Boolean { }
    public sealed class False : Boolean { }
    static int M( (Boolean, Boolean) tuple )
    {
        return tuple switch
        {
            (True, False) => 1,
            (False, True) => 2,
            (True, True) => 3,
            (False, False) => 4,
        };
    }
}

@KorneiDontsov We are planning to have discriminated unions be an extension of the functionality that are provided by classes, not a new and completely different thing. However, there may be a shorthand syntax for declaring them when you need almost none of the general declarations you could put in a class.

KorneiDontsov commented 4 years ago

@gafter, thanks for your feedback. Please, also don't miss a functionality to inherit DU from another class, especially System.Exception.

333fred commented 4 years ago

@KorneiDontsov we almost certainly cannot change System.Exception to be a DU. That would be a massive breaking change for no benefit, as it cannot be a closed type hierarchy anyway.

alrz commented 4 years ago

@333fred It could be for a derived DU base:

enum class E : Exception { A, B }
E e = ...;
e switch { A => ..., B => ...} // exhaustive

That is not limited to Exception (or Attribute, for that matter), for instance, we have two closed type hierarchies derived from SyntaxNode for each compiler.

333fred commented 4 years ago

Ah. Yeah, enum classes will be able to inherit from Exception. I'm not sure what they really buys you, but you can do it. It would likely be better to just have a Result type instead of using exceptions, but you should be able to do that.

JesOb commented 4 years ago

Most of the talk there is bout ReferenceType union -> enum class

But what about value type union -> enum struct?

And even unmanaged enum struct that can replace ugly creating unions through [StructLayout.Explicit] and make it type safe.

forbjok commented 4 years ago

I think that feature can be added fairly simply using a custom type, in the same way Tuple<T0, ..., TN> was added.

I maintain a library, OneOf, which adds a OneOf<T0, ..., TN> type which has .Match<TOut>(Func<T0, ..., TOut>, ..., Func<T0, ..., TOut> methods. By using implicit operators to create the OneOf instances from values, the syntax is very terse and comprehensible. Also, the allocations are low because it's a struct and doesn't create intermediate 'builder' objects, unlike some other solutions.

The OneOf<T0, .., TN> type also provides .Switch and .TryGetTX(out TX value, out TRemainder remainer) methods.

Example of using a OneOf as a return value:

public OneOf<User, InvalidName, NameTaken> CreateUser(string username)
{
    if (!IsValid(username)) return new InvalidName();
    var user = _repo.FindByUsername(username);
    if(user != null) return new NameTaken();
    var user = new User(username);
    _repo.Save(user);
    return user;
}

example of Matching

    OneOf<string, ColorName, Color> backgroundColor = "Red"; //uses implicit casts to reduce overhead
    Color c = backgroundColor.Match(
        str => CssHelper.GetColorFromString(str),
        name => new Color(name),
        col => col
   );

As new types are added to the OneOf definition, compiler errors are generated wherever the union is Matched or Switched, as the methods are required to have the correct number of lambda parameters.

This can be included in the BCL without language changes, although I'm sure some syntactical sugar could be sprinkled.

this proposal was originally made at dotnet/roslyn#14208 and at #1524 . Sorry!

I tried a similar approach before when manually implementing a Result type, and the issue I found with it is that using lambda functions for the match arms makes it impossible to affect the flow of the outer function from within them. (ex. return, break out of a loop, etc)

Personally, I found that that made them borderline useless, as that's almost always what I'd want to do when matching on an error, so I ended up just having to use an if test instead.

Maybe existing CLR features could be used to implement it under the hood, but it would at the very least need some sort of syntactic sugar to allow using control flow statements inside the match arms.

mcintyre321 commented 4 years ago

OneOf has a public bool TryPickT0(out T0 value, out OneOf<T1, T2> remainder )

That might work better for you than the lambda match expression

On Mon, 3 Feb 2020, 08:38 forbjok, notifications@github.com wrote:

I think that feature can be added fairly simply using a custom type, in the same way Tuple<T0, ..., TN> was added.

I maintain a library, OneOf https://github.com/mcintyre321/OneOf, which adds a OneOf<T0, ..., TN> type which has .Match(Func<T0, ..., TOut>, ..., Func<T0, ..., TOut> methods. By using implicit operators to create the OneOf instances from values, the syntax is very terse and comprehensible. Also, the allocations are low because it's a struct and doesn't create intermediate 'builder' objects, unlike some other solutions.

The OneOf<T0, .., TN> type also provides .Switch and .TryGetTX(out TX value, out TRemainder remainer) methods.

Example of using a OneOf as a return value:

public OneOf<User, InvalidName, NameTaken> CreateUser(string username) { if (!IsValid(username)) return new InvalidName(); var user = _repo.FindByUsername(username); if(user != null) return new NameTaken(); var user = new User(username); _repo.Save(user); return user; }

example of Matching

OneOf<string, ColorName, Color> backgroundColor = "Red"; //uses implicit casts to reduce overhead
Color c = backgroundColor.Match(
    str => CssHelper.GetColorFromString(str),
    name => new Color(name),
    col => col

);

As new types are added to the OneOf definition, compiler errors are generated wherever the union is Matched or Switched, as the methods are required to have the correct number of lambda parameters.

This can be included in the BCL without language changes, although I'm sure some syntactical sugar could be sprinkled.

this proposal was originally made at dotnet/roslyn#14208 https://github.com/dotnet/roslyn/issues/14208 and at #1524 https://github.com/dotnet/csharplang/issues/1524 . Sorry!

I tried a similar approach before when manually implementing a Result type, and the issue I found with it is that using lambda functions for the match arms makes it impossible to affect the flow of the outer function from within them. (ex. return, break out of a loop, etc)

Personally, I found that that made them borderline useless, as that's almost always what I'd want to do when matching on an error, so I ended up just having to use an if test instead.

Maybe existing CLR features could be used to implement it under the hood, but it would at the very least need some sort of syntactic sugar to allow using control flow statements inside the match arms.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dotnet/csharplang/issues/113?email_source=notifications&email_token=AACDJ6WJBOR7S76EO5RWUODRA7JZHA5CNFSM4DAE4VO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKS6OVI#issuecomment-581298005, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACDJ6QERKVB7CL4UUSP7FTRA7JZHANCNFSM4DAE4VOQ .

forbjok commented 4 years ago

OneOf has a public bool TryPickT0(out T0 value, out OneOf<T1, T2> remainder ) That might work better for you than the lambda match expression

On an Option type I implemented, I have a public bool HasSome(out T value), which I'm guessing is basically the same as you're suggesting.

It does work, but it's far from what I'd consider ideal, and in any cases where there's more than 2 possible options, it still doesn't give you any way to prevent implicit non-exhaustive matches (a huge source of potential unhandled errors), or a clean elegant syntax (ex. Rust's match statement) for doing the matches.

AartBluestoke commented 4 years ago

@forbjok i think that if possible, we should have a simple syntax for the reasonably common option where the Union is a union of distinct types. This should be a compile error if the answer is ambiguous. Perhaps there should be something like 'Flags' as an option, to indicate that these values have to be distinct types?

shoooe commented 4 years ago

@forbjok To be fair TryPickTN provides a remainder which gets shorter as you try to pick data out of a OneOf therefore possibly solving the "non-exhaustive matches" problem. It's still awful though.

shoooe commented 4 years ago

In any case OneOf is not what is championed here. OneOf is closer to an std::variant in C++ (it's tagged on the type) than to a discriminated union in other languages like Haskell and F#.

mcintyre321 commented 4 years ago

It's still awful though

You brute! 😁

shoooe commented 4 years ago

@mcintyre321 I don't mean the library. The library is probably the best that can be achieved in C# at the moment. But it's still an awful experience. That's why I support language level support for tagged unions.

dadhi commented 4 years ago

Regarding the library, I've released recently the V2 of ImTools with the struct-based DU types which allow both unnamed and named (tagged) unions and the cases. Here is the usage example: https://github.com/dadhi/ImTools/blob/master/playground/ImTools.UnionPlayground/Program.cs

           // Unnamed (anonymous) union is fast to declare and use
            var i = U<int, string>.Of(42);
            var s = U<int, string>.Of("hey");
    // One-liner named union definition
    public sealed class BoolOrString : Union<BoolOrString, bool, string> { }

    // A different type from the NamedBoolOrString
    public sealed class OtherBoolOrString : Union<OtherBoolOrString, bool, string> { }

    // Typed union with the typed cases - now you can pattern match via `I<Flag>` and `I<Name>`
    public sealed class FlagOrName : Union<FlagOrName, Flag.item, Name.item> { }
    public sealed class Flag : Item<Flag, bool>   { }
    public sealed class Name : Item<Name, string> { }
            // Refactoring friendly Named cases, with some performance price due the boxing - likely is not important for your case, 
            // except you are designing a performance oriented data structure or being used in performance sensitive spot context.
            // The performance price may be gained back any time by switching to CaseN struct matching.
            switch (x)
            {
                case I<Flag.item> b: return "" + b.Value.Item;
                case I<Name.item> s: return "" + s.Value.Item;
                default: throw new NotSupportedException();
            }

In addition I did a benchmark on F# territory. Get it with the pitch of salt but optimized (though very verbose) variant of pattern matching is only slightly slower than F#ish one: https://github.com/dadhi/ImTools/blob/master/playground/ImTools.UnionCsVsFsBenchmarks/Program.fs#L17


BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18362
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100
  [Host]     : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT DEBUG
  DefaultJob : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT
|                                     Method |      Mean |    Error |   StdDev | Ratio | RatioSD |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------------------------------------- |----------:|---------:|---------:|------:|--------:|-------:|------:|------:|----------:|
|                                     FSharp |  12.49 ns | 0.031 ns | 0.029 ns |  1.00 |    0.00 |      - |     - |     - |         - |
|                   CSharp_named_case_struct |  15.43 ns | 0.085 ns | 0.080 ns |  1.24 |    0.01 |      - |     - |     - |         - |
|         CSharp_named_case_struct_tag_match |  15.00 ns | 0.162 ns | 0.152 ns |  1.20 |    0.01 |      - |     - |     - |         - |
| CSharp_named_case_struct_match_I_interface | 248.44 ns | 4.045 ns | 3.784 ns | 19.89 |    0.29 |      - |     - |     - |         - |
|      CSharp_named_case_struct_Match_method | 181.99 ns | 0.563 ns | 0.527 ns | 14.57 |    0.06 | 0.1733 |     - |     - |     816 B |
333fred commented 4 years ago

I would ask that further discussion about separate libraries be had on the library repo itself, please, and not in this issue.

dzmitry-lahoda commented 4 years ago

@gafter , could it help if we make Boolean class abstract?

And given the following program, what would the analyzer do to prevent the compiler from warning that the switch is not complete?

class Boolean
{
    private Boolean() {}
    public sealed class True : Boolean { }
    public sealed class False : Boolean { }
    static int M( (Boolean, Boolean) tuple )
    {
        return tuple switch
        {
            (True, False) => 1,
            (False, True) => 2,
            (True, True) => 3,
            (False, False) => 4,
        };
    }
}
gafter commented 4 years ago

You were probably aware that Java is adding pattern-matching (in preview) using their instanceof which is the Java equivalent to is, and an expression form of switch. See http://openjdk.java.net/jeps/305 and https://openjdk.java.net/jeps/361.

Java is also adding records in Java 14 (also in preview). See https://openjdk.java.net/jeps/359. This appears to be largely based on our positional record proposal of a few years ago, with some restrictions:

These restrictions certainly simplify things – defining Equals is not so hard, for example. But in my opinion they undermine many important use cases. In Java's proposal records have special language meaning in clients (e.g. for pattern-matching) and using reflection you can ask if a type is a record and what its components are.

Java is also experimenting with discriminated unions. See https://openjdk.java.net/jeps/360 and http://cr.openjdk.java.net/~briangoetz/amber/datum.html. The idea is that you can declare an interface or class that lists all the permitted derived types. In practice, it appears that a DU whose members are records must itself be an interface, not a class, as a record may not inherit another class. That does make sense as Java has always been more interface-based.

HaloFour commented 4 years ago

Might be worth noting that Java 14 drops on March 17, 2020, with these preview features. The expression form of switch I believe is coming out of preview in this release, although the pattern matching syntax specifically would remain in preview.

The only reason I bring up the timeframes is that it is very interesting to watch both C# and Java designing very similar features at the same time and while both languages have some different philosophies it may be worth noting what did or didn't work per their notes when considering similar situations with C#.

Pyrdacor commented 4 years ago

Why not just use

(int | string) x = 5;
x = "Foo";

This would be also nice in type constraints and matching.

class MyClass<T> where T : int | string | (int | string)
... 
MyClass<int> myIntClass;
MyClass<string> myStringClass;
MyClass<(int | string)> myUnionClass;
svick commented 4 years ago

@Pyrdacor There's a difference between discriminated unions (this proposal) and union types (what you suggested and also already proposed at https://github.com/dotnet/csharplang/issues/399) and both have different use cases.

Pyrdacor commented 4 years ago

Can you explain the difference or give examples for their use cases cause I can't find a real proposal with examples here.

Edit: OK I got it now. You could use Lambda expressions as "type values". But isn't this just syntactic sugar for:

class MyUnion
{
     private (int | string) _value;
     public (int | string) Value => _value switch
     {
         int x => x * x,
         string s => s
     };
}

And this does only need the invention of union types and no new constructs.

TheJayMann commented 4 years ago

The main difference between union types and discriminated unions is that union types allow a "type expression" to have one of any number of different types, while discriminated unions are a type consisting of a choice of labels, each label possibly having data associated with it.

As an example, with union types, a variable named nameOrAge could have a type of either string or int with a type expression of (string|int). Discriminated unions could create similar, I'll use the "case class" example I've seen above.

enum class NameOrAge {
    case class Name(string name)
    case class Age(int)
}

In this example, the only difference is that union types and discriminated unions is the discriminated unions create a new type, but also add labels to their choice. However, discriminated unions don't have to assign a type to a label, and the types don't have to be unique.

enum class CommentAuthor {
    case class Username(string username)
    case class Name(string name)
    case class Anonymous
}

The above construct cannot really be represented by union types without adding extra boilerplate which would result in near identical code which a discriminated union would produce. This is the advantage that discriminated unions have over union types, while union types have the advantage over discriminated unions by not having to create additional types.

Pyrdacor commented 4 years ago

Thanks for clarification. But how would I use such construct and what for? I know the Java enum constructs but never found them useful at all. Is it similar?

orthoxerox commented 4 years ago

@Pyrdacor they are useful if your programming style is more functional. There are also cases when two assemblies define their own operations on the same type hierarchy, so you cannot use virtual methods and have to switch on a type in at least one of them. An explicitly defined DU lets the compiler verify that you have handled every case in a switch.

Pyrdacor commented 4 years ago

Thanks. Can you post a small example how such a switch would look like and how I would create an object of the enum class type please?

Richiban commented 4 years ago

@Pyrdacor Ideally it would all look something like this (remember that the syntax hasn't been settled on, but this is how it should work):

enum class Shape
{
    case Circle(int radius),
    case Rectangle(int height, int width),
    case Line(int length),
    case Point
}

public void ConsumerMethod(Shape shape)
{
    Console.WriteLine(
        shape switch
        {
            Circle(var r) => "A circle ({r})",
            Rectangle(var h, var w) => "A rectangle ({w} x {h})",
            Line(var l) => "A line ({l})",
            Point => "A point",
        });
}

public void ProducerMethod()
{
    ConsumerMethod(new Shape.Circle(radius: 5));
    ConsumerMethod(new Shape.Rectangle(height: 4, width: 5));
    ConsumerMethod(new Shape.Line(length: 10));
    ConsumerMethod(Shape.Point);
}
Pyrdacor commented 4 years ago

Hm ok but I can't see a difference between this and union types. You only need compound types like in Typescript. This would even have more flexibility as you can create multiple types with the same subtypes.

class Shape = Circle | Rectangle | Line | Point;
class AreaShape = Circle | Rectangle;

The switch statement then only needs the type pattern for matching.

public void ConsumerMethod(Shape shape)
{
    Console.WriteLine(
        shape switch
        {
            Circle c => "A circle ({c.radius})",
            Rectangle r => "A rectangle ({r.width} x {r.height})",
            Line l => "A line ({l.length})",
            Point => "A point",
        }
    );
}

public void ProducerMethod()
{
    ConsumerMethod(new Circle(radius: 5));
    ConsumerMethod(new Rectangle(height: 4, width: 5));
    ConsumerMethod(new Line(length: 10));
    ConsumerMethod(new Point());
}
mcintyre321 commented 4 years ago

The compiler could generate one-off named types (or Records), and use them with a OneOf style Union type for tagged unions

On Mon, 24 Feb 2020, 18:56 Pyrdacor, notifications@github.com wrote:

Hm ok but I can't see a difference between this and union types. You only need compound types like in Typescript. This would even have more flexibility as you can create multiple types with the same subtypes.

type Shape = Circle | Rectangle | Line | Point;

The switch statement then only needs the type pattern for matching.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dotnet/csharplang/issues/113?email_source=notifications&email_token=AACDJ6TOD2JARK5TAFFNO63REQJ4TA5CNFSM4DAE4VO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEMZDC2I#issuecomment-590492009, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACDJ6RMMXJEUB3WWONPJ6TREQJ4TANCNFSM4DAE4VOQ .

gulshan commented 4 years ago

I think with Union types and named tuples with allowed single value, "Discriminated Unions" can become a coding style, rather than being a language feature. This is the case in Typescript- no explicit Discriminated Unions, use tags with Union types.

Pyrdacor commented 4 years ago

I think with Union types and named tuples with allowed single value, "Discriminated Unions" can become a coding style, rather than being a language feature. This is the case in Typescript- no explicit Discriminated Unions, use tags with Union types.

Totally agree. Union types are very useful and are more flexible than discriminated unions. And inventing a new language feature seems totally unnecessary imho. You can do all this with union types, type matching and tuples. I think learning from Typescript here is better than learning from F#. And inventing the wheel again with some new case class syntax feels somehow bad.

ielcoro commented 4 years ago

I couldn't agree more, Typescript has nailed it with union types, and how it then solves discriminated unions. C# needes to follow typescript steps in this way, even the syntax used by union types in Typescript is very much C# like.


On Tue, Feb 25, 2020 at 6:52 AM Pyrdacor notifications@github.com wrote:

I think with Union types and named tuples with allowed single value, "Discriminated Unions" can become a coding style, rather than being a language feature. This is the case in Typescript- no explicit Discriminated Unions, use tags with Union types.

Totally agree. Union types are very useful and are more flexible than discriminated unions. And inventing a new language feature seems totally unnecessary imho. You can do all this with union types, type matching and tuples. I think learning from Typescript here is better than learning from F#. And inventing the wheel again with some new case class syntax feels somehow bad.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dotnet/csharplang/issues/113?email_source=notifications&email_token=AAEPMZJ4IO6GZPMD6C76WNTRESWY5A5CNFSM4DAE4VO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEM2VBWI#issuecomment-590696665, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEPMZIABJTZKH2BY6KBQJTRESWY5ANCNFSM4DAE4VOQ .

theunrepentantgeek commented 4 years ago

There's one very significant flaw, however.

TypeScript's union types work because it transpiles to JavaScript, allowing any of the types to be passed at runtime.

The CLR underlying C# couldn't be more different - to properly support union types would require significant (i.e. extensive, expansive and expensive) changes to the CLR in addition to the language and compiler changes.

In contrast, discriminated unions can be implemented with the existing runtime, requiring only language and compiler changes.

Plus, I don't believe that union types would actually get you very far towards solving the problems that discriminated unions solve. They get perhaps a third of the way there in a way that looks cosmetically promising, but then falls apart in very type-unsafe ways. I suspect TypeScript gets away with this, again, because it uses JavaScript for its runtime.

Pyrdacor commented 4 years ago

But isn't it better to put some effort into this than inventing strange looking syntax just to ease implementation? In the end millions of developers have to work with it. I'm no fan of doing things in a different way just because the good way is difficult or expensive. Imho this has led to many bad decisions in other languages and software. Recently I had to work with Java and oh boy is it bad in some situations. You can feel that they made bad decisions cause they wanted to force some cool feature onto their existing philosophy and it didn't fit very well. If we want to add stuff to C# we should at least look how other languages have done it. I know typescript works very different but on the other hand working with it feels very similar to C# somehow.

I use C# and typescript and especially union types are a thing that I miss in C#. Maybe we should think about ways to implement union types in a way that runtime changes are as minor as possible. Or the discriminated union syntax should at least be designed in a way that it is compatible with union types later.

It's like the discard operator in the new switch syntax. It feels so damn bad not using the established and well-known default keyword. I don't know the reasons why to use the underscore there but even if there were advantages for implementation this looks like a really bad design decision. The default keyword would be so intuitive and was already there for old style switch. The focus should be the work flow of developers and not the implementation of the runtime etc as this is only done once.

gulshan commented 4 years ago

Maybe I'm wrong, but I think Union types can be implemented without runtime changes. Union types can be "most common base type" (which in many cases will be just object) to the runtime, while compiler makes sure the type constraint is met in code. According to what I have proposed in #399 Union/Intersection types, there will be no actual Intersection or union of type members to constitute union or intersection types. They will be based on type hierarchy / inheritance tree.

Pyrdacor commented 4 years ago

I think reflection wouldn't work well with this approach. I guess the discriminated unions would be implemented by just fallback to the real subtype which is just a class so there is no need for new types like UnionType. Typescript has no real type system. It works like your suggestion. Only the compiler checks correct usage. But I think this is not enough for a strongly typed language. In Typescript you can get in trouble sometimes as the type you got is not the one you specified because in the end it is Javascript without types.

So I think changes to the runtime might be necessary. But I think it's worth it. The new case class syntax feels like a workaround to avoid runtime change effort. And this smells like a bad decision if this is the main reason for doing so.

DavidArno commented 4 years ago

Hm ok but I can't see a difference between this and union types.

Does it really matter then as to what they are called if they appear to be the same thing? As long as the language supplies a means of defining a closed set of labels/types that can be parametized and pattern matched, they can be called Elephants for all I care. It's the functionality that's important, not the name.

You only need compound types like in Typescript. This would even have more flexibility as you can create multiple types with the same subtypes.

class Shape = Circle | Rectangle | Line | Point;
class AreaShape = Circle | Rectangle;

And how would that work in practice with C#? Circle is both a subtype of Shape and AreaShape in that definition. And I assume Circle has to have been previously defined? That doesn't provide the functionality expected of this proposal. Now if you can suggest a way of implementing the following, with runtime type safety, no pre-definition of the types and with full pattern matching support, then you have a solution to this proposal:

class Shape = Circle | Rectangle | Line | Point;
struct AreaShape = Circle(double radius) | Rectangle(double width, double length);
theunrepentantgeek commented 4 years ago

@Pyrdacor wrote

I know typescript works very different but on the other hand working with it feels very similar to C# somehow.

That's because Anders Hejlsberg was, in large part, the designer of the initial versions of both languages.

@gulshan wrote

Union types can be "most common base type" (which in many cases will be just object) to the runtime, while compiler makes sure the type constraint is met in code.

The problem then is that you can't use any members on the on the type other than the ones already declared on object, all you can do is pass it around as a dumb reference. We don't need fancy new syntax in C# to achieve what we can already do by just writing void Foo(object thing).

Remember that IL is itself strongly typed, so the runtime knows that the reference is of type object and won't let you call method Foo() regardless of whether the actual instance has that method or not.

DavidArno commented 4 years ago

@Pyrdacor,

In that case, all you seem to be arguing against is the syntax of:

enum class Shape
{
    case Circle(int radius),
    case Rectangle(int height, int width),
    case Line(int length),
    case Point
}

I'd agree that that doesn't feel right. I'd be happier with:

enum class Shape = 
    Circle(int radius) | 
    Rectangle(int height, int width) | 
    Line(int length) | 
    Point;

But the syntax is something that can be decided later. What's important at the moment is working out the required functionality.

Pyrdacor commented 4 years ago

You only need compound types like in Typescript. This would even have more flexibility as you can create multiple types with the same subtypes.

class Shape = Circle | Rectangle | Line | Point;
class AreaShape = Circle | Rectangle;

And how would that work in practice with C#? Circle is both a subtype of Shape and AreaShape in that definition. And I assume Circle has to have been previously defined? That doesn't provide the functionality expected of this proposal. Now if you can suggest a way of implementing the following, with runtime type safety, no pre-definition of the types and with full pattern matching support, then you have a solution to this proposal:

class Shape = Circle | Rectangle | Line | Point;
struct AreaShape = Circle(double radius) | Rectangle(double width, double length);

Why should Circle be a subtype of Shape? These two are not in a class hierarchy. A Shape is more like a local conditional using statement:

if (x)
    using Shape = Circle;
else if (y)
    using Shape = Rectangle;
...
void myMethod(Shape shape)
{
    if (shape is Circle)
        Console.WriteLine(shape.radius);
    // or
    Console.WriteLine(shape switch
    {
        Circle circle => circle.radius
    });
}

If Circle etc would be a Shape you could just inherit Shape and you won't need discriminated unions. You could just use the base class Shape and then check for real type in switch or if. If it isn't a Shape there is no class hierarchy or dependency and so Circle is no sub type of Shape. Either way discriminated unions have nothing to do with "sub-types" imho. Therefore it shouldn't look like a sub-type syntactically.

For your mentioned problem. Why should I create named types in place? This is imho only necessary to reference the sub-type in the current construct. But as I stated the construct isn't really about sub-types at all.

This looks more like a class which should be able to provide different sets of properties. But why not just use a generic wrapper class then, which provides the "sub object" of type T as a property?

struct AreaShape<T>
{
    public T Sub { get; set; }
}

Ok the naming of the "sub-type" is fixed now but does it really matter?