dotnet / csharplang

The official repo for the design of the C# programming language
10.9k stars 993 forks source link

Champion "Discriminated Unions" #113

Open gafter opened 7 years ago

gafter commented 7 years ago

See

Design meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-08-31.md#discriminated-unions https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions

gafter commented 7 years ago

I wouldn't expect progress on this feature to precede records.

DavidArno commented 7 years ago

@gafter,

I've started writing a draft proposal around #75. Would you like me to continue with this, or did you have a proposal planned yourself?

gafter commented 7 years ago

@DavidArno I do not expect to invest any significant effort on this until we make progress on records.

DavidArno commented 7 years ago

@gafter,

OK, that's good to know. I'll carry on with my proposal then, but will take time over it as there's no rush.

gafter commented 7 years ago

/cc @agocke

agocke commented 7 years ago

I will be moving my proposal of "closed" type hierarchies from Roslyn to this repo shortly. I also think we should explore "real" discriminated unions and have some thoughts on that, but it's still much more of an exploratory phase for me.

However, I think I'm close to a reasonably complete proposal for closed and an analyzer-based prototype. I'll happily champion this feature, as well.

Richiban commented 7 years ago

One question that I think we need to ask (and I haven't seen anyone ask elsewhere) is whether the case classes can be used as types.

Allow me to illustrate with the example of the Option type:

public enum class Option<T>
{
    None,
    Some(T Value)
}

Obviously Option<T> is a type (an abstract base class), but are None and Some types? Since, in an OO language like C#, ADTs are probably actually implemented as type hierarchies one might be tempted to answer "yes", but I'm not sure it makes much sense.

public void MyMethod(Some<string> someString) // Is this allowed? It doesn't make much sense
{
    // ...
}

I think of ADTs as functioning more like enums, however they're actually implemented. So using each case as a type doesn't make sense, any more than this makes sense:

public enum Colours
{
    Red, Green, Blue
}

public void MyMethod(Blue colour)
{
    // ...
}
alrz commented 6 years ago

whether the case classes can be used as types.

I think it shouldn't be the case with enum class but with another "expanded" syntax which could represent more complex type hierarchies like an AST

class SyntaxNode {
  case class Statement { } // implicitly inherit from SyntaxNode, as in, a "case" of the said type
  case class Expression {
     case enum class Literal { Numeric, String }
  }
}
mcintyre321 commented 5 years ago

I think that feature can be added fairly simply using a custom type, in the same way Tuple<T0, ..., TN> was added.

I maintain a library, OneOf, which adds a OneOf<T0, ..., TN> type which has .Match<TOut>(Func<T0, ..., TOut>, ..., Func<T0, ..., TOut> methods. By using implicit operators to create the OneOf instances from values, the syntax is very terse and comprehensible. Also, the allocations are low because it's a struct and doesn't create intermediate 'builder' objects, unlike some other solutions.

The OneOf<T0, .., TN> type also provides .Switch and .TryGetTX(out TX value, out TRemainder remainer) methods.

Example of using a OneOf as a return value:

public OneOf<User, InvalidName, NameTaken> CreateUser(string username)
{
    if (!IsValid(username)) return new InvalidName();
    var user = _repo.FindByUsername(username);
    if(user != null) return new NameTaken();
    var user = new User(username);
    _repo.Save(user);
    return user;
}

example of Matching

    OneOf<string, ColorName, Color> backgroundColor = "Red"; //uses implicit casts to reduce overhead
    Color c = backgroundColor.Match(
        str => CssHelper.GetColorFromString(str),
        name => new Color(name),
        col => col
   );

As new types are added to the OneOf definition, compiler errors are generated wherever the union is Matched or Switched, as the methods are required to have the correct number of lambda parameters.

This can be included in the BCL without language changes, although I'm sure some syntactical sugar could be sprinkled.


this proposal was originally made at https://github.com/dotnet/roslyn/issues/14208 and at https://github.com/dotnet/csharplang/issues/1524 . Sorry!

Richiban commented 5 years ago

@mcintyre321 Your OneOf type can be described as equivalent to the Either type or Choice type such as that found in F#. However, the Either type is not an alternative to discriminated unions, in fact it is built on top of discriminated unions:

type Choice<'a, 'b> = Choice1Of2 of 'a | Choice2Of2 of 'b
HaloFour commented 5 years ago

@mcintyre321

While your library does accomplish providing a single discriminated union it also demonstrates the degree of boilerplate required to do so which is what this proposal seeks to reduce. Your types also don't work with C#'s recursive pattern matching which will make it much more efficient and much more capable to match over such a type:

var backgroundColor = ...;

// no delegate invocation required
Color c = backgroundColor switch {
    string str => CssHelper.GetColorFromString(str),
    ColorName name => new Color(name),
    Color col => col
};
mcintyre321 commented 5 years ago

@Richiban OneOf<T0, ..., TN> has up up to 33 parameters, so is more useful as a general return object than Either or Choice.

@HaloFour I agree it would be good to have switch and match support built in to the language,. butI would have thought that the delegate calls will be JITted. I'm not sure what the boilerplate you refer to is.

HaloFour commented 5 years ago

@mcintyre321

I'm not sure what the boilerplate you refer to is.

public enum class OneOf<T1, T2>
{
    First(T1 value),
    Second(T2 value)
}

vs. this*

* Yes, I know that you have all of the arities crammed into one, but the file is too large to link to a specific line.

Richiban commented 5 years ago

@mcintyre321 I don't doubt it's usefulness (or the fact that it's better than Either at those situations).

My point was that discriminated unions are a much more general tool that can also solve the problem that Either solves.

I'm not sure how you would propose to implement the equivalent of this using an Either type?

type FavouriteColour =
    | Red
    | Green
    | Blue
    | Other of string
mcintyre321 commented 5 years ago

@Richiban The abilities to naming a DU is useful, but you still get a lot of value with anonymous DUs. Is it a show-stopper to not have named unions (initially at least)?

That said, there are some things that can be done.

A OneOf, as implemented, is a struct, but I think that in F# (by default) they are classes. So you could make OneOf a class too*, and have class FavouriteColour : OneOf<Red, Green, Blue, string> { }. One problem with this is that implicit operators aren't inherited, although I think maybe I saw a proposal suggesting that was coming.

Another alternative for naming is to use a Record type e.g. class FavouriteColour(OneOf<Red, Green, Blue, string> Value).

And you can always use an alias: using FavouriteColour = OneOf<Red, Green, Blue, string>; if it's just the code bloat that's a problem (rather than the lack of a named Type ).

I appreciate none of this is quite as nice as the F# approach, but perhaps the language could be extended to fix some of this. E.g. defining a union class FavouriteColour : OneOf<Red | Green | Blue | string> { } could cause the compiler to output the required implicit operators.

TBH I'm happy with any solution where

@HaloFour cramming it into one file is along the lines of https://referencesource.microsoft.com/#mscorlib/system/tuple.cs , although I admit OneOf.cs has become somewhat larger!

*There's a class-based OneOfBase in the library, but the name isn't great IMO.

HaloFour commented 5 years ago

@mcintyre321

Anonymous union type support would be #399 which is quite a bit different from DUs.

Richiban commented 5 years ago

@mcintyre321

The abilities to naming a DU is useful, but you still get a lot of value with anonymous DUs

The naming is what makes it a discriminated union. Without the names it's just a (type) union (also very useful, in my opinion).

I don't know about the C# language team, but I'm absolutely desperate for a decent discriminated union type in C#. The number of times I'm modelling a domain and I want to say "An instance of this type looks either like this or like that." C# has nothing of the kind and it's really difficult to work around (although some of the new pattern matching features from build might take away some of the pain).

mcintyre321 commented 5 years ago

If you can mint new Types in a single line of code (which you can with Records) then the utility of a Type Union is very close to that of a Tagged/Disciminated Union. But I get your point, that there is a difference.

DavidArno commented 5 years ago

As of C# 7.2, the best way I've found so far to model DU's in C# is to use a class hierarchy that uses 7.2's private protected. If, in a library, you create the base type, Union<T1, T2> with a private protected abstract dummy member, then child classes of that type can only be created in the same assembly. So Case1 and Case2 can be created as sealed types forming a closed hierarchy.

The reason for choosing a class is in anticipation of C# 8's pattern matching features, and the ability to do something like:

TR Foo<T1, T2, TR>(Union<T1, T2> union) 
    => union switch
    {
        Case1 x => Bar<TR>(x),
        Case2 x => Baz<TR>(y),
        _ => default // needed as the expression cannot be exhaustiveness-checked
   };

Of course this still has problems, as the cases of the union cannot be named.

From my experiments, I think unions as structs is likely a non-starter, because of pattern matching. To achieve a "type hierarchy" with structs would need either something like "extension everything", or CLR changes. This is because some way of saying union is int x would be needed. So either a custom is operator would be needed that works with generics too (extension everything) or having DUs be a special type within the CLR so that it understood the relationship between the union and its types. Also, structs suffer from the default problem. What's the default value of a union?

Uses classes seems much easier. "All" the language really needs then is enforcement of the closed hierarchy within the same assembly and extending the NRT rules to make a null union an error via flow analysis.

mcintyre321 commented 5 years ago
[<Struct>]
type FavouriteColour =
    | Red
    | Green
Unchecked.defaultof<FavouriteColour>.Dump(); //IsRed == true

For class DUs, the default is null

DavidArno commented 5 years ago

@mcintyre321,

Interesting that F# took that approach. I still think building on the NRT flow analysis is a far better way of doing it.

Richiban commented 5 years ago

@DavidArno

As of C# 7.2, the best way I've found so far to model DU's in C# is to use a class hierarchy that uses 7.2's private protected

Interesting... I've been doing it up until now with an abstract class with either a private or internal constructor, depending.

I almost always use the private one, so that all the 'case' classes are ~internal to~ nested within the abstract base class, e.g.

public abstract class FavouriteColour
{
    private FavouriteColour() {}

    public sealed class Red : FavouriteColour {}
    public sealed class Green : FavouriteColour {}
    public sealed class Blue : FavouriteColour {}

    public sealed class Other : FavouriteColour
    {
        public Other(string value)
        {
            Value = value;
        }

        public string Value { get; }
    }
}

Note that for brevity I have not included all the equality, string methods etc in the above code.

Also note that, in my example above, the classes Red, Green, and Blue are singletons (unit-like) and are one of the reasons I quite like the idea of having a singleton concept in C#, perhaps such as that in https://github.com/dotnet/csharplang/issues/490.

It would be handy to be able to say something like:

public object class Unit {}

followed by:

Unit x = Unit; // Note that `Unit` is both a type and a value, negating the need 
               // for a manually written `Instance` static member
DavidArno commented 5 years ago

@Richiban,

A private constructor does work, but it means all the "case" classes then have to be nested within it. A private protected member will achieve a similar result, whilst allowing those case classes to be put in their own files.

It nice to have the choice now.

HaloFour commented 5 years ago

@DavidArno

A private constructor does work, but it means all the "case" classes then have to be nested within it. A private protected member will achieve a similar result, whilst allowing those case classes to be put in their own files.

You can use partial to put them in their own files. You just have to repeat the parent class declaration.

Richiban commented 5 years ago

@DavidArno

A private protected member will achieve a similar result, whilst allowing those case classes to be put in their own files.

Sure, as does an internal constructor. The advantage is that you don't need a 'dummy' member.

DavidArno commented 5 years ago

@HaloFour,

You can use partial to put them in their own files...

Technically, I could do that. But I'd prefer to gouge my own eyes out to be honest.

DavidArno commented 5 years ago

@Richiban,

Sure, as does an internal constructor.

Hmm, that's a fair point. I think that in my haste to find a use for private protected, I forgot that having an abstract internal member, or even just an internal constructor, would achieve the same result. Oh well, I can toss private protected back on the scrapheap of useless features once more 😄

jnm2 commented 5 years ago

@DavidArno

Technically, I could do that. But I'd prefer to gouge my own eyes out to be honest.

This confuses me. It's like another kind of namespace. More logical organization is bad?

DavidArno commented 5 years ago

@jnm2, A type should have one, cohesive, responsibility. If you split that responsibility over multiple files, you are weakening cohesion.

For that reason, I dislike the idea of partial classes being used for anything save dealing with auto-generated segments.

jnm2 commented 5 years ago

@DavidArno How does nesting implementations within an abstract class cause the abstract class to have more than one responsibility? The fact that types are declared inside it doesn't make the containing type responsible—the responsibilities are completely contained in their own, nested, types.

DavidArno commented 5 years ago

@Jnm2,

The fact that types are declared inside it doesn't make the containing type responsible

Here we completely disagree. Classes are not namespaces. If I put Foo inside Bar, then the former completely is part of Bar’s responsibility.

jnm2 commented 5 years ago

Nested classes are only related to containing classes in the sense of the containing class being the namespace. I can't think of any other way in which they are related.

dadhi commented 5 years ago

Here is my take which is IMO better than nesting in abstract class.

Pros:

Cons:

Here is representative example from the project which uses the Union from the gist:


    /// UI elements
    public sealed class UI : Union<UI, Text, Input, Button, Check, Panel> { }
    public sealed class Text   : Rec<Text, string> { }
    public sealed class Input  : Rec<Input, (string Content, MessageRef<string> Changed)> { }
    public sealed class Button : Rec<Button, (string Label, MessageRef<Unit> Clicked)> { }
    public sealed class Check  : Rec<Check, (string Label, bool IsChecked, MessageRef<bool> Changed)> { }
    public sealed class Panel  : Rec<Panel, (Layout Layout, ImList<UI.I> Elements)> { }

An example of pattern matching against UI union:

      private static UIElement CreateElement(UI.I ui)
        {
            switch (ui)
            {
                case I<Text> text:
                    return new Label { Content = text.V.V };

                case I<Input> input:
                {
                    var (content, changed) = input.V.V;
                    var tb = new TextBox { Text = content };
                    tb.TextChanged += (sender, _) => changed.Send(((TextBox)sender).Text);
                    return tb;
                }
                case I<Button> button:
                {
                    var (label, clicked) = button.V.V;
                    var b = new Btn { Content = label };
                    b.Click += (sender, _) => clicked.Send(Unit.unit);
                    return b;
                }
                case I<Panel> panel:
                {
                    var (layout, elems) = panel.Value();
                    var orientation = layout == Layout.Vertical ? Orientation.Vertical : Orientation.Horizontal;
                    var p = new StackPanel {Orientation = orientation};
                    elems.Map(CreateElement).Apply(e => p.Children.Add(e));
                    return p;
                }
                case I<Check> check:
                {
                    var (label, isChecked, changed) = check.V.V;
                    var c = new CheckBox { Content = label, IsChecked = isChecked };
                    c.Checked += (s, _) => changed.Send(true);
                    c.Unchecked += (s, _) => changed.Send(false);
                    return c;
                }
                default: 
                    throw new NotSupportedException("The type of UI is not supported: " + ui.GetType());
            }
        }
hickford commented 5 years ago

Discriminated unions are invaluable for domain modelling. They're one of my favourite features of F# https://fsharpforfunandprofit.com/posts/discriminated-unions/

I look forward to discriminated unions in a future version of C#

wanton7 commented 5 years ago

This is the feature I miss most when I switch between TypeScript and C#. It just makes code so much cleaner and readable in some cases when you have discriminated unions in the language.

krzysztofkroczak commented 5 years ago

Hi all, Can anyone explain to me why @Richiban suggestion doesn't work for the exhaustive check (a feature of C# 8.0 currently in preview)

public abstract class FavouriteColour
        {
            private FavouriteColour() { }

            public sealed class Red : FavouriteColour { }
            public sealed class Green : FavouriteColour { }
            public sealed class Blue : FavouriteColour { }

            public sealed class Other : FavouriteColour
            {
                public Other(string value)
                {
                    Value = value;
                }

                public string Value { get; }
            }
        }

Warning CS8509 The switch expression does not handle all possible inputs (it is not exhaustive).

var fc = (FavouriteColour)new FavouriteColour.Blue();
            var ec = fc switch
            {
                FavouriteColour.Blue b => "blue",
                FavouriteColour.Green g => "green",
                FavouriteColour.Red g => "red",
                FavouriteColour.Other { Value: var value} => value
            };

In my opinion, I can't create any other cases because of the private constructor of FavouriteColour.

theunrepentantgeek commented 5 years ago

I don't know for sure, but I can suggest three possibilities ...

... the most likely possibility is that the compiler doesn't have the smarts to work out that all the current implementations of FavouriteColour are specified.

... another possibility is that the compiler is requiring a default clause to handle potential future expansion (say, Purple)

... or, perhaps, allowing for the possibility of another child class of FavouriteColour being defined at runtime.

gafter commented 5 years ago

The reason for the current behavior is that the draft language spec for pattern-matching has no special treatment for this situation, and therefore neither does the compiler.

HaloFour commented 5 years ago

I wouldn't want that behavior changed until discriminated unions were considered as a language feature. The compiler can always relax exhaustive matching rules.

christiannagel commented 5 years ago

Adding nullability to generics, and allowing the generic type T to be either a struct or a class, but returning T? results in the compilation error "error CS8627: A nullable type parameter must be known to be a value type or non-nullable reference type."

    public interface IItemViewModel<out T>
        where T : class
    {
        T? Item { get; }
    }

A workaround is to either opt out of nullable with these generic types, ore duplicate the code for structs and classes. As the runtime needs changes for records and discriminated unions, probably discriminated unions could be used to help with this scenario, e.g.

    public interface IItemViewModel<out T>
        where T : class | struct
    {
        T? Item { get; }
    }
masonwheeler commented 5 years ago

@gafter

I do not expect to invest any significant effort on this until we make progress on records.

What's the rationale there, if I might ask? I don't see any obvious reason why records should be a dependency of DUs. Is there something non-obvious there, or is it simply a question of priorities?

HaloFour commented 5 years ago

@masonwheeler

I'd expect DUs to behave like an enum of record types, so to me it makes sense to nail down the record behavior first, or at least at the same time.

YairHalberstadt commented 5 years ago

@HaloFour. Why just record types? Why not permit arbitrary types?

HaloFour commented 5 years ago

@YairHalberstadt

I didn't mean exclusively record types, but I think the two would work well together and should be a part of the same conversation.

YairHalberstadt commented 4 years ago

Niko Matsakis, who is part of the Rust language design team, has an excellent series of blogs discussing the benefits and limitations of enums (Rusts name for DUs), and proposing a way to combine the benefits of classes with that of DUs. I think it's well worth the read!

http://smallcultfollowing.com/babysteps/blog/2015/05/05/where-rusts-enum-shines/ http://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/ http://smallcultfollowing.com/babysteps/blog/2015/08/20/virtual-structs-part-3-bringing-enums-and-structs-together/ http://smallcultfollowing.com/babysteps/blog/2015/10/08/virtual-structs-part-4-extended-enums-and-thin-traits/

I very much hope the LDM considers his ideas in their design for DUs in C#.

agocke commented 4 years ago

@YairHalberstadt Yes! This is an amazing series of posts which I read at the time, although I never saw the last one, so I'll have to catch up.

I was very happy to read these because I think they mirror my general idea of what discriminated unions would look like in C# via an enum struct and enum class approach.

The enum struct would be like Rust enums, where the size of each type is at least the size of the largest type (probably more because the CLR won't allow us to overlap reference and value types), while the enum class would define an inheritance-based tree, with the main difference from inheritance graphs you define today being that enum class would be sealed to the current compilation, meaning that the object graph is closed and we can do exhaustiveness checking during switching.

This does imply a third type of enum, enum interface, which seems interesting as allowing a closed set of interfaces.

hez2010 commented 4 years ago

In my opinion the pattern matching is "incomplete" if there's no discriminated unions. Could we expect to see this feature in C# 9 or 10? :)

jnm2 commented 4 years ago

@hez2010 In https://nodogmapodcast.bryanhogan.net/124-mads-torgersen-c-8/, Mads mentions that the two features they definitely want C# 9 to include are record types and discriminated unions.

gafter commented 4 years ago

Targeting for C# 9.0, along with records. It is not clear if we will do records first without this and add discriminated unions later, or do both at the same time. Either way, we want to do enough of the design work during C# 9.0 that we are confident in our design decisions on records (that they have a path to adding discriminated unions either at the same time or later).

ielcoro commented 4 years ago

Why are discriminated uniones restricted to records? Wouldn't this feature provide much more value if supports regular classes, structs etc?

Also, why are we inventing a new syntax and not considering proben examples from F# or even Typescript:

type Shape = Square | Rectangle

And also directly in the declaration site:

function foo(shape: Square | Rectangle

Looks like this would fit nicely in C#:

class Shape = Square | Rectangle;

public void Foo (Square | Rectangle shape) { }

Just adding so that the team can consider if it proves valuable.