Open gafter opened 4 years ago
If you add [Flags]
to it, is that an error?
There must be a way to convert an integer to the closed enum (unsafe code).
I imagine you could add [Flags]
to it, but it wouldn't have any effect other than possibly generating a warning. That attribute has no language meaning.
Since enum class
is going to be the syntax for discriminated-unions, I vote enum struct
for this. When there is no sub-member, it is practically a closed enum
,
// if you can do this
enum struct Option<T> { Some(T), None }
// there's nothing to stop you from doing this
enum struct Bool { False, True }
I think this is more of a lowering strategy question - whether or not this should be emitted as a proper enum.
If we go with proper enums, we can't manage external values, so I think we will need to emit it as a struct anyways.
@0xd4d,
There must be a way to convert an integer to the closed enum (unsafe code).
Why? I see no reason for closed enums to have any sort of fixed underlying value. If you want to convert, then use a switch expression:
Bool b = i switch {
0 => False,
_ => True
};
@alrz,
enum struct Option<T> { Some(T), None }
is sort of how I envisioned DU's looking (and so agree that enum struct Bool { False, True }
is a good choice for closed enums.
However, isn't there still an issue pattern matching on those as I'd imagined DU pattern matching would be type based, ie I'd write something like:
var x = option switch {
Some v => v,
None => default
};
But with enum struct Option<T> { Some(T), None }
, Some
isn't a subtype of Option
.
I can invent some syntax here:
enum struct ShapeName { Point, Rectangle, Circle }
enum struct Shape
{
Point,
Rectangle(double Width, double Length),
Circle(double Radius)
}
and I'm happy that this is consistent: a closed enum is a DU where the values have no parameters.
And I can imagine how I'd pattern match this stuff:
var area = shape switch {
Point => 0,
Rectangle { Width: var w, Length: var l } => w * l,
Circle { Radius: var r } => Math.PI * r * r
};
But, I've no idea how this lot gets lowered by the compiler. For a struct, Shape
, that contains a Circle
"value", what does shape is Circle { Radius: var r }
mean?
Does it mean that existing enum
s will be updated to be somehow open? Otherwise the terminology is confusing.
@dsaf, existing enums are already open. For:
enum Shape { Point, Rectangle, Shape }
Shape
can take any int
value. Point
, Rectangle
, and Shape
are little more than int constants.
But it would indeed be prudent to talk of them as being open enums to clarify the difference between existing enums and the closed variety.
@DavidArno
isn't there still an issue pattern matching on those as I'd imagined DU pattern matching would be type based
No they wouldn't be types (or members), if it's a struct DU, all parameters get emitted as flat fields into the struct, the rest is compiler magic. See what F# produces for your example as a value DU.
Though I can imagine C# could do a better job utilizing runtime support for FieldOffsetAttribute
if possible (e.g. if all parameters are blittable) - minimizing the struct size.
@0xd4d,
There must be a way to convert an integer to the closed enum (unsafe code).
Why? I see no reason for closed enums to have any sort of fixed underlying value. If you want to convert, then use a switch expression:
Bool b = i switch { 0 => False, _ => True };
That's inefficient unless it's a trivial enum like the above Bool enum. Deserialization should be quick.
It should be as simple as it is today, except the programmer must verify that it's a valid value (or it's undefined behavior)
int value = 1;
if (InvalidValue(value)) throw ...;
unsafe { return (MyEnum)value; }
@0xd4d You can add it yourself, or have a generator to do it.
enum struct Bool {
False, True;
public static explicit operator Bool(int v)
=> v switch { 0 => False, 1 => True, _ => throw InvalidCastEx };
}
I think unlike enums the corresponding "integer tag" is solely an implementation detail for DUs, so I don't think it's a good idea for the compiler to generate that out of the box.
@alrz Your code is the same as @DavidArno's code, a big switch statement which generates a lot of code and causes extra CPU usage at runtime. My solution to use a cast in an unsafe block has the same perf and code size as code we can already use today.
@alrz, "compiler magic" suits me. It's then @gafter's job to solve how to have struct-based DUs (including closed enums) work with a is X b
, rather than mine 👍
@0xd4d, it's very unlikely that the solution to serialization of closed enums and other DUs is to convert them to ints. I guess it all depends on whether closed enums are implemented as a completely different solution to DUs or not. Doing so would be a mistake in my view. So I'd argue that any solution to serializing
enum struct Bool { False, True }
also has to be able to serialize
enum struct Option<T> { Some(T), None }
The only difference with the original proposal would be that False and True in that example wouldn't be compile-time constants so we can't use enum struct
and enum
interchangeably.
That is because an enum
type as it exists today, is just a group of named constants with zero safety guarantees, I'm not sure if it makes sense to try to "fix" that or take the trade-off and favor struct DUs.
why not closed by default, and open as suggested syntax to open them?
Because open is the default today, and we can't change that.
@gafter would your intent to be that adding a new value to a "closed" enum is a breaking change? would this be binary compat breaking change, or just a "switch statement which must pick a value entered unknown territory by not finding a match" exceptional territory. eg: what if you wanted to add trinary logic "not provided by customer" to the bool enum (bad naming example)
@AartBluestoke I believe its true, false and filenotfound.
@AartBluestoke I imagine that adding a new value to a closed enum would be a binary breaking change. Just like b switch { false => 0, true => 1 }
is translated to b == false ? 0 : 1
, when b
is a bool
, I would expect b switch { Bool.False => 0, Bool.True => 1 }
to be translated to b == Bool.False ? 0 : 1
when b
is a Bool
. But this design point could be debated and changed.
I imagine that adding a new value to a closed enum would be a binary breaking change.
How would this be enforced without the C# compiler generating an under the hood default case? And also the user might add a default case himself in which case the compiler must generate code to check the range of values expected have not been modified.
Something like: In library A
public closed enum ObjectAccess
{
Read,
ReadWrite,
Write
}
Consumer of Library A code in assembly B:
switch objectAccess
{
case ObjectAccess.Read:
return new ReadOnlyFooBar(...);
default :
return new ReadWriteFooBar(...);
}
I asume the C# generated default case would actually look like:
switch objectAccess
{
case ObjectAccess.Read:
return new ReadOnlyFooBar(...);
default :
{
if (objectAccess != ObjectAccess.ReadWrite && objectAccess != ObjectAccess.Read)
throw new InvalidOperationException($"Unexpected value: {objectAccess:g}");
return new ReadWriteFooBar(...);
}
}
When library Author adds a new value the the closed enum:
public closed enum ObjectAccess
{
Read,
ReadWrite,
Write,
ReadWriteExclusive
}
This will indeed break binary compatibility, however the problem is now that recompiling B will catch the new value in the default clause without the library author having the chance to have to treat the new value explicitly.
So this feature can only work if the default clause is not allowed for closed enums, which I think would make the feature impractical for any enums with more than a few members.
@popcatalin81
Other languages seem to get by just fine. In e.g. Rust, the default
case doesn't add any extra checks: if there's a default
case, then that will catch all other unhandled values, now and in the future.
Currently, if I have a switch
that I want to be exhaustive, I need to have an explicit case
for every possible enum member and a default
which throws an exception. The downside is that I don't get a compiler warning/error if a new enum member is added, only a runtime exception.
With closed enums, I still need to have a case
for each enum member, but I can avoid having a default
at all, and I can get a compiler warning/error if someone adds a new enum member.
I believe the following spec would match the desired behavior:
The compiler will enforce that: Switching on a closed enum must be exhaustive, A switch on a closed enum must either not have a default case, or have a default case that guarantees an exception. If a default case is not specified the compiler will auto-generate a default path that throws an exception if accessed
No other compiler or runtime changes are needed.
@AartBluestoke
A switch on a closed enum must either not have a default case, or have a default case that guarantees an exception.
That seems unnecessary. What if the closed enum has 20 values but I only care about 2 of them? Why can't I have a default that returns some value instead?
The intent was not to require that switches be exhaustive. We'd still permit a switch expression to be non-exhaustive, with a warning. And I don't think we'd add any requirements for switch statements, other than the fact that we can infer that a default case isn't needed (e.g. for definite assignment) if you have cases for every enumeration constant.
A switch on a closed enum must either not have a default case, or have a default case that guarantees an exception.
That seems unnecessary. What if the closed enum has 20 values but I only care about 2 of them? Why can't I have a default that returns some value instead?
if you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying i don't care if new values get added to this. If that is true then the enum shouldn't be closed. If it is true, you should handle the remaining values.
Perhaps the contextual keyword "default" becomes "and the rest of the values" within the compiler, with the compiler generating a true default branch for exceptions for if non-standard values get into the enum somehow?
My thoughts when i was writing that was if you add a new value to the closed enum, it has to explode, otherwise, "closing" has no meaning other than telling roslyn "don't generate a warning for this one specific situation", and preventing implicit cast from enum to int. and that doesn't seem like a necessary language feature; feels like a job of an analyzer.
if you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying i don't care if new values get added to this. If that is true then the enum shouldn't be closed. If it is true, you should handle the remaining values.
This is really common in functional languages, and the reason is simple: in some cases you don't care about all the values, and other cases you do.
For example you might have a number of states a method can return. In some cases you want to deal with them all seperately. In others you only care if it was successful or not.
... if you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying i don't care if new values get added to this ...
Sorry, @AartBluestoke, but that is so not true. If, within a particular context, I explicitly call out two of the twenty values, then that's because - in that context - I'm only interested in those two values. So I'll have a default handler for the rest. But in a situation where I explicitly list all of them (which is likely to occur elsewhere in the code) then I absolutely want a compiler error is a new value is added. And having it be a binary breaking change at that point too would be the icing on the cake.
So a default clause for a closed enum will mean:
No default clause:
It sounds about right for me as a library consumer. There's an escape hatch I can use.
However, I would still like to opt-out of binary breaking changes, if those are added. It may be important to make certain libraries forward compatible with closed enums through some mechanism or another. Example:
public closed enum Choice
{
First,
Second,
Third
}
string AnalyzeChoice(Choice choice)
{
switch choice
{
case Choice.First:
return "Good";
case Choice.Second:
return "Average";
case Choice.Third:
return "Bad";
missing:
return "Unknown"; // Only when value not part of the enum is passed (like a version of the Enum in a future assembly).
//Exhaustiveness checking is still applied to all cases. Recompilation will give an error.
}
}
@DavidArno
And I can imagine how I'd pattern match this stuff:
var area = shape switch { Point => 0, Rectangle { Width: var w, Length: var l } => w * l, Circle { Radius: var r } => Math.PI * r * r };
But, I've no idea how this lot gets lowered by the compiler. For a struct,
Shape
, that contains aCircle
"value", what doesshape is Circle { Radius: var r }
mean?
This may do the job:
[StructLayout(LayoutKind.Explicit)]
public struct Shape
{
public struct Point {}
public struct Rectangle { public double Width; public double Length; }
public struct Circle { public double Radius; }
private closed enum ShapeType : short { Point, Rectangle, Circle }
[FieldOffset(0)]
private ShapeType shapeType;
[FieldOffset(2)]
private Point point;
[FieldOffset(2)]
private Rectangle rectangle;
[FieldOffset(2)]
private Circle circle;
public bool Deconstruct(out Point p)
{
if (shapeType == ShapeType.Point) { p = point; return true; }
else { p = default; return false; }
}
.....
}
However, it won't work that easily with reference types inside
@alrz @DavidArno enum struct
could conflict with value type discriminated unions, if they become a thing. Would be quite unintuitive to have enum class
and enum struct
be unrelated
@john-h-k,
enum struct
could conflict with value type discriminated unions
If we view closed enums as just another type of DU (where the cases are all parameterless) then there is no conflict. Closed enums are just a subset of DUs.
Closed enums should require a name for the zero value, because one can easily acquire that value without unsafe code.
class C
{
private closed enum Number { One = 1 }
private Number? number;
public override string ToString()
{
switch (this.number.GetValueOrDefault())
{
case Number.One: return "one";
}
}
}
Perhaps closed enums don't need to support '=' constant_expression
in enum_member_declaration
at all.
How would people envision these closed enums getting serialized to JSON (say by STJ) ? My first gut reaction would be as string versions of the declared names, with options on how to customize the transformation (lower, upper, camel, etc)
How would people envision these closed enums getting serialized to JSON
That would be out of scope from the language. It would depend on whatever serialization library you were using.
Sure, yeah. I would just going off this original statement, which I hope will hold.
There is no conversion from some underlying type to the type Bool (or only in unsafe code) For example, there is no operator Bool +(Bool, int). Alternately, we might consider them defined but only usable in unsafe code.
That would be out of scope from the language. It would depend on whatever serialization library you were using.
I think the term "out of scope" is easy to misunderstand here. It seems to me that the language has a responsibility to make a conscious decision about how the feature design affects things like serialization and interop with other .NET languages. For example, if the design is likely to introduce a failure mode in existing serializers where values outside the "closed enum" domain can be produced, there should be a conscious decision about whether that is acceptable--maybe it is acceptable, maybe it's not.
@RikkiGibson , that's what I was trying to get at in mentioning this question in the original proposal when discussing serialization and interop: "There is no conversion from some underlying type, or arithmetic operators" (or only in unsafe code).
"There is no conversion from some underlying type, or arithmetic operators" doesn't mean that serializers won't round-trip them via their underlying type. In the same way, right now, there is nothing stopping people who serialize enums do so to and from a string. That what it means to be "out of scope" : the language won't say 'the correct json serialization of this c# type is as follows'
It seems to me that the language has a responsibility to make a conscious decision about how the feature design affects things like serialization
I honestly disagree. Serialization is an entirely unrelated topic to me. There are likely to be dozens (if not hundreds) of varying serialization libraries. The choices they make are going to be all over hte map here and often extremely domain specified. I have no particular expectation or desire for their to be any connection between the concepts of language types and the serialization domains.
That what it means to be "out of scope" : the language won't say 'the correct json serialization of this c# type is as follows'
Yup, that encapsulates how i feel.
forget serializers - maybe the better question is 'will it be possible to access the underlying type outside of an unsafe context'.
I'm a little confused about the adding options scenario with multiple assemblies:
// assembly A, v. 1
closed enum Foo { A, B, C, }
// assembly B, v.1; references A v. 1
int Bar(Foo foo)
{
switch(foo)
{
case Foo.A: return 1;
case Foo.B: return 2;
case Foo.C: return 3;
// no need for default because we accounted for all possibilities known to us now
}
}
// assembly A, v. 1.1
closed enum Foo { A, B, C, D, }
// assembly C; references A v. 1.1 and B v. 1
Bar(Foo.D); // what happens in this case?
Also, another aspect of this that I couldn't find a clear answer for in this thread: how are these proposed to be stored? Will there be a (usually integer) backing value (leaving aside the restrictions on casting etc.)? Or is this a completely different mechanism?
I think this can be covered by tagged unions (discriminated unions).
Closed enums should require a name for the zero value, because one can easily acquire that value without unsafe code.
class C { private closed enum Number { One = 1 } private Number? number; public override string ToString() { switch (this.number.GetValueOrDefault()) { case Number.One: return "one"; } } }
Perhaps closed enums don't need to support
'=' constant_expression
inenum_member_declaration
at all.
I'd like to second this. My understanding is that if I declare an enum as a class field, it is automatically initialized to 0, whether or not that is a valid enum value, so closed enums should be forced to always have a value that's equal to 0:
// Should error, because the closed enum doesn't have a value that's equal to 0
closed enum UserType {
Admin = 1,
Normal = 2
};
class User {
// Automatically initialized to 0
private UserType _type;
}
Since number conversions seems to be a problem for the closed enum, maybe it's worth considering this matter from the other perspective? I think that it may cause less issues to be resolved, while having identical effect of solving the mentioned problem.
Just to pipe up about syntax, what's wrong with sealed enum
instead of closed enum
or enum struct
? Saves either adding a new keyword or using very unintuitive keyword combinations.
Just to pipe up about syntax, what's wrong with
sealed enum
instead ofclosed enum
orenum struct
? Saves either adding a new keyword or using very unintuitive keyword combinations.
sealed
means "cannot be inherited", and in addition to being confusing because of the different meanings, it can also create a problem in the future if "enum inheritance" is introduced.
I think I have bigger problem with closed enum's behavior more than the syntax itself. The sole purpose of introducing the closed enum seems to be to influence switch
statement behavior, so why don't we work on switch
, but instead on enum
?
For once, strict switch allows for more flexibility, because it can guard literally anything, what is enumerable and finite (namely bool, bool?, all int and int? types and also enums). Also, sealed enum looks a little bit weird for me, because the place where it is declared and where it actually takes effect are different. You'd need to lookup the type to know what the behavior in the code is, whereas in case of strict switch it is obvious at first glance. Finally, you might want to be strict in switch sometimes and sometimes not. With strict switch notation you can do that freely, while with sealed enum you are forced to be strict every time.
@wojciechsura
Because the switch
expression is effectively already strict. For example, consider these two switch
expressions:
integer switch {
< 0 => -1,
> 0 => 1
}
nullableBool switch {
true => 1,
false => 0
}
They produce these warnings:
warning CS8509: The switch expression does not handle all possible values of its input type (it is not exhaustive). For example, the pattern '0' is not covered. warning CS8655: The switch expression does not handle some null inputs (it is not exhaustive). For example, the pattern 'null' is not covered.
(Note that you need to enable nullable reference types to get the warning about a nullable value type, which is weird, but by design per https://github.com/dotnet/roslyn/issues/52714.)
The problem only happens with enum
s, because the language disagrees with most of their usages regarding which values are valid.
I forgot about this issue and started a new duplicate discussion despite commenting here previously :) Can we please get this at some point? Not as some DU (like some have suggested), but as a good old enum
, just with a runtime check for "closeness".
closed enum HorizontalAlignment
{
Left = 1, Center = 2, Right = 3
}
var x = (HorizontalAlignment)0; // runtime error
// the above line is compiled to:
var x = Enum.ConvertToClosedEnum<HorizontalAlignment>(0); // this helper method would check and throw, if needed
x = (HorizontalAlignment)1; // OK
[Flags]
can be supported by creating a pool of all possible values by bit-wise OR - it's just a set of numbers and not much different from the options' numbers themselves.
Implementing this can be done in a few ways. One approach is to generate the set of possible values and the logic at runtime using IL emitting. Another would be generating a hidden static method at compile time and linking to said method with a hidden attribute on the enum type. There may be other hybrid approaches, as well.
I would also be OK with a blanket prohibition on converting backing values to closed enum
type at compile time (as long as I can do it in reflection/IL emit).
Thank you!
Can we please get this at some point?
@TahirAhmadov THe issue is a championed and is something the LDM is discussing. There is no need to repeatedly ask :)
@TahirAhmadov, would you expect an InvalidCastException here:
closed enum HorizontalAlignment
{
Left = 1, Center = 2, Right = 3
}
class C
{
HorizontalAlignment M()
{
int[] ints = { 1, 2, 3, 4 };
var alignments = (HorizontalAlignment[])(object)ints;
return alignments[3];
}
}
If the enum is not closed, then the CLR allows this cast at run time, even though the C# standard doesn't. If this must throw with a closed enum, then I think it can be implemented in several ways:
@KalleOlaviNiemitalo Yes, I would expect an exception to be raised there. Minor point - it should be a new exception type, like InvalidEnumValueException
.
My original thought was an attribute like the one you mentioned - [ClosedEnum]
- but now that you listed other approaches, I kind of like the ClosedEnum
type solution more (if it's possible). It may even be that ClosedEnum
does not inherit from Enum
, or that Enum
is changed to inherit from ClosedEnum
; I can't say for sure what will work best.
I don't think "transpiling the cast" approach is feasible - what if generics are used and T
is HorizontalAlignment[]
?
PS. It would be cool if we could have a static abstract method on ClosedEnum
type and it's then implemented by concrete closed enum
types.
Related to https://github.com/dotnet/csharplang/issues/485, it would be helpful to be able to declare an enumerated type that is known to the language to be closed. The syntax is an open question, but as a first proposal I'll suggest using
closed
as in https://github.com/dotnet/csharplang/issues/485.For example, a closed enum
Could be used like this
This code would be considered OK by flow analysis, as the method returns for every possible value of the parameter
b
. Similarly a switch expression that handles every possible declared enumeration value of its switch expression of aclosed enum
type is considered exhaustive.To ensure safety:
closed enum
types are not defined. For example, there is nooperator Bool +(Bool, int)
. Alternately, we might consider them defined but only usable inunsafe
code.Bool
(or only inunsafe
code)Design meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions