Open terrajobst opened 5 years ago
HttpMethod is another example. It also has to be possible to set values different from identifier names
public enum OSPlatform : string
{
FreeBSD = "free bsd",
Linux = "linux",
Windows = "windows"
}
Seems like something that could be hammered out alongside DUs, which also feel like enums.
IMO, the syntax in this case should also allow for the named member to reference a different string in the case that the string either doesn't fit with the naming conventions of C# members or doesn't fit the rules for being an identifier.
I've always wanted this, and would further want the equality check/deserialization to be case-insensitive, or at least configurable for that. For example, "Linux" and "linux" could both deserialize to OSPlatform.Linux.
HttpMethod, as you say, and HeaderNames also. HeaderNames has characters in its actual string values that are not valid in C# identifiers, such as the - in Content-Length, so being able to specify the string much like the actual numeric value can be specified on current enums would be good.
In Azure SDK for .NET, so far we've settled on a structure defined like in https://gist.github.com/heaths/d105148428fe09a2631322b656f04ebb. The main problem comes from a lack of IntelliSense built into VS or VSCode/OmniSharp. If there were a way to enable this - perhaps through Roslyn - such that MyStruct x = would pop visible static readonly fields or properties, that would satisfy much of the concern around discoverability.
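For readers who don't want to open the gist, here is a minimal sketch of the general shape of such a struct. It is not the gist's code: the type name OSPlatformName and its members are hypothetical, and the choice of ordinal, case-sensitive comparison is an assumption.

```csharp
using System;

// Hypothetical "strongly typed string" struct; all names are illustrative.
public readonly struct OSPlatformName : IEquatable<OSPlatformName>
{
    private readonly string _value;

    public OSPlatformName(string value) =>
        _value = value ?? throw new ArgumentNullException(nameof(value));

    // Well-known values exposed as static properties for discoverability.
    public static OSPlatformName FreeBSD { get; } = new OSPlatformName("free bsd");
    public static OSPlatformName Linux { get; } = new OSPlatformName("linux");
    public static OSPlatformName Windows { get; } = new OSPlatformName("windows");

    // Ordinal, case-sensitive comparison (an assumption of this sketch).
    // default(OSPlatformName) holds a null string and only equals itself.
    public bool Equals(OSPlatformName other) =>
        string.Equals(_value, other._value, StringComparison.Ordinal);

    public override bool Equals(object obj) => obj is OSPlatformName other && Equals(other);
    public override int GetHashCode() => _value?.GetHashCode() ?? 0;
    public override string ToString() => _value ?? string.Empty;

    public static bool operator ==(OSPlatformName left, OSPlatformName right) => left.Equals(right);
    public static bool operator !=(OSPlatformName left, OSPlatformName right) => !left.Equals(right);

    // Implicit conversion lets callers pass values this type does not declare.
    public static implicit operator OSPlatformName(string value) => new OSPlatformName(value);
}
```

With this shape, OSPlatformName x = "haiku"; compiles and round-trips through ToString(), while OSPlatformName.Linux stays the discoverable spelling for the common case.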
Another question is if [Flags] should be somehow supported for these.
And switch statement/expression support.
Yes for [Flags] support! Not entirely sure how to represent it from an API perspective, but it seems natural to have string enums implement IEnumerable<string> to yield all of the set "bits".
Also, and without giving it much thought, what if tuples could be used to specify multiple acceptable variations of the string value:
[Flags]
public enum OSPlatform : string
{
Linux = ("Linux", "linux", "LINUX")
}
I'd also like to see something that interops well with Xamarin.iOS/Mac. String enums are a large part of the native API surface of macOS and iOS. Swift's enums were designed with this in mind as well.
I'm not sold on flags support. This would either require string parsing with some separator or storing them as a collection, which seems unnecessarily heavyweight for the general case.
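To make the cost concrete, here is a rough sketch of the "separator" variant, assuming a hypothetical PlatformSet type that stores its members as one comma-separated string and implements IEnumerable<string> as suggested earlier in the thread. Nothing here is proposed syntax.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical flags-like set of strings kept as one comma-separated string.
public readonly struct PlatformSet : IEnumerable<string>
{
    private readonly string _joined;

    public PlatformSet(string joined) => _joined = joined;

    // default(PlatformSet) has a null field, so normalize on read.
    private string Joined => _joined ?? string.Empty;

    // "Bitwise or" becomes concatenation; duplicates are not removed.
    public static PlatformSet operator |(PlatformSet left, PlatformSet right) =>
        new PlatformSet(left.Joined.Length == 0 ? right.Joined : left.Joined + "," + right.Joined);

    // Every enumeration re-parses the string, which is the overhead at issue.
    public IEnumerator<string> GetEnumerator()
    {
        IEnumerable<string> parts =
            Joined.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        return parts.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Even in this small form, combining values allocates a new string and reading them back allocates an array, which a plain integer [Flags] enum never does.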
@abock
I'd also like to see something that interops well with Xamarin.iOS/Mac. String enums are a large part of the native API surface of macOS and iOS. Swift's enums were designed with this in mind as well.
Could you describe what this would entail?
One thing I love doing with enum-ish strings in C# is to emulate Ruby symbols like this:
namespace StringSymbols
{
public static class OSPlatform
{
public const string FreeBSD = nameof(FreeBSD);
public const string Linux = nameof(Linux);
public const string Windows = nameof(Windows);
}
}
Then use it like this:
using System;
using static StringSymbols.OSPlatform;
namespace StringSymbols
{
class Program
{
static void Main(string[] args)
{
var os = Windows;
if( os == Linux ){
Console.WriteLine("Hello Linux, command line ninja");
}
else if( os == Windows ){
Console.WriteLine("Hello Windows, seattle sunshine");
}
else if( os == FreeBSD ){
Console.WriteLine("Hello FreeBSD, go cal bears, go");
}
}
}
}
Output:
Hello Windows, seattle sunshine
:mount_fuji: :man_playing_water_polo: Creedence Clearwater Revival - Green River
Here is an example of what we do in the AWS .NET SDK to solve this problem. Our main requirement is to be forward compatible with enum values that a service might return in the future.
/// <summary>
/// Constants used for properties of type ContainerCondition.
/// </summary>
public class ContainerCondition : ConstantClass
{
/// <summary>
/// Constant COMPLETE for ContainerCondition
/// </summary>
public static readonly ContainerCondition COMPLETE = new ContainerCondition("COMPLETE");
/// <summary>
/// Constant HEALTHY for ContainerCondition
/// </summary>
public static readonly ContainerCondition HEALTHY = new ContainerCondition("HEALTHY");
/// <summary>
/// Constant START for ContainerCondition
/// </summary>
public static readonly ContainerCondition START = new ContainerCondition("START");
/// <summary>
/// Constant SUCCESS for ContainerCondition
/// </summary>
public static readonly ContainerCondition SUCCESS = new ContainerCondition("SUCCESS");
/// <summary>
/// This constant constructor does not need to be called if the constant
/// you are attempting to use is already defined as a static instance of
/// this class.
/// This constructor should be used to construct constants that are not
/// defined as statics, for instance if attempting to use a feature that is
/// newer than the current version of the SDK.
/// </summary>
public ContainerCondition(string value)
: base(value)
{
}
/// <summary>
/// Finds the constant for the unique value.
/// </summary>
/// <param name="value">The unique value for the constant</param>
/// <returns>The constant for the unique value</returns>
public static ContainerCondition FindValue(string value)
{
return FindValue<ContainerCondition>(value);
}
/// <summary>
/// Utility method to convert strings to the constant class.
/// </summary>
/// <param name="value">The string value to convert to the constant class.</param>
/// <returns></returns>
public static implicit operator ContainerCondition(string value)
{
return FindValue(value);
}
}
This proposal is a special case of Discriminated Unions.
Maybe a more general concept of typed strings, taken from Bosque language?
var code: String<Zipcode> = Zipcode'02-110';
entity PlayerMark provides Parsable {
field mark: String;
override static tryParse(str: String): PlayerMark | None {
return (str == "x" || str == "o") ? PlayerMark{ mark=str } : none;
}
}
So in the example scenario from above it would be:
String<OSPlatform> platform = OSPlatform"Linux"
@HaloFour, @aensidhe & @Liminiens,
I don't agree that this proposal is equivalent to DUs. The latter are designed to be closed sets of values, that cannot be extended. Whereas this proposal is specifically requesting that it be a set of values that are open to extension as other enums are.
@DavidArno what do you mean by cannot be extended? To extend enum (both int-based and string-based) we need to write code. Same with DU. I don't get the difference.
String enums in three years are better than discriminated unions in six years.
@dsaf, DUs in one year (C# 9) would be way better than either of your options 😄
@aensidhe, if I declare a DU:
type IntOrBool =
| I of int
| B of bool
I can't then add an S of string to it without changing the type definition. With this proposal though, for the enum:
public enum OSPlatform : string
{
FreeBSD = "Free BSD",
Linux,
Windows
}
I could write var os = (OSPlatform)"Banana"; as legitimate code. DUs can't be extended; enums can.
(At least this is my understanding of "...there is a need for extensible enums. While enums can in principle be extended by casting any int to the enum, it has the risk for conflicts. Using strings has a much lower risk of conflicts.".)
and HeaderNames also.
I think the important distinction is that HeaderNames should not be a "type" anyways e.g. header names are not restricted to those values, so I think it's better off to be defined as a set of predefined constants.
Great idea. I've always wanted enum, which is quite a neutral type, to be able to inherit from the string type.
As I've downvoted this proposal, I feel I should explain why, even though the majority may disagree with me and downvote this...
To my mind, the statement "...there is a need for extensible enums..." is fundamentally flawed. The fact that enums are extensible causes bugs in code and causes me to have to check that the actual value matches one of the defined values. It's on par with null's "billion dollar mistake". And it makes using enums with switch expressions clunky:
enum Values { ValueA, ValueB }
class C
{
public int Foo(Values value)
=> value switch {
Values.ValueA => 0,
Values.ValueB => 1
};
}
gives me the warning, warning CS8509: The switch expression does not handle all possible values of its input type (it is not exhaustive). So I need to add a default case to suppress that warning. But if I then add ValueC to the enum, I get no warning/error that I'm not explicitly handling it.
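For reference, a sketch of the workaround being described, written as a hypothetical extension method on the same Values enum. The discard arm is what silences CS8509, and it is also what would swallow a later ValueC.

```csharp
using System;

public enum Values { ValueA, ValueB }

public static class ValuesExtensions
{
    public static int Foo(this Values value) => value switch
    {
        Values.ValueA => 0,
        Values.ValueB => 1,
        // Catch-all arm: suppresses CS8509 and fails loudly at run time for
        // undeclared values such as (Values)42, but the compiler will no
        // longer point out a newly added member that has no arm of its own.
        _ => throw new ArgumentOutOfRangeException(nameof(value), value, "Unknown value"),
    };
}
```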
To my mind, allowing enums to be any int value, rather than just the defined values, was a design mistake. Extending that mistake to strings too would be a bad thing to do.
@DavidArno Your objection has nothing to do with string enums per se, but how enums were implemented in C#. Totally different issues.
@DavidArno 's example with casting any string to the string enum is a good reason to downvote this proposal.
public enum OSPlatform : string
{
FreeBSD = "Free BSD",
Linux,
Windows
}
...
var os = (OSPlatform)"Banana";
...
💥
DU, one more time, is the right and proper way to handle "strongly typed string in BCL" situation. And language should not be changed to fix it by allowing string enums (which are not strongly typed at all), but BCL (and others) should adopt DU approach. Otherwise it's kinda like adding one more floor to a sand castle
and HeaderNames also.
I think the important distinction is that HeaderNames should not be a "type" anyways e.g. header names are not restricted to those values, so
Nor are HttpMethods, platforms etc; the point is enums are an open definition.
example with casting any string to the string enum is a good reason to downvote this proposal.
That is why they are enums and not DUs, much like this is valid for enums currently
public enum OSPlatform : int
{
Linux = 1,
Windows = 2
}
// ...
var os = (OSPlatform)3;
Which is the same point with headers or methods
[CaseInsensitive]
public enum HeaderNames : string
{
Accept = "Accept",
AcceptCharset = "Accept-Charset",
AcceptEncoding = "Accept-Encoding",
AcceptLanguage = "Accept-Language",
AcceptRanges = "Accept-Ranges",
// ...
}
// ...
var requestIdName = (HeaderNames)"X-Request-Id";
if ((HeaderNames)"x-request-id" == requestIdName)
{
// is true
}
@benaadams this proposal shows the demand for enums being more complex than just a set of flags or something. When you really want that, you should use DU’s if they exist in a language, which provide compile time checks for exhaustiveness and extensibility via adding methods to type and new cases.
With enums the only way you can provide safety while casting is using TryParse - style methods.
And there is a question: what would guidelines be when eventually DU’s are added to the language? Would string enums become obsolete because casting values to them isn’t “safe”?
If these "String enums" can just be any arbitrary values of string then I'm opposed to them too. I think that a named type should bring with it some degree of type safety. While current enums can be any integral value it's a pretty exceptional case for the value to not be one of the declared members, outside of flags enums.
Taking a page from pretty much any HTTP package in Java, you often do have common things like HTTP headers and methods expressed as enums, but any method that accepts such an enum also has an overload that accepts a String.
This sounds like a much better opportunity for proper DUs, where you can have an Other(string value) case.
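C# has no DU syntax, so the following is only an approximation of that idea using C# 9 records. The HttpVerb type, its cases, and the Parse helper are all hypothetical names chosen for the sketch.

```csharp
// Approximating a union with an escape-hatch case via C# 9 records.
public abstract record HttpVerb
{
    // Private constructor: only the nested records below can derive.
    private HttpVerb() { }

    public sealed record Get : HttpVerb;
    public sealed record Post : HttpVerb;

    // The open case: carries any string that is not a declared member.
    public sealed record Other(string Value) : HttpVerb;

    public static HttpVerb Parse(string text) => text.ToUpperInvariant() switch
    {
        "GET" => new Get(),
        "POST" => new Post(),
        _ => new Other(text),
    };
}
```

A consumer can then match on HttpVerb.Get, HttpVerb.Post, and HttpVerb.Other { Value: var raw }, so unknown values are handled explicitly rather than arriving disguised as a declared member.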
I've personally had need for something like this when creating a web API client library where a returned objects options are specified as a set of string values. A fixed set of string values mapped from an enum in the client library was inadvisable because if or when more options are added the client library would break upon deserialization.
To address this need for strongly typed string values I created a StringEnumValue<TEnum> type which is implicitly convertible to and from both a string value and a TEnum? value.
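The commenter's implementation isn't shown, so the following is a guess at what such a wrapper could look like, not their code. The member names Raw and Known, the case-insensitive parsing, and the BuildState enum are assumptions made for the sketch.

```csharp
using System;

// Hypothetical wrapper: keeps the raw string and exposes the enum if known.
public readonly struct StringEnumValue<TEnum> where TEnum : struct, Enum
{
    public StringEnumValue(string raw) => Raw = raw;

    // The string as received; preserved even when it is not recognized.
    public string Raw { get; }

    // The matching member, or null when the string is not a declared name.
    // IsDefined filters out numeric strings that TryParse would accept.
    public TEnum? Known =>
        Enum.TryParse<TEnum>(Raw, ignoreCase: true, out var parsed)
            && Enum.IsDefined(typeof(TEnum), parsed)
            ? parsed
            : (TEnum?)null;

    public static implicit operator StringEnumValue<TEnum>(string raw) =>
        new StringEnumValue<TEnum>(raw);
    public static implicit operator StringEnumValue<TEnum>(TEnum value) =>
        new StringEnumValue<TEnum>(value.ToString());
    public static implicit operator string(StringEnumValue<TEnum> value) => value.Raw;
    public static implicit operator TEnum?(StringEnumValue<TEnum> value) => value.Known;
}

// Example enum used only to exercise the wrapper.
public enum BuildState { Queued, Running, Done }
```

Deserializing "Paused" into a StringEnumValue<BuildState> then yields a value whose Known is null and whose Raw is still "Paused", instead of an exception.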
A string based enum would have been a much better option if it had been available at the time.
If these "String enums" can just be any arbitrary values of string then I'm opposed to them too. I think that a named type should bring with it some degree of type safety.
That's kinda the point surely, it does introduce a degree of type safety? If you want to use an arbitrary value you have to intentionally cast it to the enum; it encourages the general case to use the values provided by the string enum, but is open enough to allow other types.
you should use DU’s if they exist in a language, which provide compile time checks for exhaustiveness and extensibility via adding methods to type and new cases.
This is problematic because of layering: many of these enums would be defined in the BCL; however, if you want to add another value you are then locked out. E.g. one of the other examples provided is HashAlgorithmName, which in its definition is:
Asymmetric algorithms implemented using other technologies:
- Must recognize at least "MD5", "SHA1", "SHA256", "SHA384", and "SHA512".
- Should recognize additional CNG identifiers for any additional hash algorithms that they support.
So that would be an enum of type
public enum HashAlgorithmName : string
{
Md5 = "MD5",
Sha1 = "SHA1",
Sha256 = "SHA256",
Sha384 = "SHA384",
Sha512 = "SHA512"
}
The current situation of just accepting arbitrary strings doesn't provide any intention to the parameter or guidance to what the parameter should be via the compiler; which enum strings would provide.
@benaadams
That's kinda the point surely, it does introduce a degree of type safety? If you want to use an arbitrary value you have to intentionally cast it to the enum
IMO those two statements are in direct contradiction to one another. I find the current behavior of (non-flags) enums to be pretty appalling and results in the compiler to be forced to treat any arbitrary integral value as a potential value of that enum. That makes about as much sense as having to treat any arbitrary combination of bits in a bool as something other than true or false.
If the goal is to provide guidance to the user as to common or suggested values for a given parameter I think a better approach would be via attribute and IDE support which wouldn't require any language changes and would also work across any language in the ecosystem.
Otherwise this feature seems to be offering a new type while encouraging users to pass invalid values of that type to methods.
Re: https://github.com/dotnet/csharplang/issues/2849#issuecomment-537298781
A collection of strings or separators, and bitwise operator overloads would only be necessary when attributed with [Flags] or equivalent.
I find the current behavior of (non-flags) enums to be pretty appalling and results in the compiler to be forced to treat any arbitrary integral value as a potential value of that enum
@HaloFour you're using a definition of Enum which is not the same one used in .Net CTS (common type system).
The CTS supports an enum (also known as an enumeration type), an alternate name for an existing type. For the purposes of matching signatures, an enum shall not be the same as the underlying type. Instances of an enum, however, shall be assignable-to the underlying type, and vice versa. That is, no cast (see §I.8.3.3) or coercion (see §I.8.3.2) is required to convert from the enum to the underlying type, nor are they required from the underlying type to the enum. An enum is considerably more restricted than a true type, as follows:
@popcatalin81
you're using a definition of Enum which is not the same one used in .Net CTS (common type system).
Yes I am, and I'd prefer it if we didn't make that same mistake twice.
This enum : string doesn't fit that definition either, not without CLR changes.
@HaloFour Your definition is not portable cross language, is not easily interoperable with Native code or COM. Having a String Enum simply be a string, would make the most interoperable solution.
@popcatalin81
Your definition is not portable cross language, is not easily interoperable with Native code or COM. Having a String Enum simply be a string, would make the most interoperable solution.
This proposal doesn't intend to make stringly-enums fit as normal enums according to the CTS either. It intends to spit them out as a normal struct. So all of those concerns are moot, they are not goals of this feature.
Unless I'm reading that wrong (actually a proposed implementation isn't really given). If they are trying to fit them under proper enum that would involve CLR changes, which seems like it makes it orders of magnitude more involved than DUs would be, for less type safety.
If the goal is to provide guidance to the user as to common or suggested values for a given parameter I think a better approach would be via attribute and IDE support which wouldn't require any language changes and would also work across any language in the ecosystem.
If you look at the type that becomes implemented from the definition in the proposal, it would support all languages in usage?
With many methods accepting this enum type, how would an attribute work? Looking at HeaderNames (80 values) for example would you specify "use the static properties/fields from this type in preference to arbitrary string but you can use one of those too".
How would this feature push people to use HeaderNames.ContentLocation in preference to "Content-Location", which the enum type would do.
This is important, and the reason the values are defined as properties/static fields rather than constants is that the interpretation of the values is optimized to do a series of reference equality checks first, before trying StringComparison.OrdinalIgnoreCase comparisons.
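A small sketch of that two-step comparison, assuming a hypothetical HeaderNameComparison helper with a single well-known instance; the real header-handling code is not reproduced here.

```csharp
using System;

public static class HeaderNameComparison
{
    // Hypothetical well-known instance; callers that pass this exact object
    // take the fast path below.
    public static readonly string ContentLocation = "Content-Location";

    public static bool IsContentLocation(string name)
    {
        // Fast path: same string instance, so no characters are compared.
        if (ReferenceEquals(name, ContentLocation))
        {
            return true;
        }

        // Slow path: an arbitrary string, compared ignoring case.
        return string.Equals(name, ContentLocation, StringComparison.OrdinalIgnoreCase);
    }
}
```

The fast path only pays off when callers actually pass the shared instance, which is the behaviour a typed HeaderNames.ContentLocation member would encourage.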
This proposal doesn't intend to make stringly-enums fit as normal enums according to the CTS either. It intends to spit them out as a normal struct. So all of those concerns are moot, they are not goals of this feature.
Unless I'm reading that wrong (actually a proposed implementation isn't really given). If they are trying to fit them under proper enum that would involve CLR changes, which seems like it makes it orders of magnitude more involved than DUs would be, for less type safety.
I don't think this feature would go all the way to CLR additions, however there would be certain advantages to go that route like:
public static TEnum[] GetValues
- Reflection. (Ability to discover a string Enum through reflection using the same APIs)
```csharp
if (val.GetType().IsEnum == true) ...
Enum.GetValues(typeof(Foo))
Enum.IsDefined(...)
Enum.Parse(...)
```
I don't think this feature would go all the way to CLR additions, however there would be certain advantages to go that route
Agreed, but we'd likely have to fix a lot of bugs around the stack as virtually all code assumes that an instance of System.Enum can be cast to an integer. By making them regular structs, we don't impose any re-interpretation onto anyone -- it's just syntactic sugar in C#.
I don't buy the argument that because any string would be a valid instance of the enum (of course requiring an explicit cast) that this proposal is flawed. For starters, that's how all enums have always worked (you can always hard cast the base type to the enum type). The point of using an enum is better documentation and fewer (if any) magic values in the code.
The problem isn't that one can cast any underlying value to the enum -- it's versioning. It is generally legal to extend the set of enum values in V2. Code compiled against V1 couldn't have known about those values. Robust code will need to handle this gracefully.
@terrajobst
(you can always hard cast the base type to the enum type)
Agreed, however aside from versioning and flags, I'd think it's pretty uncommon for users to hard cast some arbitrary integral value. If someone is doing (DateTimeKind) 99 odds are they're doing something very wrong. As a consumer of that enum I shouldn't have to expect and defend against those kinds of shenanigans. This string enum seems to encourage much more of this, particularly given the use cases provided.
As mentioned the Java world keeps these two quite separate. You end up with overloads accepting either the enum or the string. You have the type safety of the enum but a relatively easy escape hatch. IMO this works pretty well even though it's a bit more verbose.
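Translated to C#, that overload pattern might look like the sketch below. HeaderBag, KnownHeader, and the two-member mapping are hypothetical names invented for illustration.

```csharp
using System;
using System.Collections.Generic;

// Closed set of well-known names (deliberately tiny for the sketch).
public enum KnownHeader { Accept, ContentLength }

public sealed class HeaderBag
{
    private readonly Dictionary<string, string> _headers =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Escape hatch: accepts any header name at all.
    public void Set(string name, string value) => _headers[name] = value;

    // Type-safe overload: translates the enum to its wire name first.
    public void Set(KnownHeader name, string value) => Set(ToWireName(name), value);

    // Returns null when the header has not been set.
    public string Get(string name) =>
        _headers.TryGetValue(name, out var value) ? value : null;

    private static string ToWireName(KnownHeader name) => name switch
    {
        KnownHeader.Accept => "Accept",
        KnownHeader.ContentLength => "Content-Length",
        _ => throw new ArgumentOutOfRangeException(nameof(name)),
    };
}
```

The verbosity is visible even at this size: each API needs two entry points plus a mapping between the enum and its string form.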
This comes out of a conversation on Gitter this morning about how exhaustiveness checking in C# for enums in switch expressions has a poor experience due to the fact that you have to guard against these "invalid" values, so I apologize if I'm taking the position pretty strongly. :)
I think the underlying enum value shouldn't affect semantics beyond backward compatibility. Sometimes we do give them semantics like HttpStatusCode, but string enums are that in reverse: we take the actual value and wrap that in an enum, which doesn't make sense to me.
For all use cases mentioned the set of constants are just some well-known values but you're almost always free (and most likely correct, depending on context) to pass whatever string you want.
If the issue is the string representation of enum values I think the ability to override ToString should suffice. which in turn require you to have a corresponding Parse method. That doesn't need /depend on enum values themselves to be string.
For all use cases mentioned the set of constants are just some well-known values but you're almost always free (and most likely correct, depending on context) to pass whatever string you want.
Yes; a preferred and well-known set of strings you are encouraged to use, however you can go outside these values with a different string. switch statements would likely have to force use of a default case for the string type enums.
If the issue is the string representation of enum values I think the ability to override ToString should suffice. which in turn require you to have a corresponding Parse method. That doesn't need /depend on enum values themselves to be string.
This helps with serialization, but not particularly with the scenarios given where the value is the string and it is the string that is used. A suggested solution is to have 2 overloads, one for the enum and one for the arbitrary string https://github.com/dotnet/csharplang/issues/2849#issuecomment-537488436.
In practice this means doubling the methods; also, how is the value held? If it's stored as a string, that means any use of the enum is a slower approach, as the first thing the method needs to do is .ToString the enum, and likewise .Parse the string value to return it (and what's its value for the unknown set of strings?)
If it's stored as a couplet, either a nullable enum or a nullable string, paired with its opposite, then again to use the actual value you need to null check and .ToString. I might be missing something, however it seems to me that this adds constant work (method calls) in the implementation of actually using the enums in practice (even if it's handled by the compiler), vs just emitting the value's memory location (string).
@HaloFour
Agreed, however aside from versioning and flags, I'd think it's pretty uncommon for users to hard cast some arbitrary integral value.
Whether it's rare or not, several of us have already built APIs to address this specific concern. Multiple parties seem to have invented the same pattern (BCL, ASP.NET, Azure SDK, and AWS SDK). So I'm not proposing a new pattern here, I'm merely proposing codifying the one that is already in use.
As mentioned the Java world keeps these two quite separate. You end up with overloads accepting either the enum or the string. You have the type safety of the enum but a relatively easy escape hatch. IMO this works pretty well even though it's a bit more verbose.
I don't see the value of keeping them separate. In practice this will cause a lot of noise where APIs have to take both representations. But the worst part is data holders because you now have to store both the enum and the raw string, or do continuous translation. That doesn't make sense to me. It's much easier if the enum holds the value.
@alrz
That doesn't need /depend on enum values themselves to be string.
It doesn't, but in many cases you need a way to get to the underlying string. If the mapping is complete, you can achieve this by ToString()/Parse() methods. But the whole premise of this proposal is that the set of values isn't static/closed.
But the whole premise of this proposal is that the set of values isn't static/closed.
So (1) these values need to be passed around as "string". (2) all strings can be meaningful given the right context (3) there's already a predefined set of well-known values to be used.
I conclude that as a typed hint to user, e.g. one might not be aware that HeaderNames even exists somewhere.
That being said, the "string" is not special here at all, you could make the same argument for any combination of values. This gets closer and closer to Java enums where you can "construct" each enum member in place, with some arguments.
So the generalization of the proposed syntax is something like:
struct enum OSPlatform(string Name) {
FreeBSD("Free BSD"),
Linux(nameof(Linux)),
Windows(nameof(Windows));
}
Which is quite flexible
struct enum Color(byte R, byte G, byte B) {
Red(0xff, 0, 0),
Green(0, 0xff, 0),
Blue(0, 0, 0xff);
}
var c = new Color(0x33, 0xaa, 0xee);
If you look at it this way, then a "cast" doesn't make sense, you have to write new OSPlatform(".."), unless, of course, you have defined a conversion operator for it.
This is the exact reason we added StringEnum<T> to Octokit.NET 😍
(1) these values need to be passed around as "string". ... That being said, the "string" is not special here at all, you could make the same argument for any combination of values.
Which are traits of current enums, though they are currently limited to byte, sbyte, short etc?
So the generalization of the proposed syntax is something like:
Could still be current enum syntax; and have it alias to the underlying type constructors?
enum OSPlatform : string
{
FreeBSD = "Free BSD",
Linux = nameof(Linux),
Windows = nameof(Windows);
}
var wx = new OSPlatform("Windows X");
and
enum Color : (byte R, byte G, byte B) // valuetuple enum
{
Red = (0xff, 0, 0),
Green = (0, 0xff, 0),
Blue = (0, 0, 0xff)
}
var c = new Color(0x33, 0xaa, 0xee);
@benaadams
I suggested it as DU+primary constructor=records with predefined values because you may still want to add members which is now possible because the underlying type is no longer a proper enum.
Sounds like the downvote discussion is getting hung up on the definition of an enum and not the problem this is attempting to solve.
In a closed system where you control all of the valid values for an enum I agree this proposal does not make sense. This is generally when the package consuming the enums is also the one that defines the enums.
If you have a library that is modeling the inputs/outputs of an external system, your library has to be robust enough to handle the external system changing over time. In my case I can't have the consumer's code get exceptions when we fail to marshal the response into the .NET types because of an unexpected enum value returned by the external system. I also can't force the user to update to a new version of the library just to take advantage of a new valid value, so I need a back door like casting a string into the enum-like type.
The easiest and safest choice would be to just model the valid values as strings but then I'm requiring all consumers to know all of the magic strings and I lose all help from the IDE intellisense.
That is the background of why we implemented our version of this in the AWS SDK. It is not perfect but compared to the original version 1 days of the sdk which just used strings it is a lot easier to figure out what are the allowed values.
So maybe enum isn't the right term, but anybody that is creating an SDK for a REST API or some other external system that has fields with allowed values will have to come up with a solution, and it would be great if .NET had a go-to solution.
It seems like everyone who is against DU just assumes that DU can't have "extensibility". It can.
You can have last element of DU to be any string without restriction. And have case-switch-pattern-matching on it. And it covers this case completely.
For those who say that string enums now are better than DUs tomorrow, I urge you to look at all the ref, readonly, in stuff that was added incrementally. I don't think it's a good design.
I suggested it as DU+primary constructor=records with predefined values
Does a DU work where all the types are all exactly the same and the differentiation is the equality based on value semantics? (short cut for strings with reference equality if the "enum" is used)
I know it's getting into the weeds a bit in implementation, but how would something like a TryParse work over a generic type with 80+ values (HeaderNames); would it have to be IEquatable<T> and then test each one till it found a match? vs a string where you can use .Length to narrow the search match (if you accepted IgnoreCase and wanted a canonical form, otherwise it's already in form).
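As a rough illustration of the .Length idea, here is a sketch with a hypothetical HeaderNameParser and a handful of header names bucketed by length; a real table would cover all 80+ names and might be generated.

```csharp
using System;

public static class HeaderNameParser
{
    // Hypothetical buckets of canonical names, grouped by string length.
    private static readonly string[] Length6 = { "Accept", "Cookie" };
    private static readonly string[] Length14 = { "Accept-Charset", "Content-Length" };
    private static readonly string[] Length15 = { "Accept-Encoding", "Accept-Language" };

    // Maps any casing of a known name to its canonical spelling.
    public static bool TryGetCanonical(string input, out string canonical)
    {
        canonical = null;
        if (input is null)
        {
            return false;
        }

        // Length check first: most inputs are rejected or narrowed to a few
        // candidates before any characters are compared.
        string[] candidates = input.Length switch
        {
            6 => Length6,
            14 => Length14,
            15 => Length15,
            _ => Array.Empty<string>(),
        };

        foreach (string candidate in candidates)
        {
            if (string.Equals(input, candidate, StringComparison.OrdinalIgnoreCase))
            {
                canonical = candidate;
                return true;
            }
        }

        return false;
    }
}
```

A union of distinct case types has no comparable cheap discriminator over the raw input, which is the asymmetry the question is getting at.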
@benaadams
Does a DU work where all the types are all exactly the same and the differentiation is the equality based on value semantics?
~I think DUs in general will have a way to discriminate between kinds, either by type or an integer tag, regardless of the content. At least that's how F# does it.~
Sorry I was reading that wrong. I think you're asking if we can actually alter the identity of an enum member. That depends on DU spec and if such ability is permitted - because DU member identity is handled by the compiler, I'm not sure if we should be able to override that.
how would something like a TryParse work over a generic type with 80+ values (HeaderNames); would it have to be IEquatable and then test each one till it found a match?
Yup, that's what we need to do in Java as well to get back an enum member from e.g. a string. The fact that each DU member is distinct from one another regardless of the content is the reason for this. I think this makes it less appealing for use in the mentioned scenarios.
We've noticed a trend, especially in cloud services, that there is a need for extensible enums. While enums can in principle be extended by casting any int to the enum, it has the risk for conflicts. Using strings has a much lower risk of conflicts.
In the BCL, we've called this concept "strongly typed strings". Examples are:
It would be nice if we could make this a language feature so that instead of this:
one only has to type this:
/cc @pakrym @heaths @JoshLove-msft
Discriminated Unions
As was pointed out by @DavidArno, this won't be solved by discriminated unions because those are about completeness. The primary value of string-based enums is that they are extensible without changing the type definition:
This is vital for things like cloud services where the server and the client can be on different versions.