Open 333fred opened 4 years ago
It is an error for the endpoint of a switch expression arm's block to be reachable. break with an expression is only allowed when the nearest enclosing switch, while, do, for, or foreach statement is a block-bodied switch expression arm.
This limitation seems ok, but still odd. We don't do the same elsewhere. For example, i could have a continue; inside a switch inside a foreach.
Seems like we could just allow the break/continue to bind to the nearest applicable construct.
"For example, i could have a continue; inside a switch inside a foreach."
These are all statements. This is inside an expression, which has previously only been break-out-able by throwing.
"These are all statements. This is inside an expression, which has previously only been break-out-able by throwing."
Sure... i get that it's new. my only point was: we're allowing statements inside the switch now. And it doesn't seem strange to support the concept of these statements in the switch jumping to other statements.
Again, relevant: ~https://openjdk.java.net/jeps/325~ ~https://openjdk.java.net/jeps/354~ https://openjdk.java.net/jeps/361
// as statement
switch (p) {
    case 1, 2, 3 -> System.out.println("Foo");
    case 4, 5, 6 -> {
        System.out.println("Bar");
    }
};
// as expression
String result = switch (p) {
    case 1, 2, 3 -> "Foo";
    case 4, 5, 6 -> {
        yield "Bar";
    }
};
Java does not allow control statements within the arms of a switch expression:
LABEL1: while (true) {
    String result = switch (p) {
        case 1 -> "Foo";
        case 2 -> {
            break LABEL1; // error: Break outside of enclosing switch expression
        }
        case 3 -> {
            continue LABEL1; // error: Continue outside of enclosing switch expression
        }
        case 4 -> {
            return; // error: Return outside of enclosing switch expression
        }
        case 5 -> throw new IllegalStateException(); // fine
    };
}
But it's perfectly fine with switch statements:
LABEL1: while (true) {
    switch (p) {
        case 1 -> System.out.println("Foo");
        case 2 -> {
            System.out.println("Bar");
            break LABEL1;
        }
        case 3 -> {
            continue LABEL1;
        }
        case 4 -> {
            return;
        }
        case 5 -> throw new IllegalStateException();
    };
}
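For anyone who wants to try the two forms side by side, here is a minimal runnable sketch, assuming Java 14 or later (where JEP 361 made switch expressions final). The class and method names (SwitchForms, describe, countUntilStop) are made up for illustration and are not from the samples above.

```java
class SwitchForms {
    // Expression form: a block arm has to hand its value back with yield.
    static String describe(int p) {
        return switch (p) {
            case 1, 2, 3 -> "Foo";
            case 4, 5, 6 -> {
                String prefix = "Ba";
                yield prefix + "r";
            }
            default -> throw new IllegalStateException("unexpected: " + p);
        };
    }

    // Statement form: arms are allowed to jump to an enclosing labeled loop.
    static int countUntilStop(int[] values) {
        int counted = 0;
        LABEL1:
        for (int v : values) {
            switch (v) {
                case 0 -> { continue LABEL1; }  // skip zeros
                case -1 -> { break LABEL1; }    // stop at the sentinel
                default -> counted++;
            }
        }
        return counted;
    }

    public static void main(String[] args) {
        System.out.println(describe(2));
        System.out.println(describe(5));
        System.out.println(countUntilStop(new int[] {1, 0, 2, -1, 3}));
    }
}
```

Moving either labeled jump into an arm of the expression form in describe is what produces the compiler errors quoted above.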
@333fred your example is missing a semicolon after the switch expression.
@HaloFour There's #1597 for labeled loops and an even older https://github.com/dotnet/roslyn/issues/5883 with a WONTFIX resolution.
@orthoxerox Nod, just demonstrating examples of switch expressions/statements in Java as they are developing very similar features, including using break as a way to return a value from an expression switch arm. I probably didn't need to use labeled loops, I was just throwing a bunch of spaghetti at IntelliJ to see what would compile and what wouldn't and happened to copy&paste that sample here.
IMO it might be worth considering the design choices already made by the Java team as they intend to use switch statements/expressions at the center of their pattern matching proposals just as C# has and they have been making tweaks to the preview syntax over the past two compiler releases.
@333fred Consider the following code
foreach (var item in items)
{
    _ = item switch {
        1 => {
            continue; // allowed, continues the foreach
            break; // not allowed
            break 1; // allowed, "returns" from the switch
        }
    };
}
It feels inconsistent that break; is not allowed in this situation, when continue; is. And since there is no ambiguity (break; is never associated with the switch expression, while break expr; always is), I think it makes sense to allow this code.
Oops, I messed up. Java 13 switched to yield instead of break to return a value from a switch. I bet that was because of the confusion between breaking out of the switch arm vs. returning a value.
@svick Java 13, for reference/comparison:
for (int item : items) {
    int result = switch (item) {
        case 1 -> {
            continue; // compiler error
            break; // compiler error
            yield 1;
        }
        default -> 0;
    };
    switch (item) {
        case 1 -> {
            continue; // just fine
            break; // just fine
            System.out.println(1);
        }
        default -> System.out.println(0);
    }
}
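Since continue is rejected inside the expression form, one way to get a similar effect in Java today is to let the arm yield a "skip" marker and do the jump in statement context afterwards. This is only a sketch of that workaround, not something proposed in this thread; it assumes Java 14 or later, and the names SkipFromSwitch and mapItems are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

class SkipFromSwitch {
    static List<Integer> mapItems(int[] items) {
        List<Integer> out = new ArrayList<>();
        for (int item : items) {
            OptionalInt mapped = switch (item) {
                case 1 -> {
                    // 'continue' here would be a compile error, so signal "skip" instead.
                    yield OptionalInt.empty();
                }
                case 2 -> {
                    int doubled = item * 2;
                    yield OptionalInt.of(doubled);
                }
                default -> OptionalInt.of(0);
            };
            if (mapped.isEmpty()) {
                continue; // the jump happens outside the expression
            }
            out.add(mapped.getAsInt());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(mapItems(new int[] {1, 2, 3}));
    }
}
```

The cost is an extra wrapper value and a check after the switch, which is roughly the ceremony that allowing continue inside the arm would remove.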
Shame the team rejected break, continue and return expressions. Feels like they would work well here.
@orthoxerox thanks, fixed.
@svick whether continue will be allowed is still an open question, we need to decide whether we'll allow any control flow out of the expression other than a break expression statement. As @HaloFour points out, Java does not allow these, and I'd be lying through my teeth here if I said we weren't inspired by their solutions to enhancing their switch statement here. But there is existing precedent for break referring to a different statement than continue, and while the compiler could figure it out, I'm not convinced that it wouldn't be confusing for the reader yet.
I would like to say that break as a sort of return statement is very confusing. yield is much better imo.
Since we're bike-shedding, break makes perfect sense to me. It's always been associated with leaving control of the switch. Having it leave with a value is totally sensible given the expression-nature of switch-expressions.
The difference is that break never returned a value, it just broke out of the current control; now it will. While yield USUALLY returns a value, and in case you wanted to leave control without returning one you have to state so explicitly, like yield break; or yield return null; (they aren't equivalent of course, but the intent is similar). That's why yield makes more sense.
No wonder Java changed their syntax in the middle of the process.
For break expression, I think a larger picture needs to be considered. If we compare to other functional languages like F#, match/switch is not the only use case. It is also applicable for if-else, etc.
In the future, we may also want to write in C#:
var foo = if (condition) { bar(); break 1; } else { break 2; }
Then the break would look weird and confusing.
I really would like to have something like block expression from Rust.
I didn't play with Rust yet but it looks like good syntax for more "expression-oriented" language with C-style curly braces. And I definitely would like for C# to go into this direction.
With it we could later introduce "if expression" like @qrli suggested:
var foo = if (condition) { bar(); 1 } else { 2 }
or could write multi line lambda expression
collection.Where(a => { DoSomething(); a > 1 })
It might look like simply omitting ; is too "terse" a syntax and it might be better to have something more explicit, but I read that Rust programmers don't have any problem with it. Maybe someone can share his experience with it.
But this block expression is an orthogonal feature. The proposed break syntax might still be valuable if we would like to "return" a value from some nested blocks (like we do today with return in methods and lambdas). But is it worth it to introduce a new feature just for a small convenience that could be used only in switch expressions, if we could have something like a Rust-like block expression which would handle 90% of use cases ("execute multiple things in a switch-expression arm before returning a value")?
@mpawelski Block expressions ala Rust are currently under consideration under #377, though with parentheses instead of curly braces.
If this is still being discussed, I'd like to suggest using out instead of break or any other meaningful keyword that can be mistaken.
I explained why I think it is the better solution in the expression-block issue https://github.com/dotnet/csharplang/issues/3086#issuecomment-632601537.
Having
var x = y switch {
    < 0 => 100;
    < 10 => {
        var z = GetMeassures();
        out z;
    } ;
    _ => 0;
};
Feels much better than
var x = y switch {
    < 0 => 100;
    < 10 => {
        var z = GetMeassures();
        break z;
    } ;
    _ => 0;
};
Any thoughts?
Any updates on this?
No, there are no updates on this.
Is this ever going to make its way in? I just don't like the switch statement syntax, but many times I have to use it because a switch expression can't have multiple lines. So I either use a switch statement or create separate methods for every case of the switch expression. This is a much-needed feature. Why is it taking so long?
Instead of yield, why not just return? This makes more sense to me because similarly to a lambda function the right side of the arrow always returns a value. I.e., this is equivalent:
var x1 = () => { return 10; };
var x2 = () => 10;
I think this would make sense:
var s = "tenable";
var i = s switch {
    "tenable" => 10,
    _ => {
        if (Sun.IsShining) return 100;
        return 0;
    }
};
Analyzers will pick up the unnecessary verbosity and simplify to _ => Sun.IsShining ? 100 : 0, but that's beside the point.
"Instead of yield, why not just return?"
That would interfere with allowing statement expressions to contain return statements, or if return expressions (#176) are to be considered, as it changes how the flow control would work. It could also easily lead to a subtle bug when refactoring between switch statements and switch expressions.
"It could also easily lead to a subtle bug when refactoring between switch statements and switch expressions."
I think it's unfortunate that they both use switch, I would have preferred match, but in any case you should always be careful when refactoring.
I get that return usually exits from the current method or function, but it is also allowed in lambdas, which are expressions.
If return is off the table, yield return x feels better than break x, IMHO.
"If return is off the table, yield return x feels better than break x, IMHO."
I think this is off the table too, because yield return is already valid in iterators, which means that if you used an expression block inside an iterator there would be ambiguity. That's why just yield and a few others were suggested, as they don't have this potential issue.
I don't see the issue, because scope already resolves this. This is perfectly valid and unambiguous:
string SillyString()
{
    IEnumerable<string> Iter() { yield return "Hello World"; }
    string Inner() { return string.Join(", ", Iter()); }
    var fn = () => { return Inner(); };
    return "Silly" switch { _ => new Func<string>(() => { return fn(); })() };
}
That's a lot of returns in one method, and they're all hit in this example. But no problem for C# because scopes.
For the purpose of this proposal it means that a code block after the arrow of a pattern creates its own scope, like a lambda function but slightly different.
@mrwensveen That's not the situation I had in mind. Let me rehash your example a bit:
IEnumerable<string> Iter() {
    yield return "Hello World";
    var stringified = myObjects switch {
        List<string> strings => string.Join(",", strings),
        List<MyType> others => {
            string result = string.Empty;
            foreach (var other in others)
            {
                if (other.IsFaulted) return;
                else if (other.IsLastItem) break; // This breaks the foreach, not the switch
                result += other.ToString();
            }
            yield return result;
        },
        _ => {
            var message = $"Unexpected type {myObjects.GetType()}";
            Logger.Error(message);
            throw new InvalidOperationException(message);
        }
    };
}
Now, should the 2nd yield return break only out of the switch, or out of Iter? Going by current rules it should be out of Iter, but by expression-block rules it should be out of the switch. While this ambiguity is resolvable by scope, it makes yield return specifically a nonideal candidate, since you can't tell this at a glance without parsing scopes first.
I get what you're trying to say. I wouldn't have a problem with this example, except for the return on the line with other.IsFaulted. The switch expression evaluates to whatever you accumulated in result and is assigned to stringified. You could even write yield return myObjects switch { ... and use yield return inside of the switch itself.
I would actually prefer a normal return without yield, because it seems unnecessary. This makes the switch expression like a pattern matching lambda (where it's already legal to use return). In the example above, the return on the line with other.IsFaulted would not compile because you're trying to assign void to a variable.
Okay, last attempt, I promise! What about break return? You'd be close to the nomenclature users are expecting when they see switch, but you're also explicitly stating that you are returning a value.
var foo = myObject switch {
    string s => s,
    MyType mt => {
        var bob = mt.Bob;
        break return ConvertToString(bob);
    },
    _ => throw new Exception("Invalid object!")
};
This way, when you see yield, you know you're dealing with iterators, when you see break, you know you're dealing with switches, and when you see a naked return, you know you're dealing with functions (possibly local or lambda).
Just jumping back in here. I think the problem with break, yield return and return is that they might get mistaken for one another (you forget the break before return, etc.), and that is really hard to spot in code reviews.
I'd like to bring up my suggestion from way earlier in this discussion: out. Why not use the keyword out, which is known but never used inside a body? It is clear about what is happening here and it is also easy to spot in code.
So the example from above would look like this:
var foo = myObject switch {
    string s => s,
    MyType mt => {
        var bob = mt.Bob;
        out ConvertToString(bob);
    },
    _ => throw new Exception("Invalid object!")
};
I still think this is the best keyword to use, and it does not interfere with return and break, which might be used to exit a loop or a method from within.
This way we could also use yield return for a method that is returning an IEnumerable and don't have to do double yield returns.
Here's a current workaround using lambda helpers to replace this:
switch( authIdentity ){
    case AuthIdentity.PhoneNumberIdentity phoneNumberIdentity:
        installation.SetPreAuthContextPhoneNumber( phoneNumberIdentity.PhoneNumber );
        break;
    case AuthIdentity.MessengerIdentity messengerIdentity:
        installation.SetPreAuthContextMessengerPageScopedID( messengerIdentity.PageScopedID );
        break;
    default:
        throw new NotSupportedException( $"Unknown auth identity {typeof(AuthIdentity)}" );
}
with this:
object _ = authIdentity switch {
    AuthIdentity.PhoneNumberIdentity phoneNumberIdentity => Do( () => {
        installation.SetPreAuthContextPhoneNumber( phoneNumberIdentity.PhoneNumber );
    }),
    AuthIdentity.MessengerIdentity messengerIdentity => Do( () => {
        installation.SetPreAuthContextMessengerPageScopedID( messengerIdentity.PageScopedID );
    }),
    _ => throw new NotSupportedException( $"Unknown auth identity {typeof(AuthIdentity)}" ),
};
This uses the following lambda helper (which is useful in all sorts of other contexts):
/// <summary>
/// Performs the given action, returning the 'no-op' result (fundamental C# limitation).
/// </summary>
/// <param name="action"></param>
/// <returns>NOTE: This object represents 'Void' – containing a 'no result' result</returns>
[DebuggerStepThrough]
public static /*void*/object Do( Action action ){
    action();
    return new object();
}
There is some ugliness needed with the extra object _ = because C# doesn't let the switch expression get invoked without assigning it to an object (ie: purely for the side-effects) - which is another annoying/unnecessary limitation.
It would be fantastic having this as out-of-box language support. It's not only used for multi-line statements, but even single-line statements that invoke different logic/methods, as shown in the example.
Block-bodied switch expression arms
Summary
This proposal is an enhancement to the new switch expressions added in C# 8.0: allowing multiple statements in a switch expression arm. We permit braces after the arrow, and use break value; to return a value from the switch expression arm.
Motivation
This addresses a common complaint we've heard since the release of switch expressions: users would like to execute multiple things in a switch-expression arm before returning a value. We knew that this would be a top request after initial release, and this is a proposal to address that. This is not a fully-featured proposal to replace sequence expressions. Rather, it is constrained to just address the complaints around switch expressions specifically. It could serve as a prototype for adding sequence expressions to the language at a later date in a similar manner, but isn't intended to support or replace them.
Detailed design
We allow users to put brackets after the arrow in a switch expression, instead of a single statement. These brackets contain a standard statement list, and the user must use a break statement to "return" a value from the block. The end of the block must not be reachable, as in a non-void returning method body. In other words, control is not permitted to flow off the end of this block. Any switch arm can choose to either have a block body, or a single expression body as currently. As an example:
We make the following changes to the grammar:
It is an error for the endpoint of a switch expression arm's block to be reachable. break with an expression is only allowed when the nearest enclosing switch, while, do, for, or foreach statement is a block-bodied switch expression arm. Additionally, when the nearest enclosing switch, while, do, for, or foreach statement is a block-bodied switch expression arm, an expressionless break is a compile-time error. When a pattern and case guard evaluate to true, the block is executed with control entering at the first statement of the block. The type of the switch expression is determined with the same algorithm as it does today, except that, for every block, all expressions used in a break expression; statement are used in determining the best common type of the switch. As an example:
The arms contribute byte, short, int, and long as possible types, and the best common type algorithm will choose long as the resulting type of the switch expression.
Drawbacks
As with any proposals, we will be complicating the language further by doing these proposals. With this proposal, we will effectively lock ourselves into a design for sequence expressions (should we ever decide to do them), or be left with an ugly wart on the language where we have two different syntax for similar end results.
Alternatives
An alternative is the more general-purpose sequence expressions proposal, https://github.com/dotnet/csharplang/issues/377. This (as currently proposed) would enable a more restrictive, but also more widely usable, feature that could be applied to solve the problems this proposal is addressing. Even if we don't do general purpose sequence expressions at the same time as this proposal, doing this form of block-bodied switch expressions would essentially serve as a prototype for how we'd do sequence expressions in the future (if we decide to do them at all), so we likely need to design ahead and ensure that we'd either be ok with this syntax in a general-purpose scenario, or that we're ok with rejecting general purpose sequence expressions as a whole.
Unresolved questions
Should we allow labels/gotos in the body? We need to make sure that any branches out of block bodies clean up the stack appropriately and that labels inside the body are scoped appropriately.
In a similar vein, should we allow return statements in the block body? The example shown above has these, but there might be unresolved questions around stack spilling, and this will be the first time we would introduce the ability to return from inside an expression.
Design Meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-08-28.md#block-bodied-switch-expression-arms