dotnet / csharplang

The official repo for the design of the C# programming language
11.52k stars 1.03k forks source link

[Proposal] Labeled loops like in Java #1597

Closed AustinBryan closed 2 years ago

AustinBryan commented 6 years ago

Currently the cleanest way to break out of a nested loop is this:

for (int i = 0; i < 5; i++)
    for (int j = 0; j < 3; j++)
        if (i * j == 10) 
            goto outer;

outer: {}

If all we wanted to do was break out of the loops, we have to do outer: {}, which is sort've clunky. What we could do instead is use labeled loops like Java, so we can do this:

outer: for (int i = 0; i < 5; i++)
    inner: for (int j = 0; j < 3; j++)
        if (i * j == 10) 
            break outer;

And instead of:

int j = 0;

for (int i = 0; i < 5; i++)
    inner: for (; j < 3; j++)
        if (i * j == 10) 
            goto inner;

We can just do:

for (int i = 0; i < 5; i++)
    inner: for (j = 0; j < 3; j++)
        if (i * j == 10) 
            continue inner;

The alternative is cleaner, more intuitive, and simpler.

I saw another post where they suggest doing break 1; to break the first loop, but that doesn't refactor well if the number of loops changes, you have to carefully make sure the numbers are correct.

iam3yal commented 6 years ago

@AustinBryan Related #869

Joe4evr commented 6 years ago

Also relevant: dotnet/roslyn#5883

jnm2 commented 6 years ago

I like labeled break much better. Also would continue outer; work?

AustinBryan commented 6 years ago

@jnm2

Also would continue outer; work?

Yes, it should. I just tested it out on Java and it works for that so it should for C#. Labeled breaks are the cleanest, most straight foreward, and least error prone way to do manage nested loops.

quinmars commented 6 years ago

In the last three years I needed only once a goto statement to leave a nested foreach loop. Later on I replaced it by a single foreach loop thanks to LINQ. I don't realy understand the absolute avoidance of goto. There are rare cases (and they are very rare) where it comes in handy to use it and there is nothing wrong about it.

iam3yal commented 6 years ago

@AustinBryan I think that your second case was rejected, you can check the post @Joe4evr linked.

Neme12 commented 6 years ago

So break outer would essentially be the same as goto except that it would jump after the labeled statement? I don't see much value in that.

jnm2 commented 6 years ago

@Neme12 It's clearer about what's going on. More symmetrical with continue outer and guaranteed to be an acceptable kind of goto.

AustinBryan commented 6 years ago

@Neme12 I mean yes, except it reflects more on what we want to do. In this example, what I want to do is break outer, not goto some random empty statement. continue outer is much simpler than the alternative that uses goto

DavidArno commented 6 years ago

Currently the cleanest way to break out of a nested loop is this:

I disagree. the cleanest way is to exit the function, avoiding goto and break:

void LoopForTheFunOfIt()
{
    for (int i = 0; i < 5; i++)
        for (int j = 0; j < 3; j++)
            if (i * j == 10)
                return; 
}

If you end up with code that needs a goto, don't reach for labelled breaks, reach for the refactoring tools and split out a new method.

theunrepentantgeek commented 6 years ago

If you end up with code that needs a goto, don't reach for labelled breaks, reach for the refactoring tools and split out a new method.

I kind of agree - in code reviews, I usually see nested loops as a code smell indicating a need to break out another method.

iam3yal commented 6 years ago

@theunrepentantg

I usually see nested loops as a code smell indicating a need to break out another method.

Wouldn't that be an extreme? I mean, indeed, sometimes it's the case but saying that you usually tend to find nested loops a code smell when a code smell should indicate that there's a problem with the design not with how the code is structured but more importantly this statement shouldn't be taken as rule of thumb because what's applicable to one codebase might not be applicable to another, one might be more imperative in nature whereas the other can be more declarative and so it really depends.

Refactoring too much can also lead to readability and maintainability issues.

theunrepentantgeek commented 6 years ago

Refactoring too much can also lead to readability and maintainability issues.

I totally agree - readability, maintainability - and all the 'ilities (scalability, supportability, etc) are very important.

Nonetheless, I stand by the statement: Usually nested loops are a design smell.

a code smell should indicate that there's a problem with the design not with how the code is structured

I don't understand this - how can well-designed code be structured poorly? Design covers all scales - from the macro architecture (components, services, etc) to the micro (algorithm choice, method signatures, local variable naming, etc).

iam3yal commented 6 years ago

don't understand this - how can well-designed code be structured poorly?

@theunrepentantgeek I guess that this is where we disagree, you seems to think that nested loops is an indication of a code smell whereas I think that it might be the case but you don't always need to refactor nested loops especially in this case where the language can be improved a bit and make the goto statement more convenient to work with so just because some people decide to ban it or don't see any use for it doesn't mean people who do shouldn't use it and instead introduce methods into their codebases that wouldn't really improve anything.

You can structure the code in many different ways, you can structure it through a series of imperative statements, you can use a more procedural approach, you can use a more declarative approach and apply all kinds of functional programming techniques and yet the design can be broken so just because you refactored something into its own method and you gave it a name and it looks good doesn't mean you improved the design, in fact, it might mean the opposite but really context is important here and in this case the kind of codebase you work with should be considered before going out and tell people what they should or shouldn't do.

Now, some people would consider their codebase state of the art because it's refactored and follow awesome set of principles and whatnot only to find out that it's actually poorly designed.

DavidArno commented 6 years ago

...you seems to think that nested loops is an indication of a code smell...

Such code runs counter to the "code should describe the intent; not the mechanism" principle of easy-to-reason (or "reasonable") code. So if I write code like:

var found = false; 
for (var i = 0; i < range1; i++)
for (var j = 0; j < range2; j++)
    if (i * j == 10)
    {
       found = true;
       goto outer;
    }
outer: {}

I'm purely expressing the mechanism. The code is highly unreasonable. Therefore the next thing I'm likely to do is add a comment to explain its intent:

// loop through the two ranges and report if they contain factors of 10
var found = false; 
for (var i = 0; i < 5; i++)
for (var j = 0; j < 3; j++)
    if (i * j == 10)
    {
       found = true;
       goto outer;
    }
outer: {}

Now I have the intent, but I still have the mechanism there too. It's a mess. I'm not sure If it's a code smell, but it certainly is ugly and difficult to read. What I want is the intent only:

var found = RangesContainFactorsOfTen(5, 3);

The code is now trivially easy to reason. And when I want to understand the mechanism, I head off to that method:

private bool RangesContainFactorsOfTen(int range1, int range2)
{
    for (var i = 0; i < range1; i++)
    for (var j = 0; j < range2; j++)
        if (i * j == 10)
            return true;

    return false;
}

So I've achieved code that is far easy to reason and read, I've removed the comment and I've avoided the need for a goto or labelled break (which is just syntactic sugar for a goto).

C# is already complicated enough with a growing number of features with every release. Those new features should be focused on the "pit of success", ie making it easier to write good code. Labelled breaks are the opposite: they just offer another way, in addition to the existing one, of writing bad code. So they they have no place in future versions of the language in my view.

iam3yal commented 6 years ago

@DavidArno First of all, I agree, if there's a value in refactoring, you should do it but my point was that if you have a method that is self-explanatory you wouldn't want to refactor it further just because now you need a goto statement.

Those new features should be focused on the "pit of success", ie making it easier to write good code.

I didn't say that now you should use goto in every piece of code that you write but sparingly when you need it and if the language can help us then why not? now, of course, when you consider the amount of time people use it then indeed, it might not worth the efforts.

Labelled breaks are the opposite: they just offer another way, in addition to the existing one, of writing bad code.

I really don't understand why would you consider label breaks as "bad code" when at rare times it can actually improve the code.

Korporal commented 6 years ago

@DavidArno - In your examples you have no further processing after the label outer: so of course its easy to replace the logic as you have with two return statements. So your example seems deliberately contrived and not representative of the general case.

I recall many occasions where an organization had a coding standard "only one entry point and one exit point" and your example is reminiscent of (though not the same as) this kind of rule.

I recall the pain of having to follow that rule even in the parameter validation (the language had no exceptions) with mounting levels of nested if/then/else - ultimately greatly cluttering the code.

Being able to exit a loop at any arbitrary depth is in fact a powerful simplifying mechanism, its intent is obvious "where 100% done looping here" and its effect is clear "continue at the statement following the end of the exited loop".

Granted there are probably many cases where it can be avoided with possibly simpler design but I'm sure there are cases too were being unable to break like this forces the logic to fiddle around unnecessarily.

Mafii commented 6 years ago

@Korporal you seem to miss what @DavidArno said about extracting methods for each nested loop.

You can rewrite that code so that is easier to read, refactor and understand. Period. Replacing goto with loop lables doesn't help, it just moves the problem. Refactoring and extracting a method for every loop is the way to go. There are tools for this, but you should be able to do it on your own anyways. It is also less error-prone that nested loops in their raw form, and easier to refactor.

Korporal commented 6 years ago

@Mafii - By your reasoning then the existing keywords breakand continueare to be avoided in favor of refactoring to avoid such operations. I'm not suggesting gotobe used at all, only that breakbe optionally qualified by a label which must precede some lexically preceding looping construct, as described in the OP.

The existing break keyword can be read as: break current;where currentis an imagined label on the current loop so would you advocate that developers avoid using breakor continue?

DavidArno commented 6 years ago

@Korporal,

In your examples you have no further processing after the label outer: so of course its easy to replace the logic as you have with two return statements. So your example seems deliberately contrived and not representative of the general case.

"My" examples are simply @AustinBryan's examples, restructured to avoid the need for breaking out of a nested loop. I don't think Austin's examples are contrived at all.

I recall many occasions where an organization had a coding standard "only one entry point and one exit point" and your example is reminiscent of (though not the same as) this kind of rule.

It's the absolute opposite. In fact such refactoring only really works if that silly rule is ignored.

By your reasoning then the existing keywords breakand continueare to be avoided in favor of refactoring to avoid such operations.

You are missing the point. Folk tend to reach for break all too often. Yet all too often, the code can be simplified by using return. Code after the loop being a case in point. How often does that code contain an if to determine whether break occurred, or the loop ended? By using return, you avoid that extra layer of complexity.

It isn't a case of avoiding break and continue; they are often essential. But at the same time, it's always worth stopping and asking whether a return can be used instead. In my experience, it's only when one tries to break out of two levels at once that alarm bells ring and the code is near guaranteed to need refactoring to simplify it.

A real world example. I went looking for somewhere where I'd used break in my own code and found:

public static Option<T> TrySingle<T>(this IEnumerable<T> collection, Func<T, bool> predicate)
{
    if (predicate == null) throw new ArgumentNullException(nameof(predicate));

    var result = Option<T>.None();
    var count = 0;
    if (collection == null) return result;

    foreach (var element in collection)
    {
        if (!predicate(element)) continue;
        if (++count > 1) break;

        result = Option<T>.Some(element);
    }

    return count == 1 ? result : Option<T>.None();
}

This is an example of what I was talking about before. I test count and break if it's greater than one. Then I re-test count to determine the return value. I can ditch that break though in favour of a return and - in the process - remove the test after the loop:

public static Option<T> TrySingle<T>(this IEnumerable<T> collection, Func<T, bool> predicate)
{
    if (predicate == null) throw new ArgumentNullException(nameof(predicate));

    var result = Option<T>.None();
    var count = 0;
    if (collection == null) return result;

    foreach (var element in collection)
    {
        if (!predicate(element)) continue;
        if (++count > 1) return Option<T>.None();

        result = Option<T>.Some(element);
    }

    return result;
}

A tiny change, yet I've reduced the routes through the code and so reduced it's complexity and thus the likelihood of a bug.

jez9999 commented 6 years ago

Considering all the little enhancements they've been introducing to C# (like returning tuples) I think labeled loops with break/continue labels would be a straightforward, useful feature that should be no problem. I'm not sure why this hasn't been done yet.

Joe4evr commented 6 years ago

I'm not sure why this hasn't been done yet.

An LDT member already said this on the matter:

Because the requested change isn't going to happen.

Now, can we please close this thread and move on?

jez9999 commented 6 years ago

Yeah I noticed that. It's ridiculous. Is that how C# gets developed, one arrogant guy dictating everything with no justifications?

DavidArno commented 6 years ago

@jez9999,

The justification is obvious: you can already do this using goto. There are hundreds of new, original ideas that don't just add slightly different syntax for exactly the same thing that are, and should be, in front of this request.

CyrusNajmabadi commented 6 years ago

Yeah I noticed that. It's ridiculous. Is that how C# gets developed

Yes. The LDM decides what goes in the language. It's not a free-for-all where people just propose things and they just get added :)

Note: if you want, you can always fork the language and do whatever you want with it. However, MS controls the direction and decisions around C#-proper.

jez9999 commented 6 years ago

@DavidArno That's not the same, the goto label points to the end of the loop which is much less intuative / easy to read than pointing to the start of the loop. You also can't do a break with that goto; it has to be the equivalent of a continue. You'd need 2 goto labels to choose between a break or a continue.

Korporal commented 6 years ago

@Joe4evr - But what about the fact that this is (probably) quite easy to add to the language, involves no new radical concepts and potentially rather useful in general, this is what I would call "low hanging fruit" apart from the disagreements here (which reflect tastes) adding this should be pretty straightforward and carry pretty low risks.

HaloFour commented 6 years ago

@jez9999

Two labels to accomplish the same task which is considered relatively rare is not an onerous requirement.

@Korporal

There are tons of small ideas and most of them aren't bad. But an idea has to be sufficiently good to be considered.

Joe4evr commented 6 years ago

the goto label points to the end of thee loop which is much less intuative than pointing to the start of the loop.

I, sincerely, disagree entirely. You intend to jump out of the loop, so pointing to the start of the loop is entirely less intuitive.

There are already good, clean ways to break out of nested loops that you can use:

if (found) Console.WriteLine("We found a factor of ten");


Asking for something that actively makes code look *worse* should not be an option.
Korporal commented 6 years ago

@Joe4evr - Introducing state data (found) for the sole purpose of influencing execution flow is surely an admission that the language constructs available to us are insufficient?

I, sincerely, disagree entirely. You intend to jump out of the loop, so pointing to the start of the loop is entirely less intuitive

The label isn't to be regarded as "pointing" at the loop, but rather identifying the loop.

Additionally in the general case setting foundto truestill leaves open the possibility that a later developer might add code following that assignment, code which would be executed but should not be (if assigning trueto foundis to be taken as "exit the loop now"). That is to say that assigning trueto the state variable does not equate to exiting the loop, that will only happen next time the control gets to the start of the loop.

In fact your state variable foundcould justifiably be renamed to exit_this_loop_next_time_we_reach_the_start and when one does that one can see right away the subtle possibilities for complexity and bewilderment.

I've seen these contrived coding tactics many times and what can eventually happen is that the pattern gets continued, additional flags get added or even numeric variables get used and then we have code that's very hard to reason about, stuff like this:

if (something)
   found = 1;

if (found == 1 and number_of_iterations < 2)
   found = 2;

Wrting break inner_loop;entails no state variables and leaves us in no doubt as to what the execution flow will be when the conditions are met and enables the compiler to highlight any unreachable code that could get added after that breakstatement, something that a state variable cannot do.

This whole issues is rather simple IMHO - we either make it difficult to break out of nested loops (by forcing developers to use state variables and so on) or we make it extremely easy and intuitive and support break <label>;

CyrusNajmabadi commented 6 years ago

@Joe4evr - Introducing state data (found) for the sole purpose of influencing execution flow is surely an admission that the language constructs available to us are insufficient?

Of course. But insufficiency is not important enough to warrant a language change. After all, you can look at every part of hte language and find it insufficient in many ways.

The feature actually has to provide substantive value above and beyond just "insufficiency" to warrant all the work and effort to add to the language.

CyrusNajmabadi commented 6 years ago

This whole issues is rather simple IMHO - we either make it difficult to break out of nested loops (by forcing developers to use state variables and so on) or we make it extremely easy and intuitive and support break

Sure. And most people are absolutely fine with the former option. It's similar to every 'deficiency' in every language, where people are satisfied with the SOP of just using other available language constructs to address the issue :)

gafter commented 6 years ago

@Joe4evr - Introducing state data (found) for the sole purpose of influencing execution flow is surely an admission that the language constructs available to us are insufficient?

Restricting yourself to a subset of the language that excludes goto will enable you to demonstrate that subset to be insufficient.

Mafii commented 6 years ago

@Korporal

The existing break keyword can be read as: break current; where current is an imagined label on the current loop so would you advocate that developers avoid using breakor continue?

Well there are situations where they can be used, but again, you should avoid nested loops by all cost, and the few cases where labelled loops might actually add a bit of readability to the language (and even that is subjective, I don't find them more readable than goto's in these 1 in a million cases where refactoring is out of the discussion) just don't warrant a change, as it doesn't add enough value to the language, as at @HaloFour said.

It's not about the keyword. It's about complexity. And having nested loops doesn't make your code easier to read, so having even more paths instead of refactoring doens't help really, it's fighting symptoms.

Korporal commented 6 years ago

@CyrusNajmabadi - You wrote:

Sure. And most people are absolutely fine with the former option.

How did you establish this?

Korporal commented 6 years ago

@Mafii - Yet the C# language expressly permits nested loops. Surely if there was merit in your claim then the initial language designers would have taken steps to at least issue a compiler warning when it encounters a nested loop?

So even though you might be right about nested loops being "avoided at all costs" (and that is far from established) the fact remains they are supported and permitted by the language grammar and so developers will write them.

Your argument is against nested loops - which is fine, that might be a wise coding standard - but we are not discussing that we are discussing the labelling of loops and the option to use that label in a break statement given the fact that nested loops are fully supported (however undesirable some may consider them) it strikes me as an improvement to simplify the mechanism for terminating such loops.

HaloFour commented 6 years ago

@Korporal

And the language already offers a perfectly suitable way to both break from and continue arbitrarily nested loops. You just want a slightly different syntax for it, which effectively does the same thing but uses a different keyword (and moves the label around).

Korporal commented 6 years ago

While we're discussing the merits or want thereof of nested loops, it may be helpful to consider this example taken from this forum.

Nested Loops

struct index3d
{
  int i;
  int j;
  int k;
};

index3d findValue(int value, int*** array, index3d sizes)
{
  int i;
  int j;
  int k;
  for(i = 0; i < sizes.i; i++)
  {
    for(j = 0; j < sizes.j; j++)
    {
      for(k = 0; k < sizes.k; k++)
      {
        if (array[i][j][k] == value)
        {
          return {i, j, k};
        }
      }
    }
  }
  return {-1, -1, -1};
}

Not nested loops

struct index3d
{
  int i;
  int j;
  int k;
};

index3d findValue(int value, int*** array, index3d sizes)
{
  int i;
  for(i = 0; i < sizes.i * sizes.j * sizes.k; i++)
  {
    int a = i / (sizes.j * sizes.k);
    int b = (i - (a * sizes.j * sizes.k)) / (sizes.k);
    int c = i - (a * sizes.j * sizes.k) - (b * sizes.k);
    if (array[a][b][c] == value)
    {
      return {a, b, c};
    }
  }
  return {-1, -1, -1};
}

Now who here considers the latter to be simpler than the former? Which code fragment would be easier to analyze and debug if something were suspected as being wrong?

Korporal commented 6 years ago

@HaloFour - you wrote:

And the language already offers a perfectly suitable way to both break from and continue arbitrarily nested loops. You just want a slightly different syntax for it, which effectively does the same thing but uses a different keyword (and moves the label around).

Which is not a true statement Halo. I'm not endorsing merely a "slightly different" syntax at all, I'm endorsing a proposal that offers a different syntax and very specific semantic behavior too. Yes one can use gototo achieve this but gotodoesn't ensure that we only end the loop, it does more (or can) because the label could come after some other block of code that might execute if the loop ends "normally" and so on.

The break labelsupport is very specific to loops and leaves no doubt whatsoever as to how execution proceeds. It also reduces the scope for subtle bugs by absolutely guaranteeing that you do no more than exit the labelled loop.

My position is that since the language does fully support nested loop constructs then it is an improvement to openly acknowledge that fact and do what one can to reduce the scope for error or complexity for developers who elect to write such loops.

HaloFour commented 6 years ago

@Korporal

I'm not endorsing a "slightly different" syntax at all

You put the label in a different place and use a different keyword to achieve identical results. The LDM seems to think that this is perfectly sufficient for what should be a pretty rare situation.

Korporal commented 6 years ago

@HaloFour - If you truly consider the existing gotosupport to be semantically identical to the proposed break label support then I'm afraid we must continue to disagree on this.

Also a label is just that, an identifer for a statement. Simply because it can only currently be the target of a gotostatement does not means that a label can or should be regarded as inherently a goto"target". The gotokeyword has its semantics and an extended breakkeyword would likewise have its own different semantics.

Finally actively encouraging the use of gotoas many here are begs the question about just how rational are the "code quality" arguments some are putting forward to support their position!

HaloFour commented 6 years ago

@Korporal

If you truly consider the existing goto support to be semantically identical

No, but I do consider it to be a suitable workaround that looks very similar. I get that labeled loops offers stricter guide rails to help developers to not misplace the labels, but I don't find that to be that compelling of a reason to modify the language.

jez9999 commented 6 years ago

@HaloFour

You put the label in a different place and use a different keyword to achieve identical results. The LDM seems to think that this is perfectly sufficient for what should be a pretty rare situation.

And returning tuples from methods is just a case of returning the appropriate Tuple class and consuming it from the other end, yet C# just introduced a shorthand for that.

HaloFour commented 6 years ago

@jez9999

And returning tuples from methods is just a case of returning the appropriate Tuple class and consuming it from the other end, yet C# just introduced a shorthand for that.

ValueTuple struct, and you can't get name support that way, but yeah, close enough. The difference being that the team wants to encourage use of tuples and that the syntax builds into positional patterns. But, more importantly, it's something that the team wanted to do. This isn't.

Korporal commented 6 years ago

If the decision has already been made to give no further consideration to adding break labelsupport then so be it.

CyrusNajmabadi commented 6 years ago

The break labelsupport is very specific to loops and leaves no doubt whatsoever as to how execution proceeds.

Sure. I've already acknowledged value in what it does above and beyond what you have in the language today. The question is: is that value worth the language change? IMO, no. It's extremely minor, and there are approaches you can take today that are effectively as simple to deal with this using existing code-flow constructs. Sure, they may not have as many guard-rails as the proposal you're making. But, IMO, they're sufficient. And i would rather stick with "sufficient" unless the language change is really substantively improving things. :)

Mafii commented 6 years ago

@jez9999

And returning tuples from methods is just a case of returning the appropriate Tuple class and consuming it from the other end, yet C# just introduced a shorthand for that.

Yes, but it's an approach the LDM wants to support, and advocate. It's a new approach, and having named returns and everything supports that approach. if it's easier to use, it's going to be used more often. No one is going to not use loops just because we don't have labeled loops. For the rest, I can't say it better than Cyrus. It's not about how big the change is, but about what it really adds to the language. And what it propagates for the developers.

Korporal commented 6 years ago

I have an idea, why not rename the C# language? In fact what about F#? that sounds like a good name to me?

Mafii commented 6 years ago

@Korporal no need to be sarky, that's not really constructive. Look, not everyone has the same opinions on coding style and features. But don't be insulting or offended because of it. Not gonna help you in any way. I do disagree with some changes of the C# language too, but I accept that some developers are more experienced/have different points of view and/or opinions.

Korporal commented 6 years ago

@Mafii - I wasn't being "snarky" I was actually expressing sarcasm which is different and a legitimate means of expressing ones self. The motive was the apparent desire by the LDT to adopt features more commonly found in functional languages into what is actually an imperative OO language.

Favoring the introduction of functional mechanisms over something as rudimentary and helpful as breaking out of a labelled loop seems - to me - quite laughable.