ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.

https://ziglang.org

MIT License

34.83k stars 2.55k forks source link

Allow `break` and `continue` in inline loops #9524

Open Exonorid opened 3 years ago

Exonorid commented 3 years ago

Currently, trying to compile something like

pub fn main() void {
    const Foo = struct{a: ?u8, b: ?u8, c: ?u8, d: ?u8};

    var test_foo = Foo{.a = 5, .b = 6, .c = null, .d = 9};
    inline for(std.meta.fields(Foo)) |field_info| {
        if(@field(test_foo, field_info.name) == null) continue;
        doStuff(@field(test_foo, field_info.name));
    }
}

won't work because the continue is based on a runtime value. However, there's no reason not to allow this behavior at runtime by unwrapping it into something like

if(test_foo.a == null) goto for_b;
doStuff(test_foo.a);
for_b:
if(test_foo.b == null) goto for_c;
doStuff(test_foo.b);
//etc

Although I will admit that this example is contrived, I just wanted a simple example to illustrate my point. For more complicated control flow, it can greatly simplify code and make it more readable.

Exonorid commented 3 years ago

Sorry about the title, I wasn't paying attention

marler8997 commented 3 years ago

I think this breaks Zig. In your case, doStuff looks like it might be a runtime function call, but if it's a comptime function call, then whether or not it gets called depends on the result of the if/continue, which it can't determine at comptime. I suppose if everything following the continue was runtime only, maybe it could work?

ghost commented 3 years ago

In your case, doStuff looks like it might be a runtime function call, but if it's a comptime function call, then whether or not it gets called depends on the result of the if/continue, which it can't determine at comptime.

That looks reasonably well-defined to me. If doStuff is runtime, it is left as is. If it's comptime, and the condition guarding it cannot be evaluated, then doStuff is always run. It is only elided if it can be shown to be definitely unreachable. That's how it works in other cases as well:

const a = false;
if (a) @compileError("!"); // elided

var b = false;
if (b) @compileError("!"); // throws

That said, I lean towards keeping the "unroll fully or error" semantics of inline loops, as opposed to "unroll what you can". Otherwise, nothing would prevent you from accidentally generating a bunch of unwanted runtime checks, because you forgot to mark some var as comptime, or call a function that wasn't quite as comptime as expected.

AssortedFantasy commented 3 years ago

I think this breaks Zig.

Not really, the semantics of this seem pretty clear cut. You can currently work around what is probably a bug like this:

inline for (something) |stuff| {
  // expressions
  if (!condition) continue;
  // more expressions
}
// exactly equivalent to <=>
inline for (something) |stuff| {
  // expressions
  if (condition) {
    // more expressions
  }
}

inline for (something) |stuff| {
  // expressions
  if (!condition) break;
  // more expressions
}
// exactly equivalent to <=>
var is_breaked = false;
inline for (something) |stuff| {
  if (!is_breaked) {
    // expressions
    if (condition) {
      // more expressions
    } else {
      is_breaked = true;
    }
  }
}

Not sure about the coden on the simulated "break". It'd be more effient with a single jump to the end rather than what naively codegens into a chain of compare and jumps against the same condition.

CurtisFenner commented 3 years ago

As a workaround that doesn't force you to rearrange how your ifs are structured, you can use two "runtime" blocks/loops sandwiching the inline loop to get break/continue

outer: {
    inline for (arr) |e| {
        inner: {
                ......
                outer: break; // equivalent to a "runtime break"
                ......
                inner: break; // equivalent to a "runtime continue"
                ......
        }
    }
}

however you'll probably run into #2727 which is apparently still open (I have not verified this recently).

Using conditional break and continue doesn't have to affect unrolling at all; a break is just an unconditional jump to the end of the outer block, and a continue is just an unconditional jump to the end of the current iteration block.

The problem arises when you're trying to make the generated code omit some portion of the iteration, and you mistakenly think that the if is comptime but have failed to include the appropriate comptime annotation in the condition. I'm not sure that kind of situation is that common. The main cost to messing this up is extra generated dead code, which is not so bad but obviously hard to notice (otherwise, you might suffer type-checking errors, but that's easier to diagnose).

However, this problem already exists with other conditional runtime logic -- like if (...runtime...) return, or labeled breaks like in the workaround. So, I'm not sure specifically targeting break and continue makes sense.

Is there currently a such thing as inline break/comptime break ? For the case where you want to be sure the unrolling is being clipped, you'd use inline/comptime, and otherwise you get the flexible intuitive behavior

ghost commented 3 years ago

2727, #1609 and a host of other issues are still open, so you cannot `break` out of inline loops reliably. stage1 just doesn't like runtime control flow in inline loops at all. However, for stage2, we should seriously think about what the desired semantics is. On the one hand, a break or continue that is incorrectly deferred to runtime is merely a performance issue. On the other, explicit unrolling for perfomance is one of the major uses of inline loops, so we shouldn't dismiss silent failures to optimize as unimportant.

There's also the question whether something like this should be allowed:

inline while (runtime_cond) {
    if (comptime_cond) break; // causes loop to terminate without exhausting branch quota
    // do something ...
}

After all, this is exactly equivalent to

inline while (true) {
    if (runtime_cond) break;
    if (comptime_cond) break;
    // do something ...
}

If the latter is permitted, why not the former?

Personally, I would prefer it if direct break and continue of an inline loop were only allowed within comptime-evaluated control paths. Breaking and continuing at runtime would require the use of labelled break as shown above. This way, we don't lose any functionality, but gain a visual distinction between runtime and comptime control flow, and also get error messages when something that's supposed to be comptime isn't.

In short:

outer: {
    inline while (cond1) inner: {
        if (cond2) break;
        if (cond3) continue;
        if (cond4) break :outer; // break equivalent
        if (cond5) break :inner; // continue equivalent
        if (cond6) return;
    )
}
// cond 1-3: comptime only
// cond 4-5: can be runtime

AssortedFantasy commented 3 years ago

@zzyxyzz ~~I don't think your reasoning is correct.~~

inline while (true)

isn't a thing, wont really compile since it requires an infinite branch quota. That's why the thing inside of the inline while(cond) has to be comptime known.

Inlines while don't have very complicated semantics and I don't think we need to make such complex conditions on break and continues. They should work fine at runtime or comptime. See my comment for the semantics.

The reason is because "inline while" basically is just a copy paste of a block some compile time known number of times. Thats it. A runtime break is just a jump to the end of the blocks, a runtime continue is just a jump to the end of the current copy.

A comptime break lets you avoid emitting (statically eliminates) the rest of the blocks, a comptime continue lets you avoid emitting the rest of the current block.

Edit: inline while (true) will work if you have a comptime break. Since then it can stop copy pasting more blocks. So I see the point of your comment a bit more, but I still disagree about them.

inline while (runtime_condition)

is distinctly different because you don't know if runtime condition is true or false, so you can't decide to emit a block, but you spoke about it as though it should be assumed true.

ghost commented 3 years ago

@AssortedFantasy I was just pointing out the equivalence between the loop condition and a conditional break in the loop body. So if we want the loop condition to be comptime-known, we probably intend breaks and continues to be comptime as well. And vice versa, if runtime breaks and continues are allowed in an otherwise comptime loop, then why not also in the loop condition, provided that the loop still terminates at comptime?

I'm also not arguing that the semantics of mixed runtime/comptime conditional breaks is somehow ill defined. It's just that what gets included in the final binary and what doesn't is very much part of the intent of an inline loop, and muddying this distinction is not a great thing to do, IMO.

AssortedFantasy commented 3 years ago

@zzyxyzz This is exactly what I disagree about. I don't think they are equivalent, hence we can have it both ways. I don't think just because runtime breaks work somehow implies the loop conditions are allowed to be runtime.

Specifically, I think this is poorly defined:

inline while (runtime_cond) { if (comptime_cond) break; // causes loop to terminate without exhausting branch quota // do something ... }

The reason I don't think this is well defined, is because why should this start unrolling in the first place? The compiler uses inline whiles to unroll statements. If you don't know runtime_cond at compile time, the compiler can't choose to either start unroll, or not start unrolling. Its not true or false.

inline while (runtime_cond) <=> inline while (true) // <-- this right here is what I disagree with, why do we assume true.

At compile time you need to decide if you unroll or not. If make it so that "runtime_conds" are assumed true then yeah, sure, everything becomes inconsistent, but if you don't have that assumption then nothing is muddied imo.

If its still not clear, the reason is because I think inline while loop conditions have one extra "reponsibility", they decide if you should unroll a block. Which has to be known at comptime, hence they have to be comptime. Runtime breaks don't. Really sorry if this repetition is coming off as condescending, I'm just trying to be clear since I often don't make the most sense.

ghost commented 3 years ago

@AssortedFantasy,

The reason I don't think this is well defined, is because why should this start unrolling in the first place? The compiler uses inline whiles to unroll statements. If you don't know runtime_cond at compile time, the compiler can't choose to either start unroll, or not start unrolling. Its not true or false.

The compiler starts unrolling because it sees the inline attribute. If the loop condition cannot be evaluated at comptime, a runtime conditional break is inserted at the start of every unrolled instance. Runtime-conditional breaks and continues within the body are handled in the same fashion.

The unrolling terminates either when the loop condition evaluates to false at comptime, or when a break is reached on a comptime control path within the body. If neither happens, the compiler eventually hits the branch quota and aborts with an error.

Thus, from a purely mechanical perspective, all combinations of runtime and comptime control flow in inline loops are well-defined and at least somewhat sensible. Whether they conform to expectations is another matter, though. Personally, I tend to think about loops in desugared terms, and therefore expect the loop condition and conditional breaks to be treated equally. Which does not make the expectations of others any less valid, of course.

AssortedFantasy commented 3 years ago

@zzyxyzz

If the loop condition cannot be evaluated at comptime, [we assume its true/ start unrolling anyway] and a runtime conditional break is inserted at the start of every unrolled instance. Runtime-conditional breaks and continues within the body are handled in the same fashion.

I suppose we will just have to agree to disagree then. This entire conversation has just been more or less just me repeating the same thing about how In my head this isn't expected / isn't related to runtime breaks and continues / isn't very sensible even if it is possible that we define inline while semantics like this.

If you'd like to talk more feel free to message me on the Zig discord as KingStnap. Otherwise we might as well not bloat this issue further with more of the same thing.