gpuweb / gpuweb

Where the GPU for the Web work happens!
https://webgpu.io
Other
4.83k stars 319 forks source link

[wgsl] Comma operator #581

Open dj2 opened 4 years ago

dj2 commented 4 years ago

In WGSL the comma is not used to join multiple expressions together. Statements that have multiple expressions separated with commas can often be written in a different way without the comma operator.

My preference is to leave comma separated expressions out of the language. This has consequences for #579 and #569.

othermaciej commented 4 years ago

Is there a reason for the preference? Seems like not having some form of comma makes it difficult to make nice syntax for loops. That needs to be outweighed by some concrete downside, not just bare preference.

dj2 commented 4 years ago

Because I think this is going to open up a bunch of other syntax we could allow but I don't think we should. Things like var a,b,c : i32 = 2, 4, 5;. I'd prefer to just not have comma operator to start.

othermaciej commented 4 years ago

Supporting multi-variable initializers is not necessarily implied by support for comma expressions or comma statements. I don’t think the slippery slope argument is a valid reason to exclude something from the language.

(Also I don’t think the way you wrote it is a good way to do multi-initialization syntax anyway, even if we had it.)

litherum commented 4 years ago

C-family languages do support the comma, but don't support the syntax you describe in https://github.com/gpuweb/gpuweb/issues/581#issuecomment-592989791. This is an existence proof that we could potentially have a "comma operator with moderation."

I actually prefer the comma solution to https://github.com/gpuweb/gpuweb/issues/579. It's 1) familiar to C/C++ programmers, 2) gets rid of continuing, and 3) a faithful match to the original SPIR-V source. I don't think any other solution hits all 3 of these.

othermaciej commented 4 years ago

In fairness, C does have multi-initialization, but with different syntax: int x = 3, y = 4;. This isn't an instance of the comma operator though. Declaration statements don't accept an arbitrary expression, they are a unique syntax. A syntax which does not fit very well with WGLS's type declarations, though (and the cited multi-initialization syntax looks strange and would probably not be a good idea; it doesn't align with either statement comma or expression comma).

dj2 commented 4 years ago

I don't think because C/C++ have it is sufficient reason to add something to WGSL. Once it's added we can't take it away. So, we need a very good reason to add the construct to the language.

othermaciej commented 4 years ago

The reason to add it is reasonable loop syntax (as mentioned in the issues about adding good loop syntax). This seems like a very good reason.

C/C++ are only cited as counter-examples to your suggestion that if we add it, it might lead to things like var a,b,c : i32 = 2, 4, 5;.

dj2 commented 4 years ago

I think we should leave out comma until we have requests for it. You can still have for syntax with only single expressions and I think that would be sufficient to start. If you want more complicated expressions, the loop syntax is still available.

dneto0 commented 4 years ago

Familiarity with C is not a goal.

The comma operator adds unnecessary complexity. (Design DRY. See https://extensiblewebmanifesto.org/)

As for multi-variable updates in loops, I refer you to my comment at https://github.com/gpuweb/gpuweb/issues/579#issuecomment-596039499

The only reason you're seeing complexity is because your insistence on trying to make WGSL more like C.

kainino0x commented 4 years ago

Not x, y = 3, 4, but C's x = 3, y = 4; (as its own line) is an example of this comma operator. Would we want to allow that? (It's the same syntax that would be used in the for loop, just in a slightly different position.)

I want to leave the feature out completely, as having it at all adds complexity, and I really feel it will not be used very much. In many loops, it's perfectly valid to do it at the end of the loop body instead of in the continuing block, and for the ones where that's not valid, people can use a continuing block (or just put the extra statement before every continue in the loop, of which I would not expect there to be many).

othermaciej commented 4 years ago

Familiarity with C is not a goal.

No one cited it as a goal in this issue. Who are you rebutting?

The only reason you're seeing complexity is because your insistence on trying to make WGSL more like C.

Who is the "you" here? Please don't impute motivations to others. Either quote people or give your own thoughts on things.

Comma operator is a quantum of additional complexity, but if it lets us avoid the need loop/continue, it may produce a net reduction in complexity. And better to spend complexity points on familiarity than on weirdness. There's all sorts of familiar loop syntax but of all the ones I could think of, none has a loop syntax like loop/continue/break as the fundamental loop. That syntax is just syntactic salt on top of a C-style for loop.

I think we should discuss loop syntax in the issue about it. This comma side issue just makes things more heated, because comma's primary value would be to provide nicer loop syntax, and debating its standalone value generates more heat than light.

kainino0x commented 4 years ago

Just for some context for myself (I've been out so I'm super behind). Is the idea to both add the comma operator and remove continuing entirely? If not removing continuing, it doesn't reduce implementation complexity, only adds convenience (which I argue is minor).

othermaciej commented 4 years ago

I don't know what other folks think, but what I'd propose would be:

Then we can talk about whether while should be added as syntactic sugar on top of for. (The sugar can't be the other way around because the language has a continue statement but lacks goto, so this is the only direction that works without potentially needing to duplicate the loop condition.) I'm not sure how to implement do/while on top of any of the other loops without adding auxiliary variables or duplicating the loop body, though.

Regrettably, this discussion is spread among three different issues. In addition to this one about the comma operator, there is https://github.com/gpuweb/gpuweb/issues/569 about adding friendlier loop syntax (it proposes adding it as syntactic sugar on top of loop, but I'd more aggressively propose replacing loop entirely); and https://github.com/gpuweb/gpuweb/issues/579 about the continuing block in particular and alternatives to it.

dj2 commented 4 years ago

We don't need all the looping constructs you find in a C like language. As @damyanp mentioned in https://github.com/gpuweb/gpuweb/issues/569#issuecomment-593630597, the more we add the easier it is to have divergent implementations.

We want the language to be as small as possible, as stated in our goals. Part of that, in my opinion, is not adding 4 different looping constructs.

I'd prefer just loop/continuing but could see adding a simple for which only works with a single variable, no comma operator. The loop/continuing construct stands on its own without needing to introduce a comma operator and without having to have comma accept any block.

As @jdashg said in https://github.com/gpuweb/gpuweb/issues/569#issuecomment-593271924 theloop/continuing is verbose but works.

othermaciej commented 4 years ago

for with multi-statement iterator seems simpler to me than loop/continuing + single statement for.

"It's verbose but it works" is not exactly a ringing endorsement compared to "It's not verbose and it works".

I am puzzled at all the worry about divergence. If there's multiple loop types but some can be defined by desugaring, that is easy to implement and test. That's certainly the experience with JavaScript features that have simple desugaring. A test suite will also greatly aid interop. I'm really doubting that any implementation will screw up translating a while loop to a for loop. (Or a for loop into a loop loop if we decide to do it that way.)

othermaciej commented 4 years ago

BTW after thinking about it, it's clear to me how to desugar do/while to a for loop with the use of break, or a while loop or for that matter a loop loop so we could treat any of these as fundamental. But I'd also buy the argument that this isn't the most essential of loop types.

dj2 commented 4 years ago

Using for means having a comma operator which accepts blocks, this is more complex to me and has the potential for knock on effects.

Just having a loop/continuing is also fine, we don't need to add single statement for. Having just for means we have to add multi statement comma which accepts blocks. This then raises questions of where else comma can be used and why it isn't used there. loop/continuing is self contained.

othermaciej commented 4 years ago

for that accepts statements or blocks (without a comma, just surrounded by braces) is no more complex than continuing, which also accepts a block surrounded by braces.

In general, accepting a block anywhere that a statement could appear makes the language more orthogonal.

It feels like there's a lot of anchoring on the prototype version of the language. If we are successfully, there will be a lot of people reading and writing the language who are not used to reading SPIR-V assembly. There should be openness to evolving the surface syntax to address the syntax elements that have most confused people. There is a very small set of these things and it seems like they are all getting immediate strong pushback.

dj2 commented 4 years ago

I believe writing:

for (var i : i32 = 0; i < 10; {
  if (i > 10) {
    i = i - 5;
  } else {
    for (var y: i32 = 0; y < 5;  y = y + 1;) {
   }
}) {
}

to be more difficult to read then:

var i : i32 = 0;
loop {
   break unless (i < 10);

  continuing {
    if (i > 5) {
      i = i - 5;
    } else {
     var y : i32 = 0;
     loop {
       break unless (y < 5);

       continuing {
         y =  y + 1; 
     }
  }
}

Please don't assume we're anchoring on the prototype version. We've been thinking a lot about how to represent the more complex SPIR-V continuing block. We've gone through the various c based looping idioms and they all end up being, weird, when trying to shoehorn in the complex continuing.

The loop/continuing has confused some people, and been fine for others. It's different. There are going to be other parts of the language which are different. You need to learn the language, not assume it's C or C++ or HLSL or MSL or anything else.

othermaciej commented 4 years ago

How often does a construct like that come up in SPIR-V, and what is an example of of shader source code that would lead to it? Obviously no human would write either of those examples.

Note also that your example can be written with no use of continuing or compound for Increment, as it does not contain a continue. You’ve put the whole loop body inside the continuing block for no apparent reason, making both versions look more complicated than they have to.

othermaciej commented 4 years ago

Also, statements like this are needlessly condescending:

You need to learn the language, not assume it's C or C++ or HLSL or MSL or anything else.

Since we are still defining the language, it doesn’t seem appropriate or relevant to say “you need to learn the language” in response to proposals to improve the clarity of the language.

dj2 commented 4 years ago

I left out the middle of the loop because I didn't think it was relevant. Assume there is a continue in there.

Coming from SPIR-V, everything gets inlined, so if there was a function call in your continuing, it's inlined.

dj2 commented 4 years ago

I'm not trying to be condescending and I apologize.

kainino0x commented 4 years ago

You need to learn the language, not assume it's C or C++ or HLSL or MSL or anything else.

FWIW, I read this as the generic "you": "We are defining a new language, and people who develop in it will need to learn it, not assume...". I didn't read it as "Maciej, you need to learn the language, not assume...". The former is a valid opinion about how to go about designing the language.

othermaciej commented 4 years ago

I think the product of aggressive inlining is inevitably going to look weird. Let's consider a more slightly more natural handwritten example, reversing a buffer in place. (Ignore for the moment whether this is actually the most efficient way to write this).

Let's assume some initialization.

var len : i32 = ...; // some runtime computation
var buffer : array<int32, len>;
// initialize buffer here

Now the loop version of the loop:

var i : i32 = 0;
var j : i32 = len - 1;
loop {
    break if !(i < j);
    if (buffer[src] = buffer[dst])
        continue;
    var tmp : i32 = buffer[src];
    buffer[src] = buffer[dst];
    buffer[dst] = tmp;
    continuing {
        i = i + 1;
        j = j + 1;
    }
}

With for and blocks, this would be:

for ({var i : i32 = 0; var j : i32 = len - 1;} i < j; { i = i + 1; j = j + i;}) {
    if (buffer[src] = buffer[dst])
        continue;
    var tmp : i32 = buffer[src];
    buffer[src] = buffer[dst];
    buffer[dst] = tmp;
}

(This is assuming some magic so that the special blocks are the same lexical scope, otherwise at least one of i or j has to be initialized outside the loop.)

With comma:

for (var i : i32 = 0, var j : i32 = len - 1; i < j; i = i + 1, j = j + i}) {
    if (buffer[src] = buffer[dst])
        continue;
    var tmp : i32 = buffer[src];
    buffer[src] = buffer[dst];
    buffer[dst] = tmp;
}

The for versions are more concise and easier to understand without the need for study and thought. Would be even more concise with a += or ++ shortcut operator, type inference, and multiple declarations per var, but those things can all be discussed separately.

othermaciej commented 4 years ago

FWIW, I read this as the generic "you": "We are defining a new language, and people who develop in it will need to learn it, not assume...".

If that's how it was intended, then ok, not condescending. But I think it's still not a relevant response to proposals to make the learning curve easier. We are now in the process of defining the language. "people will have to learn", yes, but we can make it easier or harder.

kdashg commented 4 years ago

If continue/continuing is just syntactic sugar, we can desugar as:

var i : i32 = 0;
var j : i32 = len - 1;
loop {
    if (!i < j) break;
    if (!(buffer[src] = buffer[dst])) {
        var tmp : i32 = buffer[src];
        buffer[src] = buffer[dst];
        buffer[dst] = tmp;
    }
    i = i + 1;
    j = j + 1;
}

This seems like the most primitive way to express it, and I don't think it's bad. This is actually how I'd write it preferentially, instead of using the IMO hard-to-read for-with-comma-operator.

I think comma is outside of MVP.

othermaciej commented 4 years ago

loop-without-continuing avoids having a confusing construct, but it also makes implementing loops really grindy. The example above loops like assembly language. There is no need to force this level of grind on authors, or to make even carefully handwritten code harder to reason about.

kainino0x commented 4 years ago

After thinking about it for a bit, I don't think continue is just syntactic sugar. I couldn't figure out how to desugar this without changing the CFG.

loop {
  if (x) {
    a;
    if (y) {
      continue;
    }
    b;
  }
  continuing {
    c;
  }
}
kdashg commented 4 years ago
loop {
  if (x) {
    a;
    if (!y) {
      b;
    }
  }
  c;
}
kdashg commented 4 years ago

I'm fine with complicated loops being a little grindy in WGSL.

kainino0x commented 4 years ago

I'm fine with complicated loops being a little grindy in WGSL.

Same as long as they're not common (and I think they're not common especially in graphics and math, which is most shaders).


Right. I had drawn out the graph on paper and it actually was more like this (with d;).

loop {
  if (x) {
    a;
    if (y) {
      continue;
    }
    b;
  }
  d;
  continuing {
    c;
  }
}
kdashg commented 4 years ago

Yeah, I think that might need break N: (Or it would if it had any breaks to end the loop)

loop {
  loop {
    if (x) {
      a;
      if (y) {
        break 1;
      }
      b;
    }
    d;
  }
  c;
}

Now, this is convoluted to begin with, and I don't think it's too much worse without continue.

In general:

loop {
  A
  continuing {
    B
  }
}

Becomes

loop {
  loop {
    A.replaceAll('break;', 'break 2;').replaceAll('continue;', 'break 1;')
  }
  B
}
othermaciej commented 4 years ago

If shader languages compiled to SPIR-V produce complex loops with any regularity, I'd expect WGSL authors would also want to use complex loops often enough to matter.

A lot of folks have said that they think compound increment steps in a for loop are very rare. I decided to do a quick sanity check against a significant C++ code base, WebKit/Source/WebCore. I am not claiming it's representative; it was just easy for me to grep.

WebCore has:

A number of the loops with multi-part increment are in graphics code, a few handpicked samples:

WebCore/platform/mac/PasteboardMac.mm:        for (int i = 0; i < imageSpec->pixelsHigh / 2; ++i, topRow += imageSpec->bytesPerRow, botRow -= imageSpec->bytesPerRow) {
...
WebCore/platform/image-decoders/png/PNGImageDecoder.cpp:        for (int x = 0; x < width; ++x, pixel += 4, ++address) {
...
WebCore/platform/image-decoders/jpeg/JPEGImageDecoder.cpp:        for (unsigned i = 0; i < tagCount && end - ifd >= ifdEntrySize; ++i, ifd += ifdEntrySize) {
...
WebCore/platform/graphics/ShadowBlur.cpp:                for (i = 1; i < limit; ++i, prev += stride)

A more valid approach would be to search a large corpus of shaders. But I hope this demonstrates that compound increments are a normal (if not super frequent) idiom, and at least suggests that it can be particularly interesting in graphics code messing with buffer.

Does that mean we super need it? No. But if it also means we get to drop loop/continue (or loop/multi-level break) and have for as a fundamental loop type, this seems at least worthy of consideration. It will reduce weirdness, and unify human-writable and machine-generated faces of the language.

dneto0 commented 4 years ago

This issue has drifted from being about the comma operator and instead is about loops again, which is #579. That makes it hard to follow and probably harder to converge.

dneto0 commented 4 years ago

I actually prefer the comma solution to #579. It's 1) familiar to C/C++ programmers, 2) gets rid of continuing, and 3) a faithful match to the original SPIR-V source. I don't think any other solution hits all 3 of these.

I agree on 1, disagree on 2 (because of the complex-CFG-in-the-continuing-block case), and disagree on 3. By "disagree on X" I mean it's objectively false.

litherum commented 4 years ago

We should talk about this on the call today. I don’t understand why it’s objectively false. The proposal is not for an ordinary C-style comma; it’s for a supercharged statement-separator comma.

grorg commented 4 years ago

Discussed at 2020-03-10 Teleconference

kainino0x commented 4 years ago

If shader languages compiled to SPIR-V produce complex loops with any regularity, I'd expect WGSL authors would also want to use complex loops often enough to matter.

I believe the ~only time a SPIR-V shader would have a very complex continuing block is due to the inlining of a function in the loop-continue of (e.g.) a GLSL C-style for-loop. WGSL authors would not typically perform that inlining when writing code from scratch.

A number of the loops with multi-part increment are in graphics code, a few handpicked samples:

Unlike C++ graphics code, shaders do not commonly loop over pixels, because they are already parallelized over pixels. That said, it's still an interesting investigation. Can you comment on how many of those cases need a compound loop-continue and couldn't just put one or both of the statements at the end of the body? (i.e. how many of the loops use continue)

But if it also means we get to drop loop/continue (or loop/multi-level break)

Comma does not allow us to drop loop/continuing. We would have to allow a full-on block for(;;in this position). Since that would be invented syntax, I do not think it's better than the loop/continuing invented syntax (which has a little bit more precedent in Turing).

kainino0x commented 4 years ago

Relevant to the last reply above ^

The proposal is not for an ordinary C-style comma; it’s for a supercharged statement-separator comma.

Blocks have to be accounted for here (whether by calling blocks statements or otherwise).

dneto0 commented 3 years ago

To close the loop here:

@litherum asked

We should talk about this on the call today. I don’t understand why it’s objectively false.

@kainino0x answered it in the first part of his comment of March 10

Recommend closing this.

kdashg commented 3 years ago
WGSL meeting minutes 2021-07-13 * (close? Move to post-mvp?) * DN: Close! * RM/MM: Don’t close! * RM: Let’s postpone past mvp. * MM: Strongest arg from our perspective is using the comma op to avoid having to use the continuing block. There are other args, but that’s the most important one. * (postponed) * DN: It’d rather have something like a lambda block. * MM: Lambdas are good too.
kainino0x commented 3 years ago

I really, really think this should be closed. The comma operator is not a solution to the problem of including more in the for-continue block, as it works on expressions, and assignments are not expressions in WGSL. Comma operator would only help with this if we made assignments expressions, and I don't think there is a huge appetite to do so.

Instead, I would simply add a comma-separation syntax to the grammar of the for-loop itself.

ben-clayton commented 3 years ago

One alternative for the for-loop case is to simply allow block statements in the initializer and continuing:

for ({var i : i32; let n = 10;}; i < n; i = i + 1) {
}
for (var i : i32; i < 10; {doSomething(i); i = i + 1;}) {
}
kainino0x commented 3 years ago

I think that's an alternate syntax we considered instead of the continuing block, but I don't think it's worth more than the continuing block we already have. Seems to me either we allow something that looks like a comma operator so the syntax looks like C, or we just have people use continuing.

Kangz commented 2 years ago

Can this be closed, or at least moved post-V1?

kvark commented 2 years ago

I think we can consider the train to be gone for this at least in terms of V1.