ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Zig's += operator evaluates differently than C's #9386

Open floooh opened 3 years ago

floooh commented 3 years ago

Not sure if this is a bug or feature, I just stumbled over this while moving some C code over to Zig and this was a bit of a head scratcher.

Basically, if you have a statement like this:

bla.val += func(&bla);

...then Zig seems to first "capture" the value of bla.val before calling func() and then doing the addition, while C first seems to call func and then perform the addition. This leads to different behaviour if func() has the side effect of also modifying bla.val.

Godbolt example in Zig:

https://www.godbolt.org/z/E6xbd3839

...and in C:

https://www.godbolt.org/z/jvvr8ne5q
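The linked examples are not reproduced in the thread, so here is a minimal C sketch of the pattern described above. The names `bla_t` and `func` and the values 1, 100 and 5 are assumptions made for illustration, not taken from the Godbolt links.

```c
#include <stdio.h>

/* Hypothetical stand-in for the struct used in the linked examples. */
typedef struct { int val; } bla_t;

/* Side effect: overwrites val, then returns a value to be added. */
int func(bla_t *b) {
    b->val = 100;
    return 5;
}

/* Runs the statement under discussion, starting from val == 1. */
int compound_add(void) {
    bla_t bla = { 1 };
    bla.val += func(&bla);
    return bla.val;
}

/* Prints which ordering this particular compiler picked. */
void report(void) {
    /* 6: val was read before the call; 105: val was read after the call. */
    printf("bla.val = %d\n", compound_add());
}
```

Calling `report()` from `main` prints either 6 (the behaviour reported for Zig) or 105 (the behaviour reported for the C compilers). Which one appears depends on when the compiler reads `bla.val` relative to the call.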

The effect can be simulated in C by "unfolding" the expression (although I suspect that this actually depends on unspecified behaviour in C regarding the evaluation order):

// this behaves like Zig's +=
bla.val = bla.val + func(&bla);
// this behaves like C's +=
bla.val = func(&bla) + bla.val;

IMHO Zig should behave like C unless of course there's a reason not to, such as hidden gotchas in C that should be fixed (like the unspecified evaluation order; the less unspecified behaviour in Zig, the better).

Also see this twitter thread (there's some interesting info about C# behaviour in there):

https://twitter.com/FlohOfWoe/status/1415385408771399680

bfredl commented 3 years ago

bla.val += func(&bla);

If func modifies bla.val, isn't this a concurrent modification within the same sequence point, and thus not well defined by the C standard in the first place?

floooh commented 3 years ago

@bfredl, not sure, I'm not much of a language lawyer, but at least GCC, Clang and MSVC all agree on the same behaviour :)

bfredl commented 3 years ago

but at least GCC, Clang and MSVC all agree on the same behaviour :)

For all cases of this pattern with optimization enabled, or just this one example in debug mode? :)

At least the draft C99 standard says E1 += E2 should have the same semantics as E1 = E1 + E2 (except that the lvalue E1 is evaluated only once), so it is at least unspecified behavior, as you mentioned.
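The "lvalue evaluated once" part of that rule can be made visible with a small sketch. The helper `slot()` and the `calls` counter are hypothetical names introduced here; the point is only to count how often the left-hand expression is evaluated in each form.

```c
int slots[1] = { 10 };
int calls = 0;   /* counts evaluations of the lvalue expression */

/* Hypothetical helper: yields the address used as E1 and records each call. */
int *slot(void) {
    calls++;
    return &slots[0];
}

/* E1 += E2: the lvalue expression *slot() is evaluated once. */
int calls_compound(void) {
    calls = 0;
    *slot() += 1;
    return calls;
}

/* E1 = E1 + E2 written out by hand: *slot() is evaluated twice. */
int calls_expanded(void) {
    calls = 0;
    *slot() = *slot() + 1;
    return calls;
}
```

This only shows how the two forms differ in evaluating E1. It does not settle whether the value of E1 is read before or after E2 is evaluated, which is the question of this issue.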

floooh commented 3 years ago

The C code has been working for the last few years both with optimization enabled (-Os and -O3) and disabled across GCC, Clang and MSVC and 32- and 64-bit x86, 64-bit ARM and WASM (and the highest warning levels). So even if it's unspecified in the standard, the compiler vendors seem to have agreed on one common behaviour. But I agree that it would be a good thing if such things are more clearly defined in Zig.

mikdusan commented 3 years ago

for convenience this is C2x (consistent with C17 and C11):

image

and this answer from stackoverflow

floooh commented 3 years ago

Would probably make sense to also look in detail at C's += specification separately, this is under "6.5.16 Assignment Operators". For instance, whether the right side is evaluated first before the += takes place?

There's this blurb, but my "spec-foo" isn't quite good enough to decipher :)

Screen Shot 2021-07-16 at 10 26 30 AM

martinhath commented 3 years ago

I'm fairly sure this is straight up UB in C(++), but just to point out another thing,

// this behaves like Zig's +=
bla.val = bla.val + func(&bla);
// this behaves like C's +=
bla.val = func(&bla) + bla.val;

Both of these are the same in C(++), and probably also Zig(?), because there's no guarantee of left-to-right evaluation of subexpressions. I think you meant

// this behaves like Zig's +=
var t0 = bla.val;
var t1 = func(&bla);
bla.val = t0 + t1;
// this behaves like C's +=
int t0 = func(&bla);
int t1 = bla.val;
bla.val = t0 + t1;

or something along those lines, which is properly defined and all is well in either language :smile:
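For anyone who wants to run the two orderings, a self-contained C sketch of the same idea follows. The struct `bla_t`, the body of `func` and the concrete values are assumptions for illustration only.

```c
/* Hypothetical stand-in for the struct from the issue. */
typedef struct { int val; } bla_t;

/* Side effect: overwrites val, then returns a value to be added. */
int func(bla_t *b) {
    b->val = 100;
    return 5;
}

/* Read val first, then call func: the behaviour reported for Zig's +=. */
int read_then_call(void) {
    bla_t bla = { 1 };
    int t0 = bla.val;      /* 1 */
    int t1 = func(&bla);   /* 5, and val is now 100 */
    bla.val = t0 + t1;
    return bla.val;        /* 1 + 5 */
}

/* Call func first, then read val: the behaviour reported for C's +=. */
int call_then_read(void) {
    bla_t bla = { 1 };
    int t0 = func(&bla);   /* 5, and val is now 100 */
    int t1 = bla.val;      /* 100 */
    bla.val = t0 + t1;
    return bla.val;        /* 5 + 100 */
}
```

Because every step is its own statement, the order is fixed by the sequence points between statements and no longer depends on the compiler.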

Validark commented 2 years ago

"with respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation" I could be wrong but that sounds like:

// (shortened, assume the address of bla.val wasn't changed by func, see the pic below)
int t0 = func(&bla);
bla.val += t0; // is this a "single evaluation"??

Here's the full context from here image

Another clue is that the atomic version explicitly evaluates E2 before grabbing the value in addr, the saved address of E1. Looks like defined behavior to me, but I've never looked at this spec before so I haven't read the working definitions used here.
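As a rough illustration of that ordering (E2 first, then the load through the saved address), here is a C11 sketch using `<stdatomic.h>`. It is written from the description in this comment rather than quoted from the standard, and `atomic_add_sketch`, `counter` and `bump` are hypothetical names.

```c
#include <stdatomic.h>

_Atomic int counter = 1;

/* Hypothetical E2 with a side effect on the same object. */
int bump(void) {
    atomic_store(&counter, 100);
    return 5;
}

/* Sketch of `*addr += e2()` for an atomic int, in the order described above. */
int atomic_add_sketch(_Atomic int *addr, int (*e2)(void)) {
    int val = e2();                /* E2 is evaluated first */
    int old = atomic_load(addr);   /* only then is the current value read */
    int desired;
    do {
        desired = old + val;
        /* On failure, old is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_strong(addr, &old, desired));
    return desired;
}
```

With this ordering, `atomic_add_sketch(&counter, bump)` sees the value written by `bump` (100) and stores 105, which matches the "call first, then read" behaviour discussed in this issue. Whether the same ordering is required for non-atomic operands is exactly what remains unclear.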

sharpobject commented 2 years ago

Some people interpret "with respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation" instead to apply to situations like z = foo() + (x += y), where it would mean "the x += y will not be interleaved with the execution of foo(), it will happen entirely before or entirely after foo(), which would not be guaranteed if you wrote z = foo() + (x = x + y)."

Unfortunately this interpretation doesn't say anything about the topic of this issue.
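Under that reading, z = foo() + (x += y) has exactly two permitted outcomes, which can be spelled out with explicit statements. This is a sketch only: the body of `foo` and the starting values are assumptions chosen to make the two outcomes differ.

```c
int x;

/* Hypothetical foo(): clobbers x and returns 1. */
int foo(void) {
    x = 100;
    return 1;
}

/* Outcome if foo() runs entirely before x += y. */
int foo_first(int y) {
    x = 1;
    int f = foo();      /* x becomes 100 */
    int s = (x += y);   /* x becomes 100 + y */
    return f + s;
}

/* Outcome if x += y completes entirely before foo(). */
int foo_last(int y) {
    x = 1;
    int s = (x += y);   /* x becomes 1 + y */
    int f = foo();      /* x becomes 100 */
    return f + s;
}
```

With y == 2, `foo_first` returns 103 and leaves x at 102, while `foo_last` returns 4 and leaves x at 100. What the rule would exclude is a third, interleaved outcome in which x is read before foo() runs and written back afterwards.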