google / starlark-go

Starlark in Go: the Starlark configuration language, implemented in Go
BSD 3-Clause "New" or "Revised" License
2.31k stars 209 forks source link

compile: don't optimize (a+"b")+"c" to (a+"bc") if we don't know the type of a #213

Open oprypin opened 5 years ago

oprypin commented 5 years ago

I run this snippet:

print( ("a" + ["b"]) + ["c"] )

And Starlark says:

unknown binary op: "a" + ["b", "c"]

But it wasn't supposed to get to the value ["b", "c"] according to the brackets!

To get the error message above (not just types) I made this modification:

--- a/starlark/eval.go
+++ b/starlark/eval.go
@@ -950,7 +950,7 @@ func Binary(op syntax.Token, x, y Value)

        // unsupported operand types
 unknown:
-       return nil, fmt.Errorf("unknown binary op: %s %s %s", x.Type(), op, y.Type())
+       return nil, fmt.Errorf("unknown binary op: %s %s %s", x, op, y)
 }

 // It's always possible to overeat in small bites but we'll
alandonovan commented 5 years ago

The compiler performs an optimization that transforms an expression of the form a + b + ... + z (where b through z all have the same "addable" type---one of int literal, string literal, tuple expression, or list expression) into a + k where k is the statically computed sum or concatenation of b + ... + z.

This optimization does assume that the leftmost operand's definition of add is associative, which is true for all core types and for all sensible user-defined types. But that should probably be documented at HasBinary.

oprypin commented 5 years ago

OK but how is it allowed to transform (a + b) + c into a + (b + c)? I have explicitly chosen this prioritization and order of operations, surely it can't just directly go against that?

The issue is not about the plain a + b + c, I figured that's what was happening.

BTW, in Bazel these two produce different outputs: print( select({"a": ["b"]}) + ["c"] + ["d"] ) print( select({"a": ["b"]}) + (["c"] + ["d"]) )

alandonovan commented 5 years ago

OK but how is it allowed to transform (a + b) + c into a + (b + c)? I have explicitly chosen this prioritization and order of operations, surely it can't just directly go against that?

A compiler can do whatever it likes so long as you can't tell.

BTW, in Bazel these two produce different outputs: print( select({"a": ["b"]}) + ["c"] + ["d"] ) print( select({"a": ["b"]}) + (["c"] + ["d"]) )

You're correct that they are represented differently, and also that the str(x) output reflects the representation, so the optimization is observable, so strictly speaking this is a bug. But aside from their string forms, the two select values are otherwise indistinguishable because selects behave associatively, so you would still get the same Bazel build no matter which one you used.