bazelbuild / starlark

Starlark Language
Apache License 2.0
2.43k stars 160 forks source link

Spec incorrectly claims that predeclared bindings cannot be reassigned #282

Closed haberman closed 1 month ago

haberman commented 1 month ago

In the Name binding and variables section, it says:

The set of predeclared names includes the universal constant values None, True, and False, and various built-in functions such as len and list; these functions are immutable and stateless. [...] Starlark programs cannot change the set of predeclared bindings or assign new values to them.

However both Bazel and https://github.com/google/starlark-go allow reassignment of True:

True = 3
print(True)

I initially considered this to be a bug in the implementations rather than the spec, since it seems like a good idea to prevent rebinding of the predeclared names. But then I realized that this would make it impossible to add new predeclared names in a backwards-compatible way. So perhaps this is a necessary evil.

laurentlb commented 1 month ago

There is shadowing. Values in the "predeclared" block can be shadowed by values in the module or file block. It's all resolved statically (unlike Python).

Indeed, this was designed with backwards compatibility in mind.

haberman commented 1 month ago

Ah, my bad. I can see how this is considered to be shadowing rather than rebinding. A note to clarify could help, but I see now that this is not a bug in the spec.

laurentlb commented 1 month ago

The spec mentions later:

The outermost block of the Starlark environment is known as the "predeclared" block. It defines a number of fundamental values and functions needed by all Starlark programs, such as None, True, False, and len, and possibly additional application-specific names.

These names are not reserved words so Starlark programs are free to redefine them in a smaller block such as a function body or even at the top level of a module. However, doing so may be confusing to the reader. Nonetheless, this rule permits names to be added to the predeclared block in later versions of the language (or application-specific dialect) without breaking existing programs.

But I agree that the first quote can cause confusion.

On a side-note, this code shows the difference with Python:

print(len)
len = 2

Python3 will print <built-in function len> while Starlark has an error before the evaluation (which proves that we are not modifying len).

haberman commented 1 month ago

Thanks for the explanation.

I found another issue -- running it by you here before I open another issue, in case I'm misunderstanding again.

The spec says:

The sets of names bound in the file block and in the module block do not overlap: it is an error for a load statement to bind the name of a global, or for a top-level statement to bind a name bound by a load statement.

I take this to mean that the following Starlark should be rejected:

# In a .bzl file, this code is rejected.
# In a BUILD file, it prints "3"
load(":test.bzl", "a")
a = 3
print(a)

In my testing, I find that this fragment is rejected in a .bzl file (good), but accepted in a BUILD file. This seems contrary to the spec. Am I missing something?

laurentlb commented 1 month ago

Correct, this should be rejected.

When we decided on this behavior, there were many existing BUILD files, so we added an exception for them and postponed the migration. Ideally, a migration flag should be added to Bazel and later switched.

It's worth filing an issue on Bazel.

haberman commented 1 month ago

That makes sense.

I also notice that the spec makes no mention of the leading-underscore-is-private convention used by Bazel. I guess this is already mentioned in https://github.com/bazelbuild/starlark/issues/37.

laurentlb commented 1 month ago

See https://github.com/bazelbuild/starlark/blob/master/spec.md#load-statements

The literal string ("x"), which must denote a valid identifier not starting with _, specifies the name to extract from the loaded module. In effect, names starting with _ are not exported.