bazelbuild / starlark

Starlark Language
Apache License 2.0
2.44k stars 160 forks source link

Return views instead of lists in various operations #203

Open brandjon opened 3 years ago

brandjon commented 3 years ago

Examples include dict.keys, dict.values, dict.items, enumerate, and zip. These operations return lists like in Python 2, but they could be made to return views like in Python 3.

Today, the fact that they return lists means that subscripting with integers is possible, e.g. mydict.keys()[0] to get the first key in the iteration over mydict. This is indeed allowed in Python 2 and not 3. We could make an incompatible change to prohibit this kind of operation. Alternatively, since Starlark has a well-defined iteration order for dictionaries, we could continue to support indexing, though it might not be constant time.

This issue came up during discussion of how to implement efficient views for Bazel's native.existing_rules operation. @tetromino

stepancheg commented 3 years ago

Please elaborate:

for k in d.keys():
  d[1] = 2

Would that be allowed? I. e. iteration over keys should:

laurentlb commented 3 years ago

fyi, the range function returns a view (not a list). You can do the same for native.existing_rules.

tetromino commented 3 years ago

Re @stepancheg - if we switch to views, I would suggest not allowing integer indexing of such views (e.g. no d.keys()[1], like in Python 3). So this would be an incompatible change.

On the other hand, we do want to optimize membership checking (e.g. we want if "foo" in d.keys() to be O(1) time).

ndmitchell commented 3 years ago

@tetromino note that @stepancheg is asking a different question - if you are iterating over a view, can you modify the underlying object? With views, you might expect the answer is no. Without views, you'd expect it to be yes.

tetromino commented 3 years ago

Ah, thanks for the explanation. I don't think that there is any reasonable way to support modifying the set of keys of the underlying dict. (And prohibiting it would mirror python3 behavior, where adding or deleting a dict's key during iteration throws a RuntimeError.)

I don't have a strong opinion about allowing modifying the values. But it would be pretty easy to support value modifications imho (maintain a modification counter in the dict, copy the counter into the view; if the counters don't match, the view self-materializes into a list).

tetromino commented 3 years ago

@brandjon pointed out an implementation concern: once the dict is frozen, the view might be used from multiple starlark threads, so materializing the a view into a list would need to be done in a thread-safe manner.

haxorz commented 9 months ago

I just came across some bzl code in Google's codebase that was running into this performance issue. Working around the issue reduced the package loading wall time of a single extreme package (for which this issue was a hotspot) from ~115s to ~16s!