Open brandjon opened 3 years ago
Please elaborate:
for k in d.keys():
d[1] = 2
Would that be allowed? I. e. iteration over keys
should:
keys
is a sequence type of type which is not list, which may be implemented as COW but this would be an implementation detail, not part of the spec?fyi, the range
function returns a view (not a list). You can do the same for native.existing_rules
.
Re @stepancheg - if we switch to views, I would suggest not allowing integer indexing of such views (e.g. no d.keys()[1]
, like in Python 3). So this would be an incompatible change.
On the other hand, we do want to optimize membership checking (e.g. we want if "foo" in d.keys()
to be O(1) time).
@tetromino note that @stepancheg is asking a different question - if you are iterating over a view, can you modify the underlying object? With views, you might expect the answer is no. Without views, you'd expect it to be yes.
Ah, thanks for the explanation. I don't think that there is any reasonable way to support modifying the set of keys of the underlying dict. (And prohibiting it would mirror python3 behavior, where adding or deleting a dict's key during iteration throws a RuntimeError
.)
I don't have a strong opinion about allowing modifying the values. But it would be pretty easy to support value modifications imho (maintain a modification counter in the dict, copy the counter into the view; if the counters don't match, the view self-materializes into a list).
@brandjon pointed out an implementation concern: once the dict is frozen, the view might be used from multiple starlark threads, so materializing the a view into a list would need to be done in a thread-safe manner.
I just came across some bzl code in Google's codebase that was running into this performance issue. Working around the issue reduced the package loading wall time of a single extreme package (for which this issue was a hotspot) from ~115s to ~16s!
Examples include
dict.keys
,dict.values
,dict.items
,enumerate
, andzip
. These operations return lists like in Python 2, but they could be made to return views like in Python 3.Today, the fact that they return lists means that subscripting with integers is possible, e.g.
mydict.keys()[0]
to get the first key in the iteration overmydict
. This is indeed allowed in Python 2 and not 3. We could make an incompatible change to prohibit this kind of operation. Alternatively, since Starlark has a well-defined iteration order for dictionaries, we could continue to support indexing, though it might not be constant time.This issue came up during discussion of how to implement efficient views for Bazel's
native.existing_rules
operation. @tetromino