carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
https://github.com/carbon-language/carbon-lang/blob/trunk/README.md
Other
32.31k stars 1.48k forks source link

Remove another hashtable iteration order dependency. #4070

Closed chandlerc closed 1 week ago

chandlerc commented 1 week ago

Name scopes store the names in their scope in a DenseMap. Several places reasonably avoid depending on the iteration order by sorting the names -- they're in the formatting code path where that's a solid approach.

Unfortunately, when we're importing one scope into another, we also need to walk the entire scope and do something for each name. =[ This doesn't seem like a great place to sort things to stabilize them.

I've switched to a fairly simplistic solution of having a vector of name entries that can be iterated stably, and a separate map for lookups. I didn't use the set-of-indices trick here because it's not clear that's the right trade-off for a scope: likely a lot of small scopes here with relatively hot name lookups. And the key here isn't a large or dynamically sized thing that we're canonicalizing, it's a NameId. That made me lean towards duplicating the name in the hashtable for lookup and the vector for iteration.

I thought about a fancy approach of sorting the hashtable keys by their values (the indices), but that would still require a bit of copying and more code.

I also thought a bit about other optimizations, but decided to leave a comment for now -- it's not obvious to me exactly how hot this is and whether it's better served by faster lookups, being more memory dense, etc. And that might involve more of an SOA layout change or some other approach. Rather than do that here, and especially before switching hashtables, I stuck with a simple approach to address the ordering.

chandlerc commented 1 week ago

Looks like this is also fixing at least one iterator invalidation bug :)

Just for posterity -- I don't think so... but maybe in a very surprising way. Because this is a DenseMap and not a SmallDenseMap, the iterator into that map isn't invalidated by the map being moved when the vector containing scopes (and thus the map object itself) grows and reallocates.

This PR adds a direct use of the parent scope, which is what actually blows up here when we try to access parent_scope->names after injecting more scopes.

But ... yeah, that was a surprising part of this PR to debug.