carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
https://github.com/carbon-language/carbon-lang/blob/trunk/README.md

Use separate value stores for identifiers and string literals #4106

Closed chandlerc closed 2 days ago

chandlerc commented 2 days ago

This undoes a previous change that unified them, which I think was made on my advice. =[ Sorry about that, I think I was just wrong.

I think I had suggested that a single shared hashtable of strings would be more efficient. The more I look at profiles of the toolchain, the less likely that seems, and sharing seems especially problematic for identifiers and string literals.

A single, joint hashtable is likely a good idea when all of the different querying code paths are equally likely, the strings follow the same distribution of sizes, and either accesses to the different sets of strings are not clustered or none of the sets is small enough to fit into a lower level of cache on its own.

I think essentially none of these predicates actually hold for identifiers vs. string literals:

Sorry for the misleading advice on that one.

While splitting them, I've worked to simplify the code a bit by building a way for the canonical value stores that hold StringRef values to work without specializations, so we get a pretty large code cleanup in the process here.
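
To make the shape of the split concrete, here is a minimal sketch of one generic canonicalizing store that is instantiated once for identifiers and once for string literals. It is not the code in this PR: the names `CanonicalStringStore`, `StringId`, `IdentifierTag`, and `StringLiteralTag` are hypothetical, and it uses standard library containers with `std::string_view` standing in for the toolchain's own StringRef and hashtable types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical tag types: each one yields a distinct, non-interchangeable ID.
struct IdentifierTag {};
struct StringLiteralTag {};

// A typed index into one specific store.
template <typename Tag>
struct StringId {
  std::int32_t index;
  friend bool operator==(StringId lhs, StringId rhs) {
    return lhs.index == rhs.index;
  }
};

// One generic store; every instantiation owns its own hashtable, so
// identifier lookups never touch the string literal table and vice versa.
template <typename Tag>
class CanonicalStringStore {
 public:
  using Id = StringId<Tag>;

  // Returns the existing ID for `value`, or stores it and returns a new ID.
  Id Add(std::string_view value) {
    if (auto it = map_.find(value); it != map_.end()) {
      return it->second;
    }
    Id id{static_cast<std::int32_t>(values_.size())};
    // std::deque keeps references to existing elements valid on growth,
    // so the string_view keys in map_ stay valid.
    values_.emplace_back(value);
    map_.emplace(std::string_view(values_.back()), id);
    return id;
  }

  // Returns the canonical string for a previously returned ID.
  std::string_view Get(Id id) const { return values_[id.index]; }

  std::size_t size() const { return values_.size(); }

 private:
  std::deque<std::string> values_;
  std::unordered_map<std::string_view, Id> map_;
};
```

With this shape, `CanonicalStringStore<IdentifierTag>` and `CanonicalStringStore<StringLiteralTag>` share one implementation but keep separate tables, and passing an identifier ID to the string literal store is a compile error rather than a silent mix-up.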