This undoes a previous change to unify them, and I think at my advice. =[ Sorry about that, I think I was just wrong.
Specifically, I think I had suggested that it would be more efficient to have a single shared hashtable of strings. The more I look at profiles of the toolchain, the less likely that seems. Specifically for identifiers and string literals it seems especially problematic.
Using a single, joint hashtable is likely a good idea when all of the different querying code paths are equally likely, the strings follow the same distribution of sizes, and either there is no clustering of access to different sets of strings or none of the sets are meaningfully small enough to fit into a lower level of resident cache.
I think essentially none of these predicates actually hold for identifiers vs. string literals:
Identifiers are much more hot
They have wildly different size distributions.
The access patterns are very clustered
Sorry for the misleading advice on that one.
While splitting them, I've worked to simplify the code a bit by building a way to have the StringRef holding canonical value stores not require specializations, and so we get a pretty large code cleanup in the process here.
This undoes a previous change to unify them, and I think at my advice. =[ Sorry about that, I think I was just wrong.
Specifically, I think I had suggested that it would be more efficient to have a single shared hashtable of strings. The more I look at profiles of the toolchain, the less likely that seems. Specifically for identifiers and string literals it seems especially problematic.
Using a single, joint hashtable is likely a good idea when all of the different querying code paths are equally likely, the strings follow the same distribution of sizes, and either there is no clustering of access to different sets of strings or none of the sets are meaningfully small enough to fit into a lower level of resident cache.
I think essentially none of these predicates actually hold for identifiers vs. string literals:
Sorry for the misleading advice on that one.
While splitting them, I've worked to simplify the code a bit by building a way to have the
StringRef
holding canonical value stores not require specializations, and so we get a pretty large code cleanup in the process here.