Closed tiif closed 4 days ago
(I have updated the issue title and description.)
There exist many string interning crates, are you leaning towards any specific one?
None of them are usable I think, since the strings need to be interned in machine memory.
This should be just a FxHashMap<Vec<u8>, Pointer<Option<Provenance>>>
.
Isn't this also part of a more general gap in allocate_str? For example, certain string constants in rustc const evaluation? https://github.com/rust-lang/rust/blob/f944afe3807104803976484e7ee3aff49b6ac070/compiler/rustc_const_eval/src/util/caller_location.rs#L24 "See https://github.com/rust-lang/rust/pull/89920#discussion_r730012398" I'm guessing the performance characteristics required are different, as well as what the cache "value" should be. But thought it was worth mentioning.
Also, though currently I don't think there any current MiriMemoryKind::Machine allocate_str calls where this isn't the case, isn't an additional necessary check for interning these strings that mutability is Mutability::Not? That "seems" obvious to me, but wanted to call it out.
caller_location strings could also use the cache, yeah. In fact the entire caller_location could be cached, but it's harder to handle that with the same generic infrastructure.
And indeed, only immutable strings should be cached.
Currently, in
localtime_r
and also for some cases where we raise panics, we allocate a fresh string each time. These strings never get deallocated so they just keep accumulating. Conceptually these are static strings so we can deduplicate them with a global cache.The cache should be used everywhere that
allocate_str
is called withMiriMemoryKind::Machine
.