mozilla / mentat

UNMAINTAINED A persistent, relational store inspired by Datomic and DataScript.
https://mozilla.github.io/mentat/
Apache License 2.0

Directly intern values from an input str slice #327

Open rnewman opened 7 years ago

rnewman commented 7 years ago

Each of our structs keeps an owned value.

To avoid keeping thousands of duplicated strings, we want to use Rc here.

But even with InternSet and Rc, we end up creating garbage:

It would be good to come up with a strategy for going straight from the &str to the Rc<Keyword>. This might be a factory method provided to the parser. It might be an extended EDN representation (edn::Value<InternedString>?). It might be a kind of Borrow hook.
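As a rough illustration of the "straight from the &str" idea, here is a minimal sketch that interns to Rc<str> rather than Rc<Keyword>. It is not Mentat's InternSet: `StrInterner`, `intern`, and `intern_tokens` are hypothetical names made up for this example. It relies only on the standard library fact that `Rc<str>` implements `Borrow<str>`, so a `HashSet<Rc<str>>` can be probed with a plain slice and an allocation happens only the first time a string is seen.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical interner for illustration; not Mentat's `InternSet`.
#[derive(Default)]
struct StrInterner {
    set: HashSet<Rc<str>>,
}

impl StrInterner {
    /// Returns the shared `Rc<str>` equal to `s`.
    /// Allocates only when `s` has not been interned before.
    fn intern(&mut self, s: &str) -> Rc<str> {
        // `Rc<str>: Borrow<str>`, so the lookup takes the slice directly
        // and no temporary `String` or `Rc` is built for a hit.
        if let Some(existing) = self.set.get(s) {
            return Rc::clone(existing);
        }
        // First sighting: copy the bytes once into a new `Rc<str>`.
        let fresh: Rc<str> = Rc::from(s);
        self.set.insert(Rc::clone(&fresh));
        fresh
    }

    /// Number of distinct strings interned so far.
    fn len(&self) -> usize {
        self.set.len()
    }
}

/// Hypothetical stand-in for a parser: splits on whitespace and interns
/// each token as it is read from the input slice.
fn intern_tokens(interner: &mut StrInterner, input: &str) -> Vec<Rc<str>> {
    input
        .split_whitespace()
        .map(|token| interner.intern(token))
        .collect()
}
```

Calling `intern_tokens` on an input such as ":foo/bar :foo/bar :foo/baz" should hand back the same Rc for both occurrences of `:foo/bar`. Doing the same for Rc<Keyword> is harder, because `Rc<Keyword>` only implements `Borrow<Keyword>`; looking it up by `&str` would need something like a newtype wrapper with its own `Borrow<str>` impl, which is presumably what a "Borrow hook" would amount to.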

It's also possible that LLVM will optimize away that allocation… but I doubt it, particularly in the way we currently work (which means collecting all of these Keywords and getting rid of the duplicates later).

This ticket involves a fair amount of good Rust judgment, so it's not suitable for beginners.

ncalexan commented 7 years ago

I wonder if we should get one of the Servo string cache developers to talk through their use case and how they approach this problem.