Each of our structs keeps an owned value. We wish to use Rc here to avoid having thousands of duplicated strings.
But even with InternSet and Rc, we end up creating garbage:
1. Parser sees a &str.
2. Parser creates a struct like Keyword, which clones the slice into a new String.
3. Parser or consumer looks up the Keyword (wrapped in an Rc) in InternSet, which drops the new keyword and returns the existing one.
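The steps above can be sketched roughly as follows. This is only an illustration: the Keyword and InternSet definitions here are simplified, hypothetical stand-ins (a newtype over String and a wrapper around HashSet), not the real types, and parse_keyword is an invented helper standing in for the parser.

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Hypothetical stand-in for the real Keyword: it owns its text.
#[derive(Debug, PartialEq, Eq, Hash)]
struct Keyword(String);

// Hypothetical stand-in for InternSet: a set of shared, deduplicated values.
#[derive(Default)]
struct InternSet {
    inner: HashSet<Rc<Keyword>>,
}

impl InternSet {
    // Returns the already-interned value if present; otherwise stores `value`.
    fn intern(&mut self, value: Rc<Keyword>) -> Rc<Keyword> {
        if let Some(existing) = self.inner.get(&value) {
            // Step 3: `value` is dropped when we return, so the String it
            // owns was allocated only to be thrown away.
            return Rc::clone(existing);
        }
        self.inner.insert(Rc::clone(&value));
        value
    }
}

// Steps 1 and 2: the parser sees a &str and clones it into a new String.
fn parse_keyword(input: &str) -> Keyword {
    Keyword(input.to_string())
}
```

Interning the same text twice through this path, for example `set.intern(Rc::new(parse_keyword(":foo/bar")))`, allocates a String both times even though the second one is discarded immediately.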
It would be good to come up with a strategy for going straight from the &str to the Rc<Keyword>. This might be a factory method provided to the parser. It might be an extended EDN representation (edn::Value<InternedString>?). It might be a kind of Borrow hook.
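One possible shape for the Borrow idea is sketched below, under stated assumptions. Keyword, Interned, InternSet, and intern_str are all hypothetical names invented for this example. The newtype Interned exists because Borrow<str> cannot be implemented directly for Rc<Keyword> (neither the trait nor Rc is local to our crate). Its Hash and Eq are written by hand so that they agree with those of str, which is what HashSet lookups through Borrow require.

```rust
use std::borrow::Borrow;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

// Hypothetical stand-in for the real Keyword.
#[derive(Debug, PartialEq, Eq)]
struct Keyword {
    name: String,
}

// Hypothetical newtype stored in the set, so we can implement Borrow<str>.
struct Interned(Rc<Keyword>);

impl Borrow<str> for Interned {
    fn borrow(&self) -> &str {
        self.0.name.as_str()
    }
}

impl Hash for Interned {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash exactly as the borrowed str does, so a &str lookup lands in
        // the same bucket as the stored entry.
        self.0.name.as_str().hash(state);
    }
}

impl PartialEq for Interned {
    fn eq(&self, other: &Self) -> bool {
        self.0.name == other.0.name
    }
}

impl Eq for Interned {}

#[derive(Default)]
struct InternSet {
    inner: HashSet<Interned>,
}

impl InternSet {
    // Goes straight from &str to Rc<Keyword>; allocates only on a miss.
    fn intern_str(&mut self, input: &str) -> Rc<Keyword> {
        if let Some(existing) = self.inner.get(input) {
            return Rc::clone(&existing.0);
        }
        let fresh = Rc::new(Keyword { name: input.to_string() });
        self.inner.insert(Interned(Rc::clone(&fresh)));
        fresh
    }
}
```

A method like intern_str could also serve as the factory handed to the parser: the parser would call it with each slice instead of building a Keyword itself. Whether that fits the existing parser structure is an open question this sketch does not answer.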
It's also possible that LLVM will optimize away that allocation… but I doubt it, particularly in the way we currently work (which means collecting all of these Keywords and getting rid of the duplicates later).
This ticket involves a fair amount of good Rust judgment, so it's not suitable for beginners.