clarkmcc / cel-rust

Common Expression Language interpreter written in Rust
https://crates.io/crates/cel-interpreter
MIT License
377 stars 21 forks

Porting Arc<String> to Arc<str> #23

Closed by clarkmcc 1 year ago

junderw commented 1 year ago

Also, in general, converting a string into an Arc<str> requires a new allocation proportional to the length of the string, whereas Arc<String> only requires allocating a fixed-size, three-pointer-wide struct inside the Arc.

So performance really depends on the average length of strings used in the wild... if your benchmarks use data representative of real world usage, then maybe Arc<String> is a better choice?
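
To put rough numbers on this, here is a minimal sketch of the size arithmetic. The helper names `arc_str_alloc_bytes` and `arc_string_alloc_bytes` are hypothetical and only illustrate the reasoning. They ignore padding and allocator rounding, and they do not count the String's existing heap buffer.

```rust
use std::mem::size_of;
use std::sync::atomic::AtomicUsize;
use std::sync::Arc;

/// Approximate bytes requested when building an Arc<str> holding `len` bytes:
/// two reference counters plus the string data itself (padding ignored).
fn arc_str_alloc_bytes(len: usize) -> usize {
    2 * size_of::<AtomicUsize>() + len
}

/// Approximate bytes requested for the Arc part of an Arc<String>:
/// two reference counters plus the String struct (pointer, capacity, length).
/// The String's own buffer already exists and is not counted here.
fn arc_string_alloc_bytes() -> usize {
    2 * size_of::<AtomicUsize>() + size_of::<String>()
}

fn main() {
    println!("String struct:      {} bytes", size_of::<String>());
    println!("Arc<String> handle: {} bytes", size_of::<Arc<String>>());
    println!("Arc<str> handle:    {} bytes", size_of::<Arc<str>>());
    for len in [0usize, 11, 64, 1024] {
        println!(
            "len {}: Arc<str> ~{} bytes, Arc<String> ~{} bytes",
            len,
            arc_str_alloc_bytes(len),
            arc_string_alloc_bytes()
        );
    }
}
```

The Arc<String> figure stays constant while the Arc<str> figure grows with the string, which is why the average string length matters.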

clarkmcc commented 1 year ago

@junderw it seems like the former would require cloning the string (re-allocating the length of the string), but doesn't the latter consume the string and require only a constant time allocation of the Arc itself?

Arc::from(String::from("Hello world").as_str());
Arc::from(String::from("Hello world"));

junderw commented 1 year ago

use std::sync::Arc;

// Arc<str>:
// 1. String::from performs one allocation of str.len() bytes.
// 2. Arc::from performs another allocation of 2 AtomicUsizes + str.len() bytes and copies the string data into it.
let a: Arc<str> = Arc::from(String::from("Hello world").as_str());

// Arc<String>:
// 1. String::from performs one allocation of str.len() bytes.
// 2. Arc::from performs another allocation of 2 AtomicUsizes + 3 usizes (the String struct); the string data is not copied.
let b: Arc<String> = Arc::from(String::from("Hello world"));
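
One way to check this is to compare the address of the string bytes before and after each conversion. The following is only a sketch, with hypothetical helper names: moving a String into an Arc<String> should keep the original buffer, while converting it into an Arc<str> should copy the bytes into the Arc's own allocation.

```rust
use std::sync::Arc;

/// Moves a String into an Arc<String> and reports whether the byte buffer
/// is still at the same address afterwards.
fn arc_string_reuses_buffer() -> bool {
    let s = String::from("Hello world");
    let before = s.as_ptr();
    let shared: Arc<String> = Arc::from(s);
    before == shared.as_bytes().as_ptr()
}

/// Converts a String into an Arc<str> and reports whether the byte buffer
/// is still at the same address afterwards.
fn arc_str_reuses_buffer() -> bool {
    let s = String::from("Hello world");
    let before = s.as_ptr();
    let shared: Arc<str> = Arc::from(s);
    before == shared.as_bytes().as_ptr()
}

fn main() {
    println!("Arc<String> reuses buffer: {}", arc_string_reuses_buffer());
    println!("Arc<str> reuses buffer:    {}", arc_str_reuses_buffer());
}
```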

Allocation time depends on the length of the string. Also, Arc<String> requires two pointer indirections to reach the actual string data (from Arc to ArcInner, then from the String's pointer to its heap buffer), whereas Arc<str> requires only one (from Arc to ArcInner, where the bytes are stored inline).
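
The difference in indirection can be observed by checking where the bytes live relative to the Arc's own allocation. This is a small sketch with hypothetical helper names; it uses `Arc::as_ptr`, which returns a pointer to the value stored inside the Arc.

```rust
use std::sync::Arc;

/// For Arc<str>, the value inside the Arc is the string data itself,
/// so the two addresses match (one hop from the handle to the bytes).
fn bytes_live_in_arc_str(a: &Arc<str>) -> bool {
    let value_addr = Arc::as_ptr(a) as *const u8;
    let bytes_addr = a.as_bytes().as_ptr();
    value_addr == bytes_addr
}

/// For Arc<String>, the value inside the Arc is the String struct,
/// which points to a separate heap buffer (two hops to the bytes).
fn bytes_live_in_arc_string(b: &Arc<String>) -> bool {
    let value_addr = Arc::as_ptr(b) as *const u8;
    let bytes_addr = b.as_bytes().as_ptr();
    value_addr == bytes_addr
}

fn main() {
    let a: Arc<str> = Arc::from("Hello world");
    let b: Arc<String> = Arc::new(String::from("Hello world"));
    println!("Arc<str> bytes inline:    {}", bytes_live_in_arc_str(&a));
    println!("Arc<String> bytes inline: {}", bytes_live_in_arc_string(&b));
}
```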

So on a theoretical level, you should think about what sizes the strings are, what bit-size CPU you will run on, and how often users of this struct will access the underlying string data through the Arc.

It's a balance of tradeoffs. There is no correct answer.

junderw commented 1 year ago

The only way to be 100% certain is to write some benchmarks that mimic real-world use.
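
As a starting point, a rough timing sketch could look like the following. The workload (build from an owned String, clone the handle, read the length) and the function names `run_arc_str` and `run_arc_string` are assumptions chosen for illustration, not cel-rust's actual usage. A proper benchmark harness with real expressions, run in release mode, would give far more trustworthy numbers.

```rust
use std::hint::black_box;
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Builds an Arc<str> from an owned String, then clones and reads it `reads` times.
/// Returns the elapsed time and the total number of bytes "read".
fn run_arc_str(input: &str, iters: usize, reads: usize) -> (Duration, usize) {
    let start = Instant::now();
    let mut total = 0;
    for _ in 0..iters {
        let shared: Arc<str> = Arc::from(String::from(black_box(input)));
        for _ in 0..reads {
            let handle = Arc::clone(&shared);
            total += black_box(&*handle).len();
        }
    }
    (start.elapsed(), total)
}

/// Same workload, but storing the String inside the Arc without copying its bytes.
fn run_arc_string(input: &str, iters: usize, reads: usize) -> (Duration, usize) {
    let start = Instant::now();
    let mut total = 0;
    for _ in 0..iters {
        let shared: Arc<String> = Arc::new(String::from(black_box(input)));
        for _ in 0..reads {
            let handle = Arc::clone(&shared);
            total += black_box(handle.as_str()).len();
        }
    }
    (start.elapsed(), total)
}

fn main() {
    // Hypothetical string lengths; replace with sizes seen in real programs.
    for len in [8usize, 64, 1024] {
        let input = "x".repeat(len);
        let (t_str, _) = run_arc_str(&input, 100_000, 4);
        let (t_string, _) = run_arc_string(&input, 100_000, 4);
        println!("len {}: Arc<str> {:?}, Arc<String> {:?}", len, t_str, t_string);
    }
}
```

Varying the string length and the number of reads per construction shows which side of the tradeoff a given workload falls on.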