Closed fitzgen closed 7 years ago
Sort of.
We want impl From<String> for InlinableString
.
Here's another example:
impl postgres::types::FromSql for InlString {
fn from_sql<R: Read> (ty: &postgres::types::Type, raw: &mut R, _: &postgres::types::SessionInfo) -> postgres::Result<InlString> {
use postgres::error::Error; use postgres::types::Type;
if *ty != Type::Varchar && *ty != Type::Text {
return Err (Error::from (io::Error::new (std::io::ErrorKind::Other,
format! ("InlString] Unknown PostgreSQL type: {:?}", ty))))}
let mut stack = [0u8; 32];
let got = try! (raw.read (&mut stack));
if got <= inlinable_string::inline_string::INLINE_STRING_CAPACITY {
let s = match from_utf8 (&stack[0..got]) {Ok (s) => s, Err (err) => return Err (Error::Conversion (box err))};
Ok (InlString (InlinableString::Inline (InlineString::from (s))))
} else {
let mut buf = Vec::with_capacity (256);
buf.extend_from_slice (&stack[0..got]);
try! (raw.read_to_end (&mut buf));
let s = match from_utf8 (&buf[..]) {Ok (s) => s, Err (err) => return Err (Error::Conversion (box err))};
Ok (InlString (InlinableString::Heap (String::from (s))))}}
It might be made into a generic From<Read>
.
We want impl From
for InlinableString.
That's impossible to test outside the inlinable_string
crate, AFAIK.
(You need the newtype to pin the external traits on external types).
This is the inlinable_string
crate... I think we are talking past each other...
Code examples I gave aren't from inlinable_string
.
I should check issues before doing PRs. Just run into this, so #6.
I figured there is no point checking the lenght and doing an inline if you already have owned heap.
I figured there is no point checking the lenght and doing an inline if you already have owned heap.
@maciejhirsz
On the contrary!
Using the heap means an extra random memory access!
Consider for (key, value) in btree_map {if key.starts_with ("f") {count += 1}}
. If the key is InlinableString
without a heap reference then we're doing sequential memory scan. Both the CPUs and the operating systems (with their virtual memory subsystems) are supposed to be able to optimize for this (with various level of prefetching).
Consider for s in vec {if s.starts_with ("f") {count += 1}}
. No heap means approximately one memory fetch per two strings (assuming the usual 64-byte cache lines). Heap means approximately three memory fetches per two strings.
If you're using a heap reference than you're fetching random pieces from all over the heap. This is especially bad if some of your memory is swapped out.
Another example, a large long-living map. If the strings are stored inline then the memory pressure from that map is minimal (e.g. your program takes less memory). If the strings are on the heap then you're paying the full heap overhead (padding and metadata of the allocator).
Hah. That does actually make a lot of sense. I've updated the PR.
Fixed in #6
@maciejhirsz Benchmark in https://github.com/rust-lang/rust/pull/40601#issue-215029505 might be interesting regarding how the heap fetches (e.g. sort_large_strings
) fare against inline values (e.g. sort_large_big_random
), methinks.
Might be a useful template if someone wants to experiment with benchmarking: https://gist.github.com/ArtemGr/cecca70d27178a1f1210df7bd53cf167
@ArtemGr are you interested in co-maintaining this crate?
@fitzgen Yes, I wouldn't mind joining in.
@ArtemGr Great, thanks!
How does this sound for process: we lock down the master branch and require an approval review (presumably from one of us) and passing tests for anything to land on it?
@fitzgen NP! I never worked with protected branches before, so... sure, looks like an interesting thing to try, at the very least.
@ArtemGr cool! Invite sent.
For example?