Closed overlookmotel closed 4 months ago
This change should also produce similar speed-up on String::from_str_in because it uses push_str internally.
It may also be worth investigating whether from_str_in can get a further speed-up if it doesn't use push_str at all, because the length + offset calculations and call to Vec::reserve in push_str are all extraneous when the String has been freshly created with sufficient capacity. If the compiler inlines push_str, it would hopefully figure that out itself, but I'm not sure if it will. If it doesn't:
    pub fn from_str_in(s: &str, bump: &'bump Bump) -> String<'bump> {
        let len = s.len();
        let mut t = String::with_capacity_in(len, bump);
        unsafe {
            ptr::copy_nonoverlapping(s.as_ptr(), t.vec.as_mut_ptr(), len);
            t.vec.set_len(len);
        }
        t
    }
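For anyone who wants to try the idea outside bumpalo, here is a minimal sketch of the same pattern using only std. The function name string_from_str_prealloc is hypothetical, and this is not bumpalo's code: it assumes a plain std Vec<u8> standing in for the bump-allocated one, and only illustrates skipping the reserve and offset work when the buffer is freshly allocated with exact capacity.

```rust
use std::ptr;

/// Hypothetical helper: builds a `String` from `s` by allocating exactly
/// `s.len()` bytes up front and filling them with one bulk copy.
/// No `reserve` call or offset arithmetic is needed, because the buffer
/// is brand new and already large enough.
fn string_from_str_prealloc(s: &str) -> String {
    let len = s.len();
    let mut bytes: Vec<u8> = Vec::with_capacity(len);
    // SAFETY:
    // * `bytes` was just allocated with capacity >= `len`, so writing
    //   `len` bytes starting at `as_mut_ptr()` stays in bounds.
    // * `s` and `bytes` are separate allocations, so they cannot overlap.
    // * After the copy the first `len` bytes are initialized, so
    //   `set_len(len)` is valid.
    // * The bytes are copied verbatim from a `&str`, so they are valid UTF-8.
    unsafe {
        ptr::copy_nonoverlapping(s.as_ptr(), bytes.as_mut_ptr(), len);
        bytes.set_len(len);
        String::from_utf8_unchecked(bytes)
    }
}
```

Calling string_from_str_prealloc("hello") should give back an owned copy of the input. Whether this beats going through push_str depends on what the compiler inlines, so it would need benchmarking.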
Would hugely appreciate it if someone had time to review this PR. It appears to be a significant speed boost with no downside, as far as I can see.
OXC compiler uses bumpalo (to great effect, bumpalo is brilliant) and I imagine this small optimization to bumpalo would have a significant impact on OXC's benchmarks.
It looks like there are some test failures in CI: https://github.com/fitzgen/bumpalo/actions/runs/7848854221/job/21528762407?pr=229#step:6:110
My apologies! That's really weird, I was pretty sure I'd covered all the bases. I'll look into it.
And thanks very much for coming back. Sorry I hassled you.
Is it possible to re-run the "Rust / build (stable, --features collections,boxed)" CI job? Am wondering if it could be related to something in nightly, or genuinely 100% my stupidity.
Retriggered.
Also possible that this test is failing on main right now, but I don't have time to investigate at the moment.
I've reproduced it locally. Yes, it is failing on main too.
Have opened #230 with the details.
The failure does appear to be completely unrelated to this PR, so would you consider merging this in the meantime?
I've also added a bit to the test to include pushing an empty string.
CI should pass if rebased on top of #232.
This should be ready for a rebase.
Thanks. Now rebased on main.
@fitzgen I know I've failed on the follow-on for #232, but could I possibly ask you to do a release on crates.io containing the changes in this and #231?
It's a fairly significant speed-boost for commonly used functions, and would be a real help to be able to integrate these changes into OXC.
Yeah I can do a release shortly. Thanks for the PRs!
Thanks @fitzgen. That'd be very much appreciated. I hope I can contribute more to bumpalo in future, just right now is not ideal timing for me.
FYI 3.15.0 is now published on crates.io
Amazing! Thank you.
I noticed when using collections::String::push_str with large slices that the performance was very poor compared to std::string::String.
I believe the reason is that std's Vec has a specialised implementation of extend_from_slice for Vec<u8>, but bumpalo's Vec does not (and probably can't without using nightly features).
Consequently, push_str(str) where str is 16 KB is currently equivalent to roughly 16,000 individual calls to push(), one for each byte.
This PR fixes that by using ptr::copy_nonoverlapping to copy the entire slice in one go.
For a 16 KiB string, it's an 80x speed-up.
Also added a test for push_str with a long string, and a benchmark.
I've written the code in quite a verbose style with lengthy "SAFETY" comments. Maybe this is overkill, but you don't know me, so I wanted to make it clear from the code that the change is valid.
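To make the difference described above concrete, here is a minimal sketch of the two strategies on a plain std Vec<u8>. The function names extend_bytewise and extend_bulk are hypothetical, and this is not the code in the PR: it only assumes the same approach of reserving once and then copying the whole slice with ptr::copy_nonoverlapping.

```rust
use std::ptr;

/// Appends `src` one byte at a time, the way a generic element-by-element
/// extend does when no slice specialization is available.
fn extend_bytewise(dst: &mut Vec<u8>, src: &[u8]) {
    for &b in src {
        dst.push(b);
    }
}

/// Appends `src` with a single reservation and a single bulk copy.
fn extend_bulk(dst: &mut Vec<u8>, src: &[u8]) {
    let old_len = dst.len();
    let src_len = src.len();
    // Make sure there is room for all of `src` after the existing contents.
    dst.reserve(src_len);
    // SAFETY:
    // * After `reserve`, capacity >= old_len + src_len, so the destination
    //   range `old_len..old_len + src_len` is inside the allocation.
    // * `dst` is borrowed mutably and `src` immutably, so the two regions
    //   cannot overlap.
    // * The copy initializes every byte in the new range, so extending the
    //   length with `set_len` is valid.
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr().add(old_len), src_len);
        dst.set_len(old_len + src_len);
    }
}
```

Both functions leave dst with identical contents; only the amount of per-byte work differs. With std's Vec the optimizer may well narrow the gap on its own, so any speed-up should be measured rather than assumed.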