Closed h-a-n-a closed 2 months ago
Thank you for the fantastic contribution! I'd also love to get your benchmark into the repo -- would you be willing to update this PR with it, otherwise is it okay if I pull it in? Using criterion as you've done is great.
There is some history here, which is that we previously did pointer casting from `OsStr` to `str`. But it was privately flagged to me that there's no `repr(transparent)` relationship between the two so that wasn't sound. `as_encoded_bytes` solves this, though.
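For readers following along, here is a minimal sketch of the kind of conversion being discussed. It goes through the encoded bytes instead of casting pointers. The helper name `os_str_as_str_unchecked` is hypothetical and is not camino's actual API; the sketch assumes Rust 1.74.0 or newer and that the caller already knows the `OsStr` holds valid UTF-8.

```rust
use std::ffi::OsStr;

/// Hypothetical helper for illustration only (not camino's actual code).
///
/// # Safety
/// The caller must guarantee that `s` contains valid UTF-8.
unsafe fn os_str_as_str_unchecked(s: &OsStr) -> &str {
    // `as_encoded_bytes` (stable since Rust 1.74.0) exposes the underlying
    // bytes without relying on any layout relationship between `OsStr` and `str`.
    // SAFETY: the caller guarantees the bytes are valid UTF-8.
    unsafe { std::str::from_utf8_unchecked(s.as_encoded_bytes()) }
}

fn main() {
    let original = "some/utf8/path.txt";
    let os: &OsStr = OsStr::new(original);

    // SAFETY: `os` was created from a `&str`, so its bytes are valid UTF-8.
    let s = unsafe { os_str_as_str_unchecked(os) };

    // The unchecked conversion agrees with the checked one.
    assert_eq!(s, original);
    assert_eq!(os.to_str(), Some(original));
}
```

Unlike `OsStr::to_str`, this skips UTF-8 validation entirely, so it is only appropriate when the UTF-8 invariant is already upheld elsewhere.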
> would you be willing to update this PR with it
I would love to! Would you also like to integrate codspeed into the repo? If the answer is yes, I can add a `codspeed` feature to enable running the bench case with codspeed on CI (this requires an additional `CODSPEED_TOKEN` to get it running). Otherwise, I can just add the bench case.
Pre-requisites for codspeed running in GitHub Actions: https://docs.codspeed.io/ci/github-actions
Thanks again! Will get a release out shortly.
This is now out in camino 1.1.8.
Thanks for the help!
Background:
While migrating from `PathBuf` to `Utf8PathBuf`, etc., I found some regressions in our benchmarks. It turned out that `as_str` is not actually cost-free: before rustc 1.74.0 there was no way to get the underlying bytes out of an `OsStr`.

In this PR, with the help of `OsStr::as_encoded_bytes`, which was stabilized in 1.74.0, we can perform a cost-free conversion from `&OsStr` to `&str`, under the constraint that its underlying bytes are UTF-8 encoded.

Benchmark:
I did an ad-hoc benchmark with the following code, and it turned out that the time cost is now constant.
code
```rust
use std::ffi::OsStr;

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};

fn bench_path(c: &mut Criterion) {
    let mut group = c.benchmark_group("match UTF-8 validation check");
    for i in [10, 100, 1000, 10000] {
        // Input string of length `i`, to show how cost scales with size.
        let p = "i".repeat(i);
        // Checked conversion: `to_str` validates UTF-8 on every call.
        group.bench_with_input(BenchmarkId::new("osstr to_str", i), &p, |b, s| {
            b.iter(|| {
                let a = OsStr::new(black_box(s));
                let _ = unsafe { black_box(a).to_str().unwrap_unchecked() };
            })
        });
        // Unchecked conversion via the encoded bytes: no validation.
        group.bench_with_input(BenchmarkId::new("osstr as_encoded_bytes", i), &p, |b, s| {
            b.iter(|| {
                let a = OsStr::new(black_box(s));
                let _ = unsafe { std::str::from_utf8_unchecked(black_box(a).as_encoded_bytes()) };
            })
        });
    }
    group.finish();
}

criterion_group!(benches, bench_path);
criterion_main!(benches);
```

Result: