bheisler / criterion.rs

Statistics-driven benchmarking library for Rust
Apache License 2.0
4.55k stars 301 forks source link

Remove Unsafe Blocks #195

Open bheisler opened 6 years ago

bheisler commented 6 years ago

There are a lot of old unsafe blocks in Criterion.rs' internals. Many of them were written to work around compiler bugs or missing features that have since been fixed (see https://github.com/japaric/criterion.rs/issues/188#issuecomment-416001204). These should be cleaned up.

There are a few others that might be harder to remove without impacting analysis performance in the bootstrapping code. I've tried several times to remove these without much success so far.

gnzlbg commented 6 years ago

I managed to remove many unsafe blocks in #208 . Most of the remaining ones are due to the usage of thread_scoped, here we should migrate criterion either to crossbeam::scope, or maybe even just rayon. cc @cuviper thoughts?

The remaining unsafe blocks are only in the stats crate and hint about the crate's problematic design. For example, some transmutes are just used to workaround some types not having new methods (to subvert privacy boundaries), others are also used to subvert the type system in weird ways: for example, Sample<A: Float> but Distribution<A> does not require A: Float, yet distributions with As that are not floats are transmuted into Samples.

cuviper commented 6 years ago

or maybe even just rayon. cc @cuviper thoughts?

How could I say no? 😉 But yes, indeed, those look like prime candidates for parallel iterators.

If you wanted to get fancy, those Builders could probably implement FromParallelIterator to directly fill their target vectors, but it's probably good enough to leave it with sub_distributions and the extend merging at first.

bheisler commented 6 years ago

I've tried using Rayon there, in fact. It was a while ago, though, so it might work better now. The problem I ran into is that, well, some of that code is super complex (I'm looking at mixed.rs here). There is a thread-local structure (Resamples) which holds a borrow of the shared vector c. I couldn't figure out a way to do that in Rayon without creating a new Resamples instance for every iteration (potentially expensive because it contains a random number generator as well, which needs to be initialized from the thread_rng()) or crossing the borrow checker or both. I believe I also tried to implement FromParallelIterator for the Builders and gave up eventually, though I don't remember why. The other obvious approach - using split_at_mut and giving each chunk to a different scoped thread - also doesn't work, because of the way it tries to be generic over the number of output vectors.

Anyway, perhaps you folks will have more success than I did. I always meant to rip that code out and replace it with something more maintainable, but I didn't want to make it much slower. I think I might have to compromise on the performance (accept the cost of creating a new RNG for each iteration) or the generality (stop trying to be generic over the number of outputs and unzip them explicitly elsewhere).

gnzlbg commented 6 years ago

FWIW I gave it a quick shot too, and decided that it was tricky enough to deserve its own PR. I started with crossbeam, but while making the changes got the feeling that rayon might actually be enough.

cuviper commented 6 years ago

There is a thread-local structure (Resamples) which holds a borrow of the shared vector c. I couldn't figure out a way to do that in Rayon without creating a new Resamples instance for every iteration (potentially expensive because it contains a random number generator as well, which needs to be initialized from the thread_rng()) or crossing the borrow checker or both.

I'm hoping that rayon-rs/rayon#602 will help this kind of use case.

bheisler commented 6 years ago

That does look like it would help, yes.

lemmih commented 3 years ago

Status update. Criterion contains 10 unsafe blocks and 1 unsafe function:

  1. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/percentiles.rs#L20
  2. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/percentiles.rs#L52
  3. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/percentiles.rs#L57
  4. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/percentiles.rs#L67
  5. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/percentiles.rs#L72
  6. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/sample.rs#L31
  7. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/univariate/sample.rs#L133
  8. https://github.com/bheisler/criterion.rs/blob/cee63459dc58f86a9c319bd4fa7e762f7c333bf6/src/stats/mod.rs#L84
  9. https://github.com/bheisler/criterion.rs/blob/976d51730ce479d3d032d2dbb176022e45b22b57/src/stats/univariate/resamples.rs#L55
  10. https://github.com/bheisler/criterion.rs/blob/e2284444ead935df850cb69a7b5ec7f6142ff82b/src/stats/bivariate/mod.rs#L126
  11. https://github.com/bheisler/criterion.rs/blob/976d51730ce479d3d032d2dbb176022e45b22b57/src/lib.rs#L173

    • One of them is due to black_box and is pretty unavoidable.
    • Two of them are for transmuting between &[A] and &Sample<A>. This is UB since Sample<A> is not guaranteed to share the representation of [A]. We need to add #[repr(transparent)].
    • The rest are for unchecked slice indexing. Surely we can remove these without hurting performance too much. Some kind of benchmarking library would be helpful here.
bheisler commented 3 years ago

The black_box unsafe block may be fixable with a future version of rustc; the various teams have been (slowly) working towards standardizing a black box in the std::hint namespace. When that lands (which could be a while, mind) we can finally get rid of our fake black_box function entirely, or just make it delegate directly to the one in std.