rust-lang / datafrog

A lightweight Datalog engine in Rust
Apache License 2.0
796 stars 43 forks source link

Micro-optimize `binary_search` #30

Closed ecstatic-morse closed 3 years ago

ecstatic-morse commented 3 years ago

We spend about 30% of cycles doing binary search in ExtendWith::count while running clap-rs/app-parser-{{impl}}-add_defaults. This indicates to me that we need to explore different storage for the cfg_edge relation, but a maximally optimal binary_search is beneficial regardless.

This PR switches to get_unchecked to eliminate a bounds check that the optimizer cannot. This is also done in the standard library implementation.

It also takes advantage of a lesser known invariant of Vec, that the capacity cannot exceed isize::MAX bytes, to compute the midpoint using less instructions (there's a fun article from the early internet about this).

As a result, the aforementioned test case on a Ryzen 4700U laptop on a goes from this:

 Performance counter stats for 'target/release/polonius -a DatafrogOpt inputs/clap-rs/app-parser-{{impl}}-add_defaults':

   691,011,608,622      instructions:u                                              

      99.967872602 seconds time elapsed

      96.724862000 seconds user
       3.172634000 seconds sys

to this:

 Performance counter stats for 'target/release/polonius -a DatafrogOpt inputs/clap-rs/app-parser-{{impl}}-add_defaults':

   623,280,710,902      instructions:u                                              

      92.983953355 seconds time elapsed

      89.483132000 seconds user
       3.404928000 seconds sys

Wall time is highly variable and may differ on your platform.

cc @shepmaster (since they seem to be interested in this sort of thing)

ecstatic-morse commented 3 years ago

Also, we still use the standard library's binary search implementation in some places, which doesn't have the second (admittedly less important) optimization. If rust-lang/rust#74024 lands, we'll definitely want to switch to a custom impl in every case, since it will double the number of comparisons in the inner loop for little benefit.

camelid commented 3 years ago

You might want to r? somebody to get their attention.

lqd commented 3 years ago

We have talked about this PR on zulip, hopefully niko has some time for this (if not we will look at this at the next sprint).

I don't think bors is active on this repo, so let's ask manually via github 🙏

nikomatsakis commented 3 years ago

Looks good to me!