Closed ecstatic-morse closed 3 years ago
See https://github.com/ecstatic-morse/kmerge for the microbenchmarks.
This looks good to me.
But @ecstatic-morse I think you need to run cargo fmt
. any chance you want to do that? :)
Sorry for the ridiculous delay! This PR fell under the radar.
I'll do that
edit: done. CI seems to be currently failing on nightly because of rustfmt missing, but it passed on beta.
The old implementation was asymptotically optimal and completely safe (yay!) but also more complex than necessary and not as fast as it could be. This switches to a single loop over a pair of
std::vec::IntoIter
s instead of the current combination ofdrain
andpeekable
. It also uses some unsafe "unchecked" operations where bounds checks should be elided but are not (on a recent rust nightly). Specifically, pushing an element to the output vector does not require a capacity check (seepush_unchecked
in the diff) because we allocate the maximum possible capacity up front. This would be quite difficult for the optimizer to prove. I was hoping to avoid usingget_unchecked
andnext_unchecked
, which the optimizer should be able to handle for us, but couldn't find a way to do so with stable APIs.As a result of these changes, the number of cycles spent in
merge
on theclap-rs/app-parser-{{impl}}-add_defaults
test on a Ryzen 4700U laptop went from 6% to 4% and the total number of cycles decreased by about 5% according to perf. Cycle counts are unstable so these numbers should be taken with a grain of salt, but they mirror the results of microbenchmarks.