p2pderivatives / rust-bitcoin-coin-selection

10 stars 5 forks source link

Feature/iter branch and bound #28

Closed yancyribbens closed 9 months ago

yancyribbens commented 10 months ago

Implement Branch and Bound coin selection.

yancyribbens commented 10 months ago

While there is a lot of potential refactors, I think this is ready to move out of draft state.

yancyribbens commented 9 months ago

I found that using primitive types instead of types like Amount in the hot path leads to a 2x perf improvement which is crazy. The benchmark on this branch https://github.com/p2pderivatives/rust-bitcoin-coin-selection/tree/experimental/use-primitive-types runs at ~311,000 ns/iter as opposed to 830,000 ns/iter. While this version does not yet add a check for high vs low fee like the C++ implementation, even after adding that in, I think this version is much faster than any implementation.

Kixunil commented 9 months ago

That could be because of overflow protection in Amount or messed up benchmark. For instance I don't see black_box being used in the benchmark so it could be why the results are weird.

yancyribbens commented 9 months ago

I did remove a checked_sum() in the hotpath (and one that wasn't) so maybe that's in part why. Also, is there really no checked_sum() on Iter? Could just manually check each one while summing to see how that stacks up. Amount just wraps a u64 so I really didn't expect accessing the Amount type to be that much worse. Will look into black_box. It would be interesting to take a look at the ASM produced also to really understand the performance more - I'm not sure the benchmark is necessarily "weird"..

edit: come to think of it, using checked_sum() in the hotpath shouldn't be needed anyway, since each iteration, the sum is guaranteed to be less than or equal to the first calculation of available_value, so there should not be a reason to check for overflow.

yancyribbens commented 9 months ago

For instance I don't see black_box being used in the benchmark so it could be why the results are weird.

This article was pretty helpful in understanding blackbox: https://gendignoux.com/blog/2022/01/31/rust-benchmarks.html. I think the benchmarks are correct however currently without blackbox because here: https://github.com/p2pderivatives/rust-bitcoin-coin-selection/pull/28/files#diff-9098d62be93e83524a8371395c973d761a95000d1c295f600a8c808e917c16d9R577 I use the return value which forces the compiler to not optimize away the call to bnb(). If the compiler was optimizing away the call to bnb() it would not be measurements of 300,000+ ns per iter and vary based on changes to bnb. The reason for blackbox is to make sure that in the benchmark test itself, the compiler does not optimize away the call to the code you want to benchmark by realizing that since the results aren't being used then don't make the call at all.

yancyribbens commented 9 months ago

regarding bench-marking, it looks like criteron would be a better choice.

yancyribbens commented 9 months ago

Replaced the checked_sum() with sum() for the available_value calculation in the hotpath since it's not possible to overflow. This brings the time down to 772 ns/iter.

test branch_and_bound::benches::bench_select_coins_bnb ... bench: 772,163 ns/iter (+/- 393)

yancyribbens commented 9 months ago

Re-opening this as a PR using my remote so people don't get so many notifications on CI failure.

yancyribbens commented 9 months ago

New PR open here: https://github.com/p2pderivatives/rust-bitcoin-coin-selection/pull/30