kaspanet / rusty-kaspa

Kaspa full-node reference implementation and related libraries in the Rust programming language
ISC License

Optimize window cache building for ibd #576

Closed D-Stacks closed 6 days ago

D-Stacks commented 1 month ago

Background

Currently we cache block windows during header processing, but rely on that cache in both block body processing and virtual processing. A miss on the window cache incurs a large overhead, because the windows must be reconstructed. This is especially costly during IBD, where header processing and body / virtual processing are not streamlined against each other.

This PR addresses and patches this issue (thus far only for tn11) by:

1) Caching windows in the virtual processor on new sinks, if the sink's window is not already cached.

2) Applying an optimization whereby the window_manager avoids reconstructing a whole window from scratch on a cache miss, and instead aborts early by merging with an ancestor's cached window (if one is found).

3) Applying a lazy load for the past median time cache, which only triggers after a 'quick-and-dirty' pre-check finds a tx where tx.locktime != 0, before that tx is validated in full in check_block_transactions_in_context.
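The early abort in point 2 can be illustrated with a minimal sketch. This is not the window_manager code: it assumes a toy chain in which every block has exactly one parent, treats a window as simply the `size` most recent ancestors, and assumes cached windows were built with at least the same `size`. `BlockId`, `Chain` and `build_window` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical block id; a real node would use block hashes.
type BlockId = u64;

/// Toy store: every block except genesis maps to a single parent.
struct Chain {
    parent: HashMap<BlockId, BlockId>,
}

/// Collects up to `size` ancestors of `block`, newest first.
/// If `cache` is given and holds a window for a visited block, the rest of
/// the window is copied from that entry instead of being walked.
/// Returns the window and the number of parent steps that were walked.
fn build_window(
    chain: &Chain,
    cache: Option<&HashMap<BlockId, Vec<BlockId>>>,
    block: BlockId,
    size: usize,
) -> (Vec<BlockId>, usize) {
    let mut window = Vec::with_capacity(size);
    let mut steps = 0;
    let mut current = block;
    while window.len() < size {
        // Early abort: the cached window of `current` already lists its
        // ancestors, so take what is still missing and stop walking.
        if let Some(cached) = cache.and_then(|c| c.get(&current)) {
            let missing = size - window.len();
            window.extend(cached.iter().take(missing).copied());
            break;
        }
        match chain.parent.get(&current) {
            Some(&parent) => {
                window.push(parent);
                current = parent;
                steps += 1;
            }
            None => break, // reached genesis
        }
    }
    (window, steps)
}
```

In this toy model, the number of steps walked on a cache miss drops from the full window size to the distance to the nearest cached ancestor, which is the same kind of saving the iteration counters in the benchmark are meant to capture.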

Tests and Benching

I have created a branch, here, which checks the new and old build methods against each other and logs the time spent building the block windows.

All window builds pass the asserts between old and new.
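For readers without access to that branch, the compare-and-time approach might look roughly like the sketch below. It is an assumption about the general shape, not the code in the branch: `BuildStats`, `build_checked` and `summary` are hypothetical names, and the two builders are passed in as plain closures.

```rust
use std::fmt::Debug;
use std::time::{Duration, Instant};

/// Running totals for the two build paths (hypothetical helper).
#[derive(Default)]
struct BuildStats {
    old_total: Duration,
    new_total: Duration,
    builds: u64,
}

impl BuildStats {
    /// Runs both builders on the same input, adds each one's elapsed time
    /// to its running total, and asserts that the results are identical.
    fn build_checked<T, W>(
        &mut self,
        input: &T,
        old_build: impl Fn(&T) -> W,
        new_build: impl Fn(&T) -> W,
    ) -> W
    where
        T: ?Sized,
        W: PartialEq + Debug,
    {
        let start = Instant::now();
        let old_window = old_build(input);
        self.old_total += start.elapsed();

        let start = Instant::now();
        let new_window = new_build(input);
        self.new_total += start.elapsed();

        assert_eq!(old_window, new_window, "old and new window builds disagree");
        self.builds += 1;
        new_window
    }

    /// One-line summary suitable for periodic logging.
    fn summary(&self) -> String {
        format!(
            "old: {:?}, new: {:?} over {} builds",
            self.old_total, self.new_total, self.builds
        )
    }
}
```

Because both paths run on every call, the totals measure cumulative time per path rather than the wall-clock effect on syncing.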

Regarding benching, here is an excerpt taken during IBD:

2024-09-10 00:45:11.731+02:00 [INFO ] IBD: Processed 396000 blocks (26%) last block timestamp: 2024-09-08 16:04:14.000:+0200
2024-09-10 00:45:11.895+02:00 [INFO ] Processed 6280 blocks and 0 headers in the last 10.00s (296652 transactions; 2594 UTXO-validated blocks; 0.00 parents; 0.00 mergeset; 47.24 TPB; 94187.4 mass)
2024-09-10 00:45:21.318+02:00 [INFO ] New Total Time spent in build_block_window: 36.989105s
            New iterations difficulty: 426446
            New iterations pmt: 18802163
            Old Total Time spent in build_block_window: 751.755591s
            Old iterations difficulty: 150506854
            Old iterations pmt: 455390110

Note:

1) Most of this performance increase comes during block body IBD syncing, where the difficulty window is built, or more specifically on virtual resolve during IBD. It also reduces the time spent in the past_median_time window by about 75% (from 16 secs to about 4 secs on my machine).

2) These figures don't factor in parallelization or the real-time gain during syncing.
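To put the excerpt in proportion, the ratios between the old and new figures can be worked out directly. The sketch below only divides the numbers copied from the log above, so it reflects the cumulative totals of one run at one point in IBD; `reduction_factor` and `reduction_factors` are hypothetical helper names.

```rust
/// Cumulative figures copied from the log excerpt above.
const OLD_SECS: f64 = 751.755591;
const NEW_SECS: f64 = 36.989105;
const OLD_DIFFICULTY_ITERS: f64 = 150_506_854.0;
const NEW_DIFFICULTY_ITERS: f64 = 426_446.0;
const OLD_PMT_ITERS: f64 = 455_390_110.0;
const NEW_PMT_ITERS: f64 = 18_802_163.0;

/// How many times smaller the new figure is than the old one.
fn reduction_factor(old: f64, new: f64) -> f64 {
    old / new
}

/// Reduction factors for (total time, difficulty iterations, pmt iterations).
fn reduction_factors() -> (f64, f64, f64) {
    (
        reduction_factor(OLD_SECS, NEW_SECS),
        reduction_factor(OLD_DIFFICULTY_ITERS, NEW_DIFFICULTY_ITERS),
        reduction_factor(OLD_PMT_ITERS, NEW_PMT_ITERS),
    )
}
```

For this excerpt the factors come out at roughly 20x for total time in build_block_window, roughly 350x for difficulty-window iterations, and roughly 24x for pmt iterations.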

coderofstuff commented 1 month ago

You have the test and benches in another branch. Any way we can get those into here in a form that makes sense to run regularly?

D-Stacks commented 1 month ago

> You have the test and benches in another branch. Any way we can get those into here in a form that makes sense to run regularly?

I don't have unit tests per se, just ones that compare and bench during runtime. The runtime cost is not trivial. Maybe it can be done behind the sanity tests flag? But even then it is a bit expensive. It is also worth mentioning that it would leave a decent code footprint, as the old code would remain running in tandem to compare and bench.
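Gating the comparison behind a flag could look something like the sketch below. It is only an illustration of the trade-off: `WindowBuilder`, `sanity_checks`, `fast_build` and `reference_build` are hypothetical names, the builders are toy stand-ins, and nothing here reflects how the node's actual sanity flag is wired.

```rust
use std::cell::Cell;

/// Toy stand-in for the new, optimized build path.
fn fast_build(blocks: &[u64], size: usize) -> Vec<u64> {
    blocks[blocks.len().saturating_sub(size)..].to_vec()
}

/// Toy stand-in for the old build path, kept only as a reference.
fn reference_build(blocks: &[u64], size: usize) -> Vec<u64> {
    let mut window: Vec<u64> = blocks.iter().rev().take(size).copied().collect();
    window.reverse();
    window
}

/// Hypothetical builder that only pays for the old path when asked to.
struct WindowBuilder {
    sanity_checks: bool,
    /// Counts how often the reference path actually ran.
    reference_runs: Cell<u32>,
}

impl WindowBuilder {
    fn build(&self, blocks: &[u64], size: usize) -> Vec<u64> {
        let window = fast_build(blocks, size);
        if self.sanity_checks {
            // The expensive comparison runs only when the flag is set.
            self.reference_runs.set(self.reference_runs.get() + 1);
            assert_eq!(window, reference_build(blocks, size), "window mismatch");
        }
        window
    }
}
```

With the flag off, the old path is never executed, but its code still has to be kept in the tree, which is the footprint concern raised above.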