bnb-chain / opbnb


feat(op-node): pre-fetch receipts concurrently round 2 #104

Closed · welkin22 closed this 8 months ago

welkin22 commented 8 months ago

Description

I implemented the first version of the concurrent code to pre-fetch receipts in https://github.com/bnb-chain/opbnb/pull/100, but it has some problems that need to be fixed. Our cache is an LRU cache with a maximum capacity of 1000 items. Once the pre-fetch goroutine was made concurrent, the cache reaches that maximum capacity very quickly, which leads to the following problems:

  1. Since the cache is LRU, a Get on an old block that was about to be evicted promotes its entry to the front of the queue, so a large amount of useless data accumulates in the cache while useful data is pushed out (a sketch of this follows the list).
  2. Under concurrency the cache fills up very easily, so old block data is evicted quickly and the cache becomes ineffective.
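To make the first problem concrete, here is a minimal sketch of the promotion behavior, assuming a typical LRU implementation (hashicorp/golang-lru) and a toy capacity of 3 in place of the real 1000:

```go
package main

import (
	"fmt"

	lru "github.com/hashicorp/golang-lru"
)

func main() {
	cache, _ := lru.New(3)
	cache.Add(1, "receipts@1")
	cache.Add(2, "receipts@2")
	cache.Add(3, "receipts@3")

	// Reading the oldest entry promotes it to most-recently-used...
	cache.Get(1)

	// ...so the next insert evicts block 2 instead of block 1.
	cache.Add(4, "receipts@4")

	_, ok1 := cache.Get(1) // stale block survives
	_, ok2 := cache.Get(2) // newer block was evicted
	fmt.Println(ok1, ok2)  // true false
}
```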

Rationale

I made some changes. In the pre-fetch goroutine, I removed the rateLimit and used a WaitGroup instead. I also added a PreFetchReceipts method that indicates whether the cache has been filled; if it is full, the pre-fetch goroutine waits and retries.

At the same time, I abandoned the LRU cache data structure, because it does not meet the requirements of pre-fetching: we need to evict receipts from memory once the corresponding L1 block height has been consumed. A priority queue sorted by block height, from smallest to largest, is therefore a better choice. I created a new data structure called preFetchCache to cache the receipts; it combines a map with a priority queue and helps the pre-fetcher delete receipts for old block heights at the appropriate time.

So when is the appropriate time to delete receipt data? I inserted a call to ClearReceiptsCacheBefore into postProcessSafeL2, because once a safe block height has been fully processed, the receipts for its L1 origin block height are no longer needed. To be cautious, we only delete data strictly below that block height, not the height itself.
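For illustration, here is a minimal sketch of such a map-plus-min-heap cache, written against Go's container/heap. The names (PreFetchCache, Add, RemoveAllBefore) and the full-cache signalling via Add's return value are assumptions made for this sketch, not necessarily the PR's actual shape:

```go
package prefetch

import (
	"container/heap"
	"sync"
)

// heights is a min-heap of block heights (smallest on top).
type heights []uint64

func (h heights) Len() int           { return len(h) }
func (h heights) Less(i, j int) bool { return h[i] < h[j] }
func (h heights) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *heights) Push(x any)        { *h = append(*h, x.(uint64)) }
func (h *heights) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// PreFetchCache pairs a map (O(1) lookup by height) with a min-heap of
// heights, so the lowest heights can be evicted first.
type PreFetchCache[V any] struct {
	mu    sync.Mutex
	limit int
	m     map[uint64]V
	h     heights
}

func NewPreFetchCache[V any](limit int) *PreFetchCache[V] {
	return &PreFetchCache[V]{limit: limit, m: make(map[uint64]V)}
}

// Add stores v at the given height. It reports false when the cache is
// full, which tells the pre-fetch goroutine to wait and retry.
func (c *PreFetchCache[V]) Add(height uint64, v V) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.m[height]; ok {
		c.m[height] = v // height already tracked in the heap
		return true
	}
	if len(c.m) >= c.limit {
		return false
	}
	c.m[height] = v
	heap.Push(&c.h, height)
	return true
}

func (c *PreFetchCache[V]) Get(height uint64) (V, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.m[height]
	return v, ok
}

// RemoveAllBefore drops every entry strictly below the given height,
// mirroring a ClearReceiptsCacheBefore call from postProcessSafeL2.
func (c *PreFetchCache[V]) RemoveAllBefore(height uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for c.h.Len() > 0 && c.h[0] < height {
		delete(c.m, heap.Pop(&c.h).(uint64))
	}
}
```

And a sketch of how the pre-fetch loop might drive it, with a WaitGroup in place of the old rate limiter; fetchReceipts here is a hypothetical stand-in for the real L1 receipt fetch:

```go
// Hypothetical pre-fetch loop (same package as above; also needs "time"):
// one goroutine per block height, coordinated by a sync.WaitGroup.
func preFetchRange(cache *PreFetchCache[[]byte], fetchReceipts func(uint64) []byte, start, end uint64) {
	var wg sync.WaitGroup
	for height := start; height <= end; height++ {
		wg.Add(1)
		go func(h uint64) {
			defer wg.Done()
			receipts := fetchReceipts(h)
			// Add reports false while the cache is full: wait for the
			// safe head to advance and evict old heights, then retry.
			for !cache.Add(h, receipts) {
				time.Sleep(time.Second)
			}
		}(height)
	}
	wg.Wait()
}
```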

Example

none

Changes

Notable changes:

welkin22 commented 8 months ago

> When L2UnSafe.L1Origin - L2Safe.L1Origin > 1000, this receiptsCache will cache block receipts in the range [L2Safe.L1Origin, L2Safe.L1Origin+1000), but the block receipts of L2UnSafe.L1Origin will not be cached. Could this result in the sequencer producing blocks more slowly? If yes, is this expected?

Yes, if the distance between unsafe and safe is too large, the cache cannot take care of both. This is not expected behavior, but in our use case 1000 blocks is equivalent to about 50 minutes (1000 L1 blocks at a 3-second block time), and scenarios where the gap is 50 minutes are rare. Unless there is a better solution, I am inclined to ignore it for now.