paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.85k stars 670 forks source link

Refactoring analysis: Analyse the usage of block numbers #53

Open the-right-joyce opened 2 years ago

the-right-joyce commented 2 years ago

Research where block numbers are used (who needs to access block numbers and why)


Current status: https://hackmd.io/@_FY3-hvwQZ6cX_4n8zYUNA/rymYZLZlp

michalkucharczyk commented 1 year ago

BlockId refactoring status and next steps

  1. Some of core traits are already cleaned from BlockId, however there are still some methods that are related to BlockId or block number. The summary of use cases is in the sections to follow.
  2. Reworking header function led to some call sequences that are far from being perfect, example. Namely, in some modules there is a requirement to fetch header using block number, current PR works around this by calling header(hash(n)) combo.
  3. In some cases expect_header(block_id) is called to hide the combo, (as expect_header was not reworked yet)
  4. To solve this "problem" HeaderBackend trait could be extended with dedicated functions for fetching header by block number: -- header_for_number(NumberFor<Block>) -> Result<Option<Block::Header>> -- expect_header_for_number(NumberFor<Block>) -> Result<Block::Header> -- header(Block::Hash) -> Result<Option<Block::Header>> -- expect_header(Block::Hash) -> Result<Block::Header> New naming may also make finding all block-number usages and their refactorization easier. I would follow with this naming convention in needed for the rest of methods (e.g. block, block_status) Implementation of _for_number methods could make use of new service which will move number-hash mapping out of DB.

Where the BlockId is currently used:

Still in client API:

In runtime_api() subsystem:

Refactoring here is probably doable, requires some adjustments in runtime-related macros. Question if we want to get rid of BlockId parameter to runtime functions. This maybe inconvenient in some cases. Maybe we should introduce dual functions (following _for_number approach) at the top-level runtime API.

Other usages

Network tests utils

utils

Transaction pool:

As transaction verification is done via runtime calls, the API for submitting also uses BlockId.

Where Block number is used

not completed yet

client/db/src/lib.rs fn prune_blocks: 1753 fn prune_displaced_branches

client/finality-grandpa/src/warp_proof.rs|100 WarpSyncProof::generate

client/finality-grandpa/src/environment.rs 1249 pub(crate) fn finalize_block

client/rpc/src/chain/mod.rs|83 ChainBackend::block_hash

client/finality-grandpa/src/import.rs|366 GrandpaBlockImport::make_authorities_changes

client/service/src/client/client.rs|1036 Client::block_status

client/network/sync/src/lib.rs|804 ChainSync::on_block_data / AncestorSearch

primitives/blockchain/src/backend.rs|227 trait Backend::best_containing

primitives/blockchain/src/backend.rs|53 HeaderBackend::block_hash_from_id

kianenigma commented 1 year ago

What's the end goal here? to make everything block number? or hash?

bkchr commented 1 year ago

What's the end goal here? to make everything block number? or hash?

Make as much as possible use hash, as hashes are unique. We will not be able to do this for everything as RPC or CLI functionality will always need to pass numbers. However, the "end goal" is that block numbers may go out of the internal handling of the db.

bkchr commented 1 year ago

4. To solve this "problem" HeaderBackend trait could be extended with dedicated functions for fetching header by block number: -- header_for_number(NumberFor<Block>) -> Result<Option<Block::Header>> -- expect_header_for_number(NumberFor<Block>) -> Result<Block::Header> -- header(Block::Hash) -> Result<Option<Block::Header>> -- expect_header(Block::Hash) -> Result<Block::Header> New naming may also make finding all block-number usages and their refactorization easier. I would follow with this naming convention in needed for the rest of methods (e.g. block, block_status) Implementation of _for_number methods could make use of new service which will move number-hash mapping out of DB.

If this helps moving forward with this issue, it should be fine!

michalkucharczyk commented 1 year ago

BlockId refactoring status and next steps (2)

Where the block number is actually used

Some usages may still be missing, but the list shall better then in previous comment:

Where the BlockId is currently used:

Still in client API:

Seems to be fine, just conversion functions.

Implementation

Other usages

Network tests utils

utils

Transaction pool:

As transaction verification is done via runtime calls, the API for submitting also uses BlockId. Since runtime API is reworked this should be easy doable.