skunert commented 1 month ago

The benchmark overhead command should be implemented for frame-omni-bencher.

Currently its not clear how to run benchmark overhead for a parachain.

Omni bencher currently does not support the command. https://github.com/paritytech/polkadot-sdk/blob/6911f50a3424c12f67c909e6c47046354929329b/substrate/utils/frame/omni-bencher/src/command.rs#L134-L134

Polkadot-parachain does not support it either. https://github.com/paritytech/polkadot-sdk/blob/6911f50a3424c12f67c909e6c47046354929329b/cumulus/polkadot-parachain/src/command.rs#L489-L513

And you can not use the polkadot binary because it is not having the same hostfunctions enabled (e.g. ext_storage_proof_size_storage_proof_size_version_1).

Noticed while playing with https://github.com/paritytech/polkadot-sdk/pull/5283

ggwpez commented 1 month ago

Polkadot-parachain does not support it either.

Oh... adding it to the omni-bencher would be good long term, yes. We can probably re-use some mechanisms from the try-runtime-cli since that is also building blocks.
The annoying part are the Inherent Data Providers, since they are on the node side and we dont know in advance what inherent data the runtime needs.
So the omni-bencher needs to inject all possible IDPS that we have (and also the ones from ORML etc). Otherwise if one inherent is missing then the block will not build.

skunert commented 1 month ago

For a start, the omni-bencher could support what the omni-node will support. That should be fine for most use cases.

ggwpez commented 1 month ago

For a start, the omni-bencher could support what the omni-node will support. That should be fine for most use cases.

WDYM? IIUC the omni-node would just call into the shared omni-bencher code. This de-duplicates all of this logic and puts it into one place.
But now thinking about it, the omni-node would be a superset of the omni-bencher. So maybe the omni-bencher would not be needed anymore 🤔
Currently it is nice to have though, since it compiles faster in CI and does not require any features to be enabled.

skunert commented 4 weeks ago

My impression was that we will have no benchmarking related code in the omni node and people will use the omni bencher for benchmarking exclusively.

This is also what is outlined here: https://github.com/paritytech/polkadot-sdk/issues/4966

ggwpez commented 4 weeks ago

My impression was that we will have no benchmarking related code in the omni node and people will use the omni bencher for benchmarking exclusively.

Yes even better. I thought you meant the opposite with this:

For a start, the omni-bencher could support what the omni-node will support. That should be fine for most use cases.

But nevermind, seems like we think the same thing.

kianenigma commented 2 weeks ago

On the benchmark side, everything should be possible to do in frame-omni-bencher.

As noted, omni-node will have no old-school, runtime-coupled benchmarking abilities.

As noted here, we could embed frame-omni-bencher as-is into the omni-node as well.

skunert commented 2 weeks ago

I investigated the steps needed for an omni-node friendly benchmark-overhead command.

Context

The benchmark overhead command has the job of providing us with two things:

Base block weight
Base extrinsic weight

To calculate these, it executes a block and monitors its time and the proof recorded during this time. In addition, it will execute a System::remark extrinsic. Historically, only the polkadot binary supported this command; polkadot-parachain never supported it. I am even skeptical if this value was even benchmarked for most runtimes since I only see the same value everywhere. Maybe it was a reasonable guess, and we copied it everywhere?

Moving Parts

There are multiple moving parts that make this work for arbitrary runtimes.

Inherents

Since we want to support multiple runtimes, we need to supply all the inherents the runtime expects. Most parachain runtimes will have similar inherent requirements. The runtime already provides the inherents to the block-builder. All we need to do is populate the InherentData with the required values for the most popular inherents.

Building the Extrinsic

To support multiple runtimes, we must build the extrinsic dynamically based on the runtime's metadata. My first intuition is to employ subxt, which should be able to build this simple extrinsic dynamically. Alternatively, we could let users provide bytes of the extrinsic to execute.

Block Building & Client

The actual block building and client acquisition should be easy. We can use the same FakeRuntimeApi strategy that we already use in the omni-node.

Initial State

Ideally, the runtime should provide the initial state. As previously discussed, we could rely on some dev profile by default. This profile provides us with a valid genesis state on which to build. Eventually, we should inject more accounts into the genesis state to make the benchmark results more accurate. We also need to ensure that the account we signed the remark with is in the genesis state.

Question: How many accounts do we want to inject?

User Facing

The user interface can extend the benchmark overhead command v1. We also need something like --runtime, similar to pallet benchmarking. For now, we can continue to support the --chain arg to support chains that do not provide any genesis presets in the runtime.

One question that remains open for now is how to handle chains that have custom inherents or more exotic setups. We can likely ignore this in the initial version. Still, we could allow parachain teams extension of inherents and transaction building by having a minimal API, similar to what polkadot-parachain-lib is exposing now. This would mean that not everything is possible with omni-bencher, and integration into omni-node is a must.

For an initial iteration, there is also the idea of providing something like a --tx-bytes flag that allows users to pass the bytes of a prepared transaction to the benchmarking.

ggwpez commented 4 days ago

One question that remains open for now is how to handle chains that have custom inherents or more exotic setups. We can likely ignore this in the initial version. Still, we could allow parachain teams extension of inherents and transaction building by having a minimal API, similar to what polkadot-parachain-lib is exposing now. This would mean that not everything is possible with omni-bencher, and integration into omni-node is a must.

Either that, or they fork the omni-bencher, or there is a --inherent-data KEY:VALUE argument that can be used to inject any arbitrary inherent data.

For an initial iteration, there is also the idea of providing something like a --tx-bytes flag that allows users to pass the bytes of a prepared transaction to the benchmarking.

Yea good idea.

Question: How many accounts do we want to inject?

Probably should be a CLI command with a default. Currently the PoV benchmarking assumes a million entries per map - but it does not actually populate the maps, just extrapolates the proof size.

paritytech / polkadot-sdk

Add `benchmark overhead` to `frame-omni-bencher` #5303

Context

Moving Parts

Inherents

Building the Extrinsic

Block Building & Client

Initial State

User Facing