paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.88k stars 687 forks source link

Transaction weight bloated by 2.5x in the latest release #6403

Open s0me0ne-unkn0wn opened 1 day ago

s0me0ne-unkn0wn commented 1 day ago

The full TransferKeepAlive actual transaction weight, when included in a block, used to be ≈293μsec in parachains using Substrate weights. That included (roughly) base extrinsic weight (107μsec), transaction's benchmarked weight (38μsec), one storage write (100μsec), and one storage read (25μsec), which sums up to 270μsec. I'm not sure where other 23μsec were going but that was close enough.

After merging the latest master, in the very same setup I'm getting 732μsec per transaction :grimacing: Okay, the benchmarked weight went up from 38 to 61μsec due to https://github.com/paritytech/polkadot-sdk/pull/5533#issuecomment-2417331434, but all the other weights have remained the same.

In practice, that means the parachain sTPS degradation from (roughly) 925 TPS to 370 TPS.

What has happened to transaction weights recently, and how can it be fixed?

CC @eskimor @sandreim @skunert

bkchr commented 1 day ago

https://github.com/paritytech/polkadot-sdk/blob/master/substrate/frame/system/src/extensions/weights.rs the extensions have now a weight. Just alone the ones listed there are around ~325μsec. The base transaction weight should now be almost useless, because the extensions report their own weight. Also not sure if the base transaction weight before was only measured or also calculated using 25μsec and 100μsec for reads/writes.

gui1117 commented 1 day ago

This could be due to https://github.com/paritytech/polkadot-sdk/pull/3685

With this PR transaction extension declare their own weight. But base extrinsic calculation hasn't changed and still includes the transaction extensions.

So some weight could be counted 2 times.

How do you obtain this result? By calculating the pre disparch information or by measuring with bench extrinsic?

ggwpez commented 1 day ago

Also not sure if the base transaction weight before was only measured or also calculated using 25μsec and 100μsec for reads/writes.

It was measured, but on an empty snapshot.

The base transaction weight should now be almost useless, because the extensions report their own weight.

👍

georgepisaltu commented 1 day ago

We still need to re-bench the substrate weights, the ones merged are kind of old. Let's try that and see if/how much it improves.

s0me0ne-unkn0wn commented 1 day ago

Just alone the ones listed there are around ~325μsec

Hmmm, but that doesn't add up either. So the old one is 107+38+100+25=270, and the new one is 107+61+100+25+325=618, where 107 is added redundantly. Still interested where the difference between 732 and 618 goes, but otherwise it explains a lot, thank you!

We still need to re-bench the substrate weights, the ones merged are kind of old. Let's try that and see if/how much it improves.

Please don't do that until we have some mitigation for https://github.com/paritytech/polkadot-sdk/pull/5533#issuecomment-2417331434

We've just dropped our TPS by 2.5x with the extension weights, give us some time to recover :sweat_smile:

georgepisaltu commented 1 day ago

I started the effort here but it won't be merged soon, there's still some fixes to be made to the bench bot.

s0me0ne-unkn0wn commented 1 day ago

How do you obtain this result? By calculating the pre disparch information or by measuring with bench extrinsic?

Just submitting transactions and watching their weights in PJS :melting_face:

bkchr commented 1 day ago

Tbh, I don't get why you are not just reducing the read/write db operations constants in your runtime. For benchmarking polkadot, you need to stay inside the 2s limit. How you stay in this limit, what kind of machine you use to build these block, polkadot doesn't care. You just need to stay inside these limits. From the chain perspective these numbers are also not that important, they are only important to ensure that blocks can be imported in a limited amount time.

Tldr, change the weights as you need. You just need to ensure that the block stays in the limits on the relay chain.

gui1117 commented 1 day ago

From substrate weight this is what I read on master:

pub type TxExtension = (
    frame_system::CheckNonZeroSender<Runtime>, //527_000
    frame_system::CheckSpecVersion<Runtime>, // 4_160_000
    frame_system::CheckTxVersion<Runtime>, // 439_000
    frame_system::CheckGenesis<Runtime>, // 4_160_000, 1r // BlockHash block 0 should be white-listed probably
    frame_system::CheckEra<Runtime>, // 6_523_000, 1r // BlockHash current block should also be white-liststed probably
    frame_system::CheckNonce<Runtime>, // 6_000_000, 1r, 1w // Account
    frame_system::CheckWeight<Runtime>, // 4_700_000, 1r, 1w // AllExtrinsicsLen should be whitelisted
    pallet_skip_feeless_payment::SkipCheckIfFeeless<
        Runtime,
        // Native: 35_263_000, 3r
        // `Author` should be whitelisted
        // `Digest` should be whitelisted
        // `NextFeeMultiplier` should be whitelisted
        // (if using asset payment: 113_992_000, 7r, 4w)
        pallet_asset_conversion_tx_payment::ChargeAssetTxPayment<Runtime>,  >,
    frame_metadata_hash_extension::CheckMetadataHash<Runtime>, // 0
);

// Transfer keep alive: 61_290_000, 1r, 1w
// + base_extrinsic

So there 6 read and 1 write are overestimated (should be in cache). Then IIRC the base extrinsic doesn't count the read and writes, so 1 read and 1 write was underestimated (the sender account, to check the nonce)

So it seems we have + 62 μsec + 7 read + 2 write = + 437 from the transaction extensions, so it adds up. We should probably fix it to + 1 read + 1 write (the account read and write was missed before I think). Because 6 read and 1 write should be in cache, and the + 62 μsec should probably be reduced from base extrinsic.

georgepisaltu commented 15 hours ago

For CheckEra, the block hash of the current block is already whitelisted as it is generated using <Pallet<T>>::block_number(). The read should come from here, where we read the birth block.

For CheckGenesis, we set the proof size to 0 in the extension's weight function, but the read weight is still there, so we should fix that.

For CheckWeight, this has been fixed but the weights in master are out of date. See here for some new results.

For the transaction payment extensions, I think it's ok to whitelist those Author, Digest and NextFeeMultiplier for the reads.