thrumdev / blobs

Blobchains on Polkadot and Kusama

Figure out running nodes #16

Closed rphmeier closed 11 months ago

rphmeier commented 12 months ago

Storage:

CPU:

Network:

Memory:

gabriele-0201 commented 11 months ago

MaxBlockSize = ?

MaxBlockSize <= (MaxSize / ProducedBlocks) = 418,383.42 bytes -> ~0.4 MiB

where MaxSize is the total storage budget for one year and ProducedBlocks is the number of blocks produced in that year.
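For reference, here is the arithmetic behind that number. I'm assuming MaxSize = 2 TiB (the 2 TB yearly figure mentioned later in this thread) and a 6-second block time, since those are the values that reproduce 418,383.42:

```rust
// Back-of-the-envelope check. Assumptions: 2 TiB yearly storage budget,
// 6-second blocks (365 * 24 * 3600 / 6 = 5_256_000 blocks per year).
fn main() {
    let max_size = 2.0_f64 * 1024.0 * 1024.0 * 1024.0 * 1024.0; // 2 TiB in bytes
    let produced_blocks = 365.0_f64 * 24.0 * 3600.0 / 6.0;      // 5_256_000 blocks per year
    let max_block_size = max_size / produced_blocks;
    println!("{max_block_size:.2} bytes per block"); // ≈ 418383.42 -> ~0.4 MiB
}
```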

If we expect every block to be completely full, then MaxBlockSize has to be very small to respect the constraint.

We could add another constraint: the average consumed block space is X% of the MaxBlockSize. The data and the constraint would then evolve in this way:

What does this mean? AverageUsedBlockSize will be, as before, at most 0.4 MiB, but we can decide what the MaxBlockSize will be based on how much of the block space we expect the rollups to use on average:

| X%  | MaxBlockSize (MiB) |
|-----|--------------------|
| 10  | 4.0                |
| 20  | 2.0                |
| 30  | 1.3333333          |
| 40  | 1.0                |
| 50  | 0.8                |
| 60  | 0.66666667         |
| 70  | 0.57142857         |
| 80  | 0.5                |
| 90  | 0.44444444         |
| 100 | 0.4                |
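The table follows from the relation MaxBlockSize = AverageUsedBlockSize / (X / 100), with AverageUsedBlockSize fixed at 0.4 MiB; a minimal snippet reproducing it:

```rust
// Reproduces the table above: MaxBlockSize = AverageUsedBlockSize / (X / 100).
fn main() {
    let average_used_block_size = 0.4_f64; // MiB
    for x in (10..=100).step_by(10) {
        let max_block_size = average_used_block_size * 100.0 / x as f64;
        println!("{x:>3}% -> {max_block_size:.8} MiB");
    }
}
```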

Why do I think this metric could be useful? Of course we could simply go with MaxBlockSize = 0.4 MiB, and at the end of the year the total stored would certainly be less than 2 TB, but periods of intense usage followed by quieter ones would not be accounted for, and there would be no room for them.

For example, if X = 10% then MaxBlockSize could be 4 MiB, which would leave a lot of room for rollups to test our chain for a while and then stop, something that could not be handled if we just stick with 0.4 MiB.

Anyway, by default MaxBlockSize is 5 MiB, so I think a good expectation is to have X between 20 and 30, which would mean a MaxBlockSize of about 1.5 MiB.

(I hope that what I wrote makes sense and I've not made any stupid mistake 😅 )

rphmeier commented 11 months ago

Thanks for the writeup, @gabriele-0201.

0.4MB is probably too small as a full limit. I wanted to return to this issue anyway today. The approach of averaging out the block size makes sense, but we should probably limit economically instead: I wouldn't mind if every block was absolutely full, as long as fees are being paid by organic demand, not spam.

We should structure the system such that we target a block fullness level of X while allowing blocks to be as large as 5MB. Blocks which are less full than X cause fees to reduce, and blocks which are more full than X cause fees to increase.

We can consider a block 'economically full' at X = 0.4MB, and target fee increases for anything beyond that. If blocks are consistently larger than 0.4MB, the prices should adapt automatically. And in that case, we should also adapt, since there is very stable demand. In practice, until asynchronous backing lands, we can target 0.8MB. By comparison, the typical Ethereum block is about 60-100KB in size.

The Polkadot fee mechanism can auto-adjust to target a particular block fullness, with both a fast and slow mechanism. We currently import the Polkadot slow adjusting fee mechanism, but should provide our own parameters based on the blob use-case.
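For concreteness, re-parameterizing that mechanism could look roughly like the sketch below. The type names follow the SlowAdjustingFeeUpdate construction in polkadot-runtime-common, the numeric values are placeholders rather than proposed parameters, and the exact generics of TargetedFeeAdjustment vary between Substrate versions:

```rust
// Sketch of a blob-specific parameterization of the slow-adjusting fee update.
// All numeric values below are placeholders, not proposed parameters.
use frame_support::parameter_types;
use pallet_transaction_payment::Multiplier;
use sp_runtime::{FixedPointNumber, Perquintill};

parameter_types! {
    // Target fullness of the normal dispatch class (placeholder).
    pub const TargetBlockFullness: Perquintill = Perquintill::from_percent(25);
    // How fast the multiplier reacts per block (placeholder).
    pub AdjustmentVariable: Multiplier = Multiplier::saturating_from_rational(3, 100_000);
    // Floor: fees get cheap when the chain is idle, but never free.
    pub MinimumMultiplier: Multiplier = Multiplier::saturating_from_rational(1, 10u128);
    pub MaximumMultiplier: Multiplier = Multiplier::saturating_from_integer(10);
}

// In pallet_transaction_payment::Config for the runtime:
// type FeeMultiplierUpdate = pallet_transaction_payment::TargetedFeeAdjustment<
//     Runtime, TargetBlockFullness, AdjustmentVariable, MinimumMultiplier, MaximumMultiplier>;
```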

I also would like to establish a Thrum account on the parachain which receives 5% of all fees. In that case, the demand would directly fund scaling operations.

gabriele-0201 commented 11 months ago

Blocks which are less full than X cause fees to reduce, and blocks which are more full than X cause fees to increase.

Nice! I really like this idea!!

If blocks are consistently larger than 0.4MB, the prices should adapt automatically

Just to make sure: your idea is that if the block is consistently bigger than 0.4 MiB, then we should adjust the target fee to encourage users to use the chain, and then we can increase our 2 TB maximum space because there is effectively demand. Did I understand correctly?

We currently import the Polkadot slow adjusting fee mechanism

So what you described earlier is something that is already implemented in Polkadot, and we can use it. Great! I will try to understand how to tailor it so that we can plug in our own parameters!

rphmeier commented 11 months ago

Just to make sure: your idea is that if the block is consistently bigger than 0.4 MiB, then we should adjust the target fee to encourage users to use the chain, and then we can increase our 2 TB maximum space because there is effectively demand. Did I understand correctly?

Yeah. We can definitely tweak this somewhat and maybe even make the parameters dependent on governance. The SlowAdjustingFeeUpdate struct is a concrete instantiation of the fee scheme described in the W3F research doc linked above, so we should be able to instantiate it slightly differently to achieve the desired effect.

gabriele-0201 commented 10 months ago

TargetedFeeAdjustment::convert (which is used by SlowAdjustingFeeUpdate) uses the current block weight. So c_traffic is adjusted considering ONLY ref_time and proof_size, without taking into account the actual byte size of the block.

But what we need is some sort of economic incentive to not exceed X amount of MiB per block. If there is constant demand, we should increase X and stabilize at a higher value. There is a significant difference between what TargetedFeeAdjustment does and what we want.

Initially, after studying this for only 2-3 days, I would prefer not to dive into a new custom implementation, because there are many small things that could go wrong if c_traffic is not updated correctly. So I started to think of solutions that integrate with or modify the existing one to fit our needs. Here's what I came up with:

[definition: Y(c_traffic, v, s, s*) = c_traffic * (1 + v * (s - s*) + v^2 * (s - s*)^2 / 2), where s is the current fullness and s* is the target fullness]
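(A direct transcription of Y, using f64 instead of the runtime's fixed-point type, just to make the formula concrete:)

```rust
// Y(c_traffic, v, s, s*) in plain floating point, purely illustrative.
fn y(c_traffic: f64, v: f64, s: f64, s_star: f64) -> f64 {
    let d = s - s_star;
    c_traffic * (1.0 + v * d + (v * v * d * d) / 2.0)
}
```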

  1. Create a wrapper over TargetedFeeAdjustment and:
    • If the block size is, on average, between some lower bound and X, then call the standard TargetedFeeAdjustment to evaluate the new c_traffic.
    • If the current block size is considerably close to X, then override the TargetedFeeAdjustment behavior and use the same formula Y(c_traffic, v, s, s*), but with s being the size of the block and s* being the target size.

Why this? The reasoning is: if the chain sees average usage, then follow the normal TargetedFeeAdjustment; otherwise, if there are peaks or very low usage, make the c_traffic decision depend on the block size usage. Of course, this idea makes it easy to update X if there is consistent demand.

  2. The implementation of TargetedFeeAdjustment uses the same formula Y, passing the ref_time or the proof_size based on which one is closer to its limit. A possible implementation of BlobsAdjustmentFees is to copy-paste the code from TargetedFeeAdjustment, adding a third possible limiting factor, the block_size.

Even in this case, updating X over time if needed is doable. The only problem is that if I make a mistake implementing this, the chain could collapse. But it's a challenge I could accept!

  3. Find a relationship between TargetBlockFullness (a parameter used by TargetedFeeAdjustment to define the saturation target) and X. I couldn't find a DIRECT relationship, for two main reasons:
    • There could be other extrinsics besides submit_blob in the set of executed normal extrinsics.
    • The weight of submit_blob does not depend only on the size of the blob: the only computation that depends on the blob is the hash evaluation, while the extrinsic does many other things.

A possible solution could be: weight_per_byte = current_consumed_weight / total_extrinsic_len

TargetBlockFullness = weight_per_byte * X

Then, call the standard TargetedFeeAdjustment but pass a different TargetBlockFullness each block.

I don't think this relation makes perfect sense, but this solution would be ideal if I could find a better relationship between TargetBlockFullness and X. In that case it would be perfect, because we would use the already battle-tested math, just with some custom values that we update based on block size usage.
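A rough sketch of how that per-block relation could be computed (the helper and its inputs are hypothetical, not existing code):

```rust
use frame_support::weights::Weight;

/// Hypothetical helper: derive a ref_time target corresponding to X bytes,
/// using the current block's average weight-per-byte.
fn target_ref_time(current_consumed_weight: Weight, total_extrinsic_len: u64, x_bytes: u64) -> u64 {
    if total_extrinsic_len == 0 {
        return 0;
    }
    let weight_per_byte = current_consumed_weight.ref_time() / total_extrinsic_len;
    // TargetBlockFullness would then be this value expressed relative to the
    // block's maximum ref_time.
    weight_per_byte.saturating_mul(x_bytes)
}
```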

rphmeier commented 10 months ago

My initial thought is that we should still base the fee adjustment off of weight and not just block size, as there are other kinds of transactions that will take up weight. For example, balance transfers, or XCM reserve transfers (in or out).

Substrate weights already account for bytes in transactions, though the exact formula I don't recall. They are already designed to encompass a variety of different factors: block size, computation, and database accesses. I recommend digging into the Substrate implementation to discover the specifics.

As to fee adjustment - one thing on my mind, early on, is that we may want the collator implementation to skip blocks when there are no transactions. This would change the fee adjustment logic slightly, in that we'd have to implicitly consider those skipped blocks as having zero weight. We won't do this right away, but it should be considered in the implementation as something that might be added soon.

Since there is already an implicit conversion between weight and bytes, I believe the best path is to use a parameterization of TargetedFeeAdjustment which targets X bytes worth of weight. It is not perfect, as we may raise fees too aggressively when there is some (negligible) computation being done, but it will be close enough for a first version. Alternatively, we could have a separate weight class or fee adjustment protocol which applies only to the submit_blob, but then we need to ensure that the fee classes are kept in sync (EIP 4844 might be a useful reference, as well as https://angeris.github.io/papers/block-space.pdf ).

The other thing is that we need to make sure that there is a sensible floor on fees. If fees can adjust e.g. 30% per day and trend towards zero when unused, we could end up in a situation where it takes multiple days for fees to react to demand, as we have a long way to climb up the exponential curve. I favor a slightly faster fee update anyway, as the rollup approach is intended to amortize costs for end-users. For parameterization, I suggest we set it to adjust a maximum of 30% per hour.

We should not anticipate consistent demand - historically, blockspace demand is extremely volatile, though there is often a base level of demand which usage never goes below.

In summary - the solution space here is quite large. I think for an initial version we should simply piggyback on the existing TargetedFeeAdjustment with some tweaks to parameters:

- Target block fullness set based on the existing bytes-to-weight conversion done by Substrate/Cumulus
- Adjust more quickly
- Set a floor which allows for very cheap but not free transactions

Later on, we can look into decoupling Substrate's assumptions about the relationship between computation and data and build in a more customized multidimensional fee market.

gabriele-0201 commented 10 months ago

Target block fullness set based on the existing bytes-to-weight conversion done by Substrate/Cumulus

I looked into how fees and weights work in Substrate to find the bytes-to-weight relationship you describe, and this is what I found:

Currently, pallet_transaction_payment uses the following formula:

inclusion_fee = base_fee + length_fee + [targeted_fee_adjustment * weight_fee];

final_fee = inclusion_fee + tip;
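A toy example with made-up numbers, only to show how the pieces combine (the real terms are Balance amounts computed by the runtime):

```rust
// Made-up values, purely to illustrate the inclusion_fee formula above.
fn main() {
    let base_fee = 1_000u128;               // flat per-extrinsic fee
    let length_fee = 256u128 * 10;          // 256 bytes at 10 units per byte
    let weight_fee = 50_000u128;            // fee derived from #[pallet::weight(...)]
    let targeted_fee_adjustment = 1.25_f64; // current multiplier
    let inclusion_fee =
        base_fee + length_fee + (targeted_fee_adjustment * weight_fee as f64) as u128;
    let tip = 0u128;
    println!("final_fee = {}", inclusion_fee + tip); // 1_000 + 2_560 + 62_500 = 66_060
}
```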

So the length of the extrinsic is used directly here, but as a 'fixed' factor: length_fee depends only on the length of the extrinsic, not on congestion.

[targeted_fee_adjustment * weight_fee] is the interesting part. targeted_fee_adjustment is the value that we can change every block to make inclusion_fee dynamic, so that it depends on the congestion of the network. weight_fee is the result of evaluating what is inside the #[pallet::weight] annotation on top of the extrinsic.

We have control over targeted_fee_adjustment thanks to the trait pallet_transaction_payment::MultiplierUpdate, which is used to convert the previous targeted_fee_adjustment into the new one inside on_finalize. TargetedFeeAdjustment implements the conversion using some parameters. In particular, the type TargetBlockFullness refers to the portion of the NORMAL_DISPATCH_RATIO that we adjust the fees with. To explain what portion means, let's talk about weight.

TargetedFeeAdjustment uses <frame_system::Pallet<T>>::block_weight() to extract the current weight usage of the block, and then it extracts the weight associated with normal dispatchable. The following is the Weight struct:

pub struct Weight {
    #[codec(compact)]
    /// The weight of computational time used based on some reference hardware.
    ref_time: u64,
    #[codec(compact)]
    /// The weight of storage space used by proof of validity.
    proof_size: u64,
}

The weight is constructed in the function get_dispatch_info, called on the Call::submit_blob enum variant, and it is built as follows (pseudo code, because the output of the macro is very long):

let __pallet_base_weight = call what's inside #[pallet::weight(...)];
let weight_fee = call to frame_support::dispatch::WeighData::weigh_data, providing __pallet_base_weight and the arguments of the extrinsic;

The trait frame_support::dispatch::WeighData is implemented for Weight in the following way:

impl<T> WeighData<T> for Weight {
    fn weigh_data(&self, _: T) -> Weight {
        *self
    }
}

weight_fee is not modified and thus will be the same as the output of what's inside #[pallet::weight(...)]. This is only the weight of a single extrinsic, while <frame_system::Pallet<T>>::block_weight() returns the sum of the weights of the extrinsics executed in the block. The portion I talked about earlier refers to the limiting dimension: "If the proof size is the limiting dimension, then the multiplier is adjusted by the proof size. Otherwise, it is adjusted by the ref time" (extracted from TargetedFeeAdjustment::convert).
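In other words, the choice of limiting dimension boils down to something like the following (a simplified paraphrase, not the pallet's actual code, which works in fixed-point arithmetic):

```rust
use frame_support::weights::Weight;

// Simplified paraphrase: whichever dimension of the normal-class weight is
// closer to its configured limit drives the multiplier adjustment.
fn limiting_saturation(normal: Weight, max: Weight) -> f64 {
    let ref_time_ratio = normal.ref_time() as f64 / max.ref_time() as f64;
    let proof_size_ratio = normal.proof_size() as f64 / max.proof_size() as f64;
    ref_time_ratio.max(proof_size_ratio)
}
```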

All of this is to explain why I couldn't find the relationship you describe. The traffic multiplier is applied to the weight, which is composed of two factors that do not seem to be directly associated with the extrinsic's byte length. What I've done in #143 is try to find a way to associate the two. While writing this, I've realized some errors I made in the initial implementation!

I'm curious to know if I'm missing something!

Adjust more quickly

Following the formula found here (https://research.web3.foundation/Polkadot/overview/token-economics#2-slow-adjusting-mechanism):

p = 0.30 (fees should increase at most 30% in one hour)
k = (60 * 60) / 12 = 300 (blocks produced in one hour)
s* = TargetBlockFullness (if we follow the approach in #143, this value will change each block)
v = p / (k * (1 - s*))

Then, v will ensure that the fees change at most 30% each hour.
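Plugging in a target fullness of 25% as an illustrative value:

```rust
// Worked example; s* = 0.25 is an arbitrary illustrative target.
fn main() {
    let p = 0.30_f64;                 // max fee change per period
    let k = 60.0 * 60.0 / 12.0;       // 300 blocks per hour at 12s block time
    let s_star = 0.25_f64;            // target block fullness
    let v = p / (k * (1.0 - s_star)); // ≈ 0.001333 per block
    println!("v = {v}");
}
```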

Set a floor which allows for very cheap but not free transactions

With this floor, transactions will be significantly cheaper than what their assigned weight would normally cost, but not free.

    // Declared inside parameter_types! { ... }: a floor of 1/10 on the fee multiplier.
    pub MinimumMultiplier: Multiplier = Multiplier::saturating_from_rational(1, 10u128);

This would definitely require testing, given the variable v

rphmeier commented 10 months ago

All of this is to describe how I didn't find the relationship you describe

I was mistaken - thanks for chasing down the underlying truth. Given that weights do not relate to the underlying encoded length of the extrinsic, I took a look at configuring the transaction_payment pallet to charge fees based on size more intelligently. I'm also not deeply familiar with FRAME - most of my work at Parity was outside of FRAME and focused on consensus implementations.

There are two parts to this problem. The first is applying fees according to the size, and the second is updating fee multipliers according to the size. We should try to solve these problems in ways that encompass all transactions in the runtime.

I have spent some time reviewing the tools in our toolkit:

  1. Apply an adaptive multiplier in pallet_transaction_payment::Config::LengthToFee
  2. Plug in custom fee adjustment logic in the pallet_transaction_payment::Config::FeeMultiplierUpdate
  3. Adapt the Blobs::submit_blob weight with a custom WeighData that accounts for transaction length in the weight.

1. Apply an adaptive multiplier in LengthToFee

This tool only addresses applying fees and it automatically applies to every single call in the entire runtime. In our runtime this is currently a constant. We can make it dynamic.

pallet_transaction_payment invokes this hook for every single dispatched transaction. Here, we can either draw directly from the pallet_transaction_payment::NextFeeMultiplier (and use this as a hybrid with approach (2)) or have a separate multiplier in storage as a parameter type.
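A sketch of what the hybrid could look like, assuming we read pallet_transaction_payment's NextFeeMultiplier; the per-byte constant and the struct name are made up, and the Balance type is hard-coded for brevity:

```rust
use core::marker::PhantomData;
use frame_support::weights::{Weight, WeightToFee};
use sp_runtime::FixedPointNumber;

/// Hypothetical adaptive length-fee: a constant per-byte price scaled by the
/// multiplier that pallet_transaction_payment already maintains.
pub struct AdaptiveLengthToFee<T>(PhantomData<T>);

impl<T: pallet_transaction_payment::Config> WeightToFee for AdaptiveLengthToFee<T> {
    type Balance = u128; // hard-coded for the sketch

    fn weight_to_fee(length_weight: &Weight) -> Self::Balance {
        const PER_BYTE_FEE: u128 = 1_000; // placeholder per-byte price
        // The pallet passes the encoded length packed into ref_time.
        let base = PER_BYTE_FEE.saturating_mul(length_weight.ref_time() as u128);
        let multiplier = pallet_transaction_payment::Pallet::<T>::next_fee_multiplier();
        multiplier.saturating_mul_int(base)
    }
}
```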

2. Plug in custom fee adjustment logic in the FeeMultiplierUpdate

This addresses updates of fee multipliers. The TargetedFeeAdjustment already has some logic we can adapt for choosing between two limiting factors: the execution time and the storage proof. In principle, the storage proof might contain the block data itself, as this is shipped off to relay chain validators, but in practice it only accounts for reads from the database.

My preferred approach here is similar to what was done in #143: rather than choosing between those two limiting factors, we add a third. The difference is in the choice of the limiting factor. In #143, pallet_blobs::TotalBlobSize is used as the additional factor. I suggest that we instead use frame_system::Pallet::<T>::all_extrinsics_len(), which is a superset of the TotalBlobSize.
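Concretely, the ported convert logic could extend the dimension choice along these lines (a simplified sketch; only all_extrinsics_len is real, the helper is illustrative, and the actual port would stay in fixed-point arithmetic):

```rust
use frame_support::weights::Weight;

// Simplified sketch: add byte length as a third candidate limiting dimension.
fn limiting_saturation_with_len<T: frame_system::Config>(
    normal: Weight,
    max: Weight,
    max_len_bytes: u32,
) -> f64 {
    let len = frame_system::Pallet::<T>::all_extrinsics_len() as f64;
    let ref_time_ratio = normal.ref_time() as f64 / max.ref_time() as f64;
    let proof_size_ratio = normal.proof_size() as f64 / max.proof_size() as f64;
    let len_ratio = len / max_len_bytes as f64;
    ref_time_ratio.max(proof_size_ratio).max(len_ratio)
}
```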

3. Adapt the Blobs::submit_blob with a custom WeighData

This tool allows us to update the fees of submit_blob according to the blob size, and lets the stock TargetedFeeAdjustment keep control of the adjustment.

Every call in every pallet has a #[pallet::weight] directive which allows custom logic to be provided. We could augment submit_blob with a custom weigher that attempts to transform the blob size into a ref time and proof size. See frame/examples/basic. This is clunky and only applies to the submit_blob extrinsic. However, it would allow us to stick with a simple re-parameterization of TargetedFeeAdjustment rather than a port. For those reasons, I suggest we skip this tool.

Recommendations

With that in mind, I suggest that we use a combination of tool (1) and tool (2): we would use the NextFeeMultiplier from pallet_transaction_payment as a byte-length multiplier (with a hardcoded per-byte floor) and adjust the logic of TargetedFeeAdjustment to manage the 3 limiting factors: ref time, storage proof size, and byte size, with the expectation that byte size will be the real limiting factor. No special-casing will be done for submit_blob, and its weigher will focus entirely on standard weight.
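In runtime-config terms, the combination would look roughly like this (hypothetical type names: BlobsFeeAdjustment stands in for the ported/extended TargetedFeeAdjustment, AdaptiveLengthToFee for the multiplier-aware length fee):

```rust
// Hypothetical wiring of tools (1) and (2); names are placeholders.
// impl pallet_transaction_payment::Config for Runtime {
//     type LengthToFee = AdaptiveLengthToFee<Runtime>;        // tool (1): multiplier-scaled per-byte fee
//     type FeeMultiplierUpdate = BlobsFeeAdjustment<Runtime>;  // tool (2): 3-dimension adjustment port
//     // ...other associated types unchanged...
// }
```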

This approach will also more easily allow us to adapt to #147 by accounting for skipped blocks in the fee multiplier.

In the future, this architecture will make it simple to split these multipliers into two according to the approaches in the literature.