Don‘t understand „free SSTORE“ framing

TimDaub commented 1 year ago

My noob-ish understanding of why SSTORE on ETH is expensive is that for proper decentralization, a node must be runnable by anyone who wants to. I‘m also understanding, that for the average node operator, at least today, the limiting factor is provisioning a machine with lots of disk space.

So two factors are limiting ETH to allow more tps:

How much disk space is currently used
How much additional disk space is necessary in the future (throughput).

In the Kindelua whitepaper, I then read:

Apps that involve huge amounts of state changes, such as games and exchanges, are too expensive on Ethereum, due to the high cost of the SSTORE opcode. Moreover, persisting state on Ethereum is complex and error-prone, due to the need of serializing structures into a map of 256-bit integers. Kindelia replaces the Merkle stores by reversible snapshots of HVM's heap, which makes the SSTORE opcode essentially free.

But my understanding had never been that persisting state in merkle trees has been the defining bottle neck. I can believe that this is a problem, but generally, I always thought that the limiting factor is configuring the ETH client such that a home staker can run it.

So in that case, if a hypothetical ETH implementation would integrate HVM‘s heap snapshots, they wouldn‘t reduce the gas cost of SSTORE, would they? Because the SSTORE has to be priced relatively to state growth. Or what am I missing?

steinerkelvin commented 1 year ago

A problem with Ethereum storage is that it mixes charging gas for two different things: computation and storage space.

This design does a couple of things differently to achieve better viability of IO-intensive apps.

Firstly, we have a compute limit and a separate storage growth limit. In a usual computer, those two things are different; your SSD can be writing at 100% and filling a lot of space, and your CPU would still be (mostly) free to keep doing its computations independently. So, each mined block issues some amount of computation users can perform plus an independent amount of space they can fill up.

In an eventual market of block resources, miners would charge for them independently. If your code uses a lot of computation but does not allocate space, the fixed amount of space issued can still be fully used by another code (that uses very little computation, perhaps). On Ethereum, when doing a lot of IO, you greatly limit the amount of computational work available as the IO cost must be high — orders of magnitude higher — to limit disk space usage, exactly as you said. Here, that doesn't happen. We have a fixed amount of space growth per block, and that's it.

Secondly, on Ethereum, you must pay to write to disk. Here you would have to pay only for the amount of allocated space. Suppose your contract changes its state but ends up filling the exact amount of space it was before (think of a map of user balances where you only swap a user's balance number); you don't pay for storage or IO. You only pay the mana (computation) you need to modify the term/expression. The act of updating the contract state is only pointing to this new expression (which is already in memory as it'll be stored), which makes it effectively free.

Contracts can even de-allocate space, which can be reused and sold. The only limit here is the global storage space limit, whose growth is indeed set to some kinda arbitrary number to keep the node able to run on cheap hardware.

VictorTaelin commented 1 year ago

If Ethereum adopted heap snapshots, that would be equivalent to it persisting MSTORE'd values after the transaction ends. If that was the case, reused SSTORE cost would decrease 100-fold, while allocation SSTORE cost would only decrease slightly. The entire point is that reused SSTORE doesn't grow memory at all, but still costs a lot (5000 gas) because merkle tree insertion is very expensive computationally. We just skip that.

TimDaub commented 1 year ago

OK thanks, this was actually really helpful, especially

The entire point is that reused SSTORE doesn't grow memory at all, but still costs a lot (5000 gas) because merkle tree insertion is very expensive computationally.

So now that his was cleared up by you, I've looked around the whitepaper and whitebook: But I haven't found a technical description of these reversible heaps. I know they're coming from the HVM, but say I wanted to build a state sync engine and use the Kindelia construction instead of Patricia Merkle trees, where would I find information on how to do it?

steinerkelvin commented 1 year ago

@TimDaub

Kindelia's memory uses an overlay of buffers to keep snapshots of previous states, very much akin to filesystem overlays.

drawing drawing

The concept translates directly: it gets memory cells from the upper layers if they are present and fallbacks to lower layers if not.

The naive way to keep these layers would be to add a new layer for each tick/block so we can reverse to any previous tick. But that way, not only the number of layers would grow linearly as time passes, but also memory accesses would be O(t).

We use a simple algorithm to keep only O(log(t)) layers (and so also logarithmic read cost). It's very simple, and I plan to write about it, but right now, I can only provide you with this reference algorithm.

kindelia / Kindelia

Don‘t understand „free SSTORE“ framing #251