chainwayxyz / citrea

Citrea, Bitcoin's First ZK Rollup 🍊🍋
https://citrea.xyz
GNU General Public License v3.0

Stateful Statediff: RepeatedWrites (compress keys) #1512

Open kpp opened 6 days ago

kpp commented 6 days ago

Introduction

This issue belongs to https://github.com/chainwayxyz/citrea/issues/1237.

Value compression (U256 and code hash) is implemented in https://github.com/chainwayxyz/citrea/issues/1505 and https://github.com/chainwayxyz/citrea/issues/1510.

So now, when calculating the statediff for evm::Account(address), we store [address(20) diff(balance) diff(nonce) diff(code_hash)] instead of [address(20) balance(32) nonce(8) code_hash(33)]. For evm::storage(address, key, value) we store [address(20) key(32) diff(value)] instead of [address(20) key(32) value(32)].

However, there is room to optimize how we store addresses. zkSync (https://docs.zksync.io/build/developer-reference/handling-pubdata-in-boojum) stores a u64 as the key for RepeatedWrites (a write to a key that was already written before).

zksync uses this struct:

pub struct StateDiffRecord {
    /// address state diff occurred at
    pub address: Address,
    /// storage slot key updated
    pub key: U256,
    /// derived_key == Blake2s(bytes32(address), key)
    pub derived_key: [u8; 32],
    /// index in tree of state diff
    pub enumeration_index: u64,
    /// previous value
    pub initial_value: U256,
    /// updated value
    pub final_value: U256,
}

And their account::{balance, nonce, code_hash} fields do have synthetic ("fake") keys in EVM storage:

pub fn get_nonce_key(account: &Address) -> StorageKey {
    let nonce_manager = AccountTreeId::new(NONCE_HOLDER_ADDRESS);

    // The `minNonce` (used as nonce for EOAs) is stored in a mapping inside the `NONCE_HOLDER` system contract
    let key = get_address_mapping_key(account, H256::zero());

    StorageKey::new(nonce_manager, key)
}

Benefits: zkSync stores account data in evm::storage. We don't do that (yet?). Their evm::storage(address, key, value) entry is [derive_key(address(20), key(32)) diff(value)] for InitialWrites and [index(u64) diff(value)] for RepeatedWrites, instead of [address(20) key(32) value(32)].
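To make the InitialWrites / RepeatedWrites split concrete, here is a minimal sketch of how a key could be compressed. CompressedKey and KeyIndex are hypothetical names, not existing Citrea or zkSync types. Handing out indices in first-write order starting from 0 is an assumption for illustration only; it does not answer where the index should actually come from in our tree.

```rust
use std::collections::HashMap;

/// Hypothetical compressed key, following the zkSync-style split described above.
#[derive(Debug, PartialEq, Eq)]
pub enum CompressedKey {
    /// First write to this key: the full 32-byte derived key is published.
    Initial([u8; 32]),
    /// Key was written before: only its 8-byte index is published.
    Repeated(u64),
}

impl CompressedKey {
    /// Number of key bytes this variant contributes to the state diff
    /// (ignoring any tag byte needed to tell the variants apart).
    pub fn encoded_len(&self) -> usize {
        match self {
            CompressedKey::Initial(_) => 32,
            CompressedKey::Repeated(_) => 8,
        }
    }
}

/// Hypothetical registry that assigns indices in first-write order.
/// ASSUMPTION: indices start at 0 and grow by 1 per new key.
#[derive(Default)]
pub struct KeyIndex {
    indices: HashMap<[u8; 32], u64>,
    next: u64,
}

impl KeyIndex {
    /// Returns `Repeated(index)` if the key is already known; otherwise
    /// registers it under the next free index and returns `Initial(key)`.
    pub fn compress(&mut self, derived_key: [u8; 32]) -> CompressedKey {
        if let Some(&index) = self.indices.get(&derived_key) {
            return CompressedKey::Repeated(index);
        }
        self.indices.insert(derived_key, self.next);
        self.next += 1;
        CompressedKey::Initial(derived_key)
    }
}
```

Under these assumptions the key part of a repeated evm::storage write shrinks from 52 bytes (address(20) + key(32)) to 8, and an initial write shrinks to 32.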

Conclusion: We only need to optimize keys for evm::Account and evm::storage. Keys for the other read/write operations don't need this treatment because those operations can be compressed using different techniques.

Integration

When we deal with storage keys we calculate derived_key:

        // Compute the jmt update from the write batch
        let batch = state_accesses
            .ordered_writes
            .into_iter()
            .map(|(key, value)| {
                let key_hash = KeyHash::with::<H>(key.key.as_ref()); // <- This is our derived key

                let key_bytes = Arc::try_unwrap(key.key).unwrap_or_else(|arc| (*arc).clone());
                let value_bytes =
                    value.map(|v| Arc::try_unwrap(v.value).unwrap_or_else(|arc| (*arc).clone()));

                diff.push((key_bytes, value_bytes.clone()));

                (key_hash, value_bytes)
            })
            .collect::<Vec<_>>();

However, I have no idea what the index for RepeatedWrites would be in our case.

The issue will be updated.

TBD

kpp commented 2 days ago

From Esad:

Instead of working inside compute_state_update, since we work over N blocks in a single batch proof, after we run every block we invoke a new function:

    compute_compressed_state_diff(
        first_l2_prev_state_root_in_circuit,
        compression_witness,
        new values
    )

first_l2_prev_state_root_in_circuit: prev state root of the first soft confirmation run in the circuit.
compression_witness: val + proof for the initial value of the key.
new values: the new value for each key.

So we have (old, new) for every key, and we do compress_statefull(old, key). The compression witness is prepared separately from the witness or offchain witness and passed as an input to the circuit (or we can simply append another witness to the BatchProofCircuitInput::witnessed field, since it's already a Vec over Witness).

This would run at the end of run_sequencer_commitments_in_da_slot, or at the end of apply_soft_confirmations_from_sequencer_commitments.
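A rough sketch of what the batch-level function could look like, to make the (old, new) flow concrete. Everything here is an assumption for illustration. Values are u128 instead of U256 to keep the arithmetic in std. ValueDiff and diff_value are hypothetical, and the real diff encoding from #1505 and #1510 may differ. The witness is modelled as a plain map of already-verified old values, so checking proofs against first_l2_prev_state_root_in_circuit is not modelled at all.

```rust
use std::collections::BTreeMap;

/// Hypothetical value diff; not the actual encoding used in Citrea.
#[derive(Debug, PartialEq, Eq)]
pub enum ValueDiff {
    /// Key has no old value in the witness: publish the new value as is.
    Set(u128),
    /// new = old + delta
    Add(u128),
    /// new = old - delta
    Sub(u128),
}

/// Diffs one value against its old value (if any).
/// Returns `None` when the value did not change across the whole batch.
pub fn diff_value(old: Option<u128>, new: u128) -> Option<ValueDiff> {
    match old {
        None => Some(ValueDiff::Set(new)),
        Some(old) if old == new => None,
        Some(old) if new > old => Some(ValueDiff::Add(new - old)),
        Some(old) => Some(ValueDiff::Sub(old - new)),
    }
}

/// Sketch of the proposed function, called once after all N blocks have run.
/// ASSUMPTION: `compression_witness` holds old values whose proofs were
/// already checked against the state root, so the root is unused here.
pub fn compute_compressed_state_diff(
    _first_l2_prev_state_root_in_circuit: [u8; 32],
    compression_witness: &BTreeMap<Vec<u8>, u128>,
    new_values: &BTreeMap<Vec<u8>, u128>,
) -> Vec<(Vec<u8>, ValueDiff)> {
    new_values
        .iter()
        .filter_map(|(key, &new)| {
            let old = compression_witness.get(key).copied();
            diff_value(old, new).map(|diff| (key.clone(), diff))
        })
        .collect()
}
```

One side effect of diffing once per batch rather than per block, at least in this sketch: a key that is written during the batch but ends up with its original value produces no entry at all.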