Define batch kernel inputs/outputs

Fumuran commented 5 days ago

What should be done?

Define the kernel inputs and outputs for the batch of transactions.

How should it be done?

To be clarified

When is this task done?

To be clarified

Additional context

No response

bobbinth commented 1 day ago

The purpose of the batch kernel is to generate a single proof that we've verified some number of proven transactions. This would involve recursively verifying individual transaction proofs, which in turn, would require access to public inputs/outputs of the transaction kernel. As a reminder, these look like:

Inputs:  [BLOCK_HASH, account_id, INITIAL_ACCOUNT_HASH, INPUT_NOTES_COMMITMENT]
outputs: [OUTPUT_NOTES_COMMITMENT, FINAL_ACCOUNT_HASH, tx_expiration_block_num]

Where:

INPUT_NOTES_COMMITMENT is a sequential hash of (nullifier, empty_word_or_note_hash) tuples with empty_word_or_note_hash set to hash(note_id, note_metadata) for notes which are not authenticated by the transaction kernel.
OUTPUT_NOTES_COMMITMENT is a sequential hash of (note_id, note_metadata) tuples.

Batch kernel inputs

Probably the simplest approach is to set batch kernel inputs to a commitment of transactions included in the batch. This could be computed as a sequential hash of transaction_ids (a transaction ID is a commitment to initial/final account states and input/output note commitments). The kernel then would "unhash" these IDs non-deterministically into the underlying transaction inputs/outputs and verify the proofs against them.

Batch kernel outputs

Public outputs of the batch kernel need to include the following:

Commitment to all account modifications performed in this batch.
Commitment to all input notes consumed in this batch.
Commitment to all output notes produced by this batch.
batch expiration block number.

Account updates commitment

For each modified account, we need to capture the following information at the minimum:

Account ID.
Initial state before any transactions where applied to the account.
Final state after all the transactions were applied to the account.
List of transactions which affected the account.

However, it may be beneficial for forward-lookin purposes, keep track of which transaction resulted in which state. In this case, we could compute the commitment as follows:

hash(account_id || initial_state || tx_id_0 || state_after_tx_0 || tx_id_1 || state_after_tx_1 ...)

If we were to lay this out in terms of words, it would look something like this:

words 0, 1: [account_id, 0, 0, 0, INITIAL_ACCOUNT_HASH]
words 2, 3: [TX_0_ID, ACCOUNT_HASH_AFTER_TX_0]
words 4, 5: [TX_1_ID, ACCOUNT_HASH_AFTER_TX_1]

This arrangement also allows us to always hash double words which will be pretty efficient.

Input notes commitment

The simplest way to track input note commitments is to basically use the same scheme as used in the transaction kernel (i.e., hash of (nullifier, empty_word_or_note_hash) tuples). However, here too it may be beneficial to track ID of the transactions which consumed the notes. We could do this by laying out each not info as follows:

[TX_ID, NULLIIFER, NOTE_ID, NOTE_METADATA]

For authenticated notes NOTE_ID and NOTE_METADATA would be zeros.

Output notes comment

Commitment to the output notes must be a root of the batch note SMT - so, there isn't much flexibility here. However, if we wanted to track which transaction produce which output note, we'd also need a separate commitment to that. This could be a sequential hash of note data where each note is defined as:

[TX_ID, NOTE_ID]

Open questions

Is there a better way to track associations between transactions and account updates, input notes, and output notes?

bobbinth commented 1 day ago

A potentially simpler option is to rely on the transaction ID commitments to link transactions to accounts, input notes, and output notes. Then, batch kernel inputs/outputs would look roughly like so:

Inputs:  [TRANSACTIONS_COMMITMENT]
Outputs: [INPUT_NOTES_COMMITMENT, OUTPUT_NOTES_SMT_ROOT, batch_expiration_block_num]

Where:

TRANSACTIONS_COMMITMENT is just a sequential hash of (transaction_id, account_id) tuples.
INPUT_NOTES_COMMITMENT is a sequential hash of (nullifier, empty_word_or_note_hash) tuples.

Conceptually, inputs into the batch kernel would be a set of ordered ProvenTransaction objects, and the output would be something like a ProvenBatch object. This object would contain a set of VerifiedTransaction objects which could look something like this:

pub struct VerifiedTransaction {
    id: TransactionId,
    account_update: TxAccountUpdate,
    input_notes: InputNotes<InputNoteCommitment>,
    output_notes: OutputNotes,
}

A couple of general notes:

One of the reasons why we care about keeping track of transaction level info is that in the future we may need to resolve conflicts between batches (e.g., when two batches contain duplicate or conflicting transactions).
INPUT_NOTES_COMMITMENT is a separate output because the set of notes committed by this hash could be different from the set of notes committed to via the TRANSACTIONS_COMMITMENT. The reason for this is that some output notes may be consumed inside the batch (via unauthenticated input notes) and such notes will be eliminated from the list of input notes of the batch.

0xPolygonMiden / miden-base