AztecProtocol / aztec-packages

Apache License 2.0
184 stars 184 forks source link

refactor: showcase recursion case for transfer #7271

Closed LHerskind closed 2 months ago

LHerskind commented 3 months ago

Fixes #7142 and #7362.

Instead of doing direct recursion on the transfer I have altered the balances_map slightly to include a new sub function that allow a reduction of notes WITHOUT inserting the change note. We can then recurse over this new sub function to spend notes until we get what we need and then insert a change note after that.

Notice that the transfer is consuming at most 2 notes, but that the _accumulate might consume up to 8!

The reason for this is fairly simple, the transfer would be inserting notes that are instantly consumed but spend plenty of juice on constraining the encryption first. Instead we can just spend a lot of notes with _accumulate without creating any new notes, and then return a final change value for the transfer to add for the from.

Since there are less "overhead" and we are just consuming notes in _accumulate we can spend the same number of constraints just consuming notes as we usually spend on consuming, encrypting and updating state.


During the testing figured something interesting, that might be an issue with compiler optimisation.

// Doing this 
let call = Token::at(context.this_address())._accumulate(from, missing.to_field());
if !missing.eq(U128::zero()) {
    change = call.call(&mut context);
}

// Is usually MUCH (many thousand constraints) cheaper than doing this
if !missing.eq(U128::zero()) {
    change = Token::at(context.this_address())._accumulate(from, missing.to_field()).call(&mut context);
}

LHerskind commented 3 months ago

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @LHerskind and the rest of your teammates on Graphite Graphite

AztecBot commented 3 months ago

Benchmark results

Metrics with a significant change:

Detailed results All benchmarks are run on txs on the `Benchmarking` contract on the repository. Each tx consists of a batch call to `create_note` and `increment_balance`, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and public storage read and write. This benchmark source data is available in JSON format on S3 [here](https://aztec-ci-artifacts.s3.us-east-2.amazonaws.com/benchmarks-v1/pulls/7271.json). ### Proof generation Each column represents the number of threads used in proof generation. | Metric | 1 threads | 4 threads | 16 threads | 32 threads | 64 threads | | - | - | - | - | - | - | proof_construction_time_sha256_ms | 5,719 (-1%) | 1,550 (-1%) | 708 | 762 | 774 | proof_construction_time_sha256_30_ms | 11,933 (+1%) | 3,151 (-1%) | 1,408 | 1,427 (+1%) | 1,470 | proof_construction_time_sha256_100_ms | 43,889 | 11,794 | 5,459 (+1%) | 5,429 | 5,359 | proof_construction_time_poseidon_hash_ms | 78.0 (-1%) | 34.0 | 34.0 | 58.0 | 88.0 (+1%) | proof_construction_time_poseidon_hash_30_ms | 1,517 (-1%) | 416 | 201 (+1%) | 231 (+1%) | 268 (+1%) | proof_construction_time_poseidon_hash_100_ms | 5,736 (-1%) | 1,563 (-1%) | 718 | 776 (+2%) | 788 (-1%) | ### L2 block published to L1 Each column represents the number of txs on an L2 block published to L1. | Metric | 4 txs | 8 txs | 16 txs | | - | - | - | - | l1_rollup_calldata_size_in_bytes | 1,412 | 1,412 | 1,412 | l1_rollup_calldata_gas | 9,476 | 9,466 | 9,476 | l1_rollup_execution_gas | 611,215 | 611,356 | 611,517 | l2_block_processing_time_in_ms | 749 (-1%) | 1,405 | 2,714 (-1%) | l2_block_building_time_in_ms | 18,979 | 37,707 (+1%) | 74,011 | l2_block_rollup_simulation_time_in_ms | 18,979 | 37,707 (+1%) | 74,011 | l2_block_public_tx_process_time_in_ms | 16,191 | 34,750 (+1%) | 71,067 | ### L2 chain processing Each column represents the number of blocks on the L2 chain where each block has 8 txs. | Metric | 3 blocks | 5 blocks | | - | - | - | node_history_sync_time_in_ms | 7,003 | 9,849 | node_database_size_in_bytes | 12,423,248 | 16,195,664 | pxe_database_size_in_bytes | 16,254 | 26,813 | ### Circuits stats Stats on running time and I/O sizes collected for every kernel circuit run across all benchmarks. | Circuit | simulation_time_in_ms | witness_generation_time_in_ms | proving_time_in_ms | input_size_in_bytes | output_size_in_bytes | proof_size_in_bytes | num_public_inputs | size_in_gates | | - | - | - | - | - | - | - | - | - | private-kernel-init | 110 (+1%) | 399 (-3%) | 13,607 (-5%) | 19,990 | 55,022 | 74,432 | 2,259 | 524,288 | private-kernel-inner | 221 | :warning: 844 (**+20%**) | 30,955 (+2%) | 82,090 | 55,022 | 74,432 | 2,259 | 1,048,576 | private-kernel-tail | 1,058 | 2,502 (-1%) | 54,392 (+1%) | 62,345 | 62,057 | 14,912 | 399 | 2,097,152 | base-parity | 6.16 | 1,319 (+1%) | 2,639 | 128 | 64.0 | 2,208 | 2.00 | 131,072 | root-parity | 48.6 | 69.2 (+3%) | 38,839 (-2%) | 27,100 | 64.0 | 2,720 | 18.0 | 2,097,152 | base-rollup | 5,951 (+1%) | 4,781 (-2%) | 92,021 | 170,330 | 728 | 3,648 | 47.0 | 4,194,304 | root-rollup | 110 | 81.8 (+1%) | 23,055 (-4%) | 25,253 | 620 | 3,456 | 41.0 | 1,048,576 | public-kernel-setup | 433 (-1%) | 2,399 | 42,257 | 102,121 | 80,278 | 106,912 | 3,274 | 2,097,152 | public-kernel-app-logic | 397 (+1%) | 3,386 (+1%) | 41,388 (-6%) | 102,121 | 80,278 | 106,912 | 3,274 | 2,097,152 | public-kernel-tail | 1,136 | 26,861 (-1%) | 180,281 | 399,014 | 10,014 | 14,912 | 399 | 8,388,608 | private-kernel-reset-small | 291 | 1,159 (-13%) | 30,106 (+14%) | 79,209 | 55,022 | 74,432 | 2,259 | 1,048,576 | public-kernel-teardown | 379 (-1%) | 3,386 (+1%) | 43,395 (-1%) | 102,121 | 80,278 | 106,912 | 3,274 | 2,097,152 | merge-rollup | 29.1 (-1%) | N/A | N/A | 16,486 | 728 | N/A | N/A | N/A | private-kernel-tail-to-public | N/A | 8,533 | 58,809 (-3%) | N/A | N/A | 106,912 | 3,274 | 2,097,152 | Stats on running time collected for app circuits | Function | input_size_in_bytes | output_size_in_bytes | witness_generation_time_in_ms | proof_size_in_bytes | proving_time_in_ms | size_in_gates | num_public_inputs | | - | - | - | - | - | - | - | - | ContractClassRegisterer:register | 1,344 | 9,364 | 396 (+1%) | N/A | N/A | N/A | N/A | ContractInstanceDeployer:deploy | 1,408 | 9,364 | 24.6 | N/A | N/A | N/A | N/A | MultiCallEntrypoint:entrypoint | 1,920 | 9,364 | 648 (-1%) | N/A | N/A | N/A | N/A | GasToken:deploy | 1,376 | 9,364 | 549 | N/A | N/A | N/A | N/A | SchnorrAccount:constructor | 1,312 | 9,364 | 478 (-2%) | N/A | N/A | N/A | N/A | SchnorrAccount:entrypoint | 2,304 | 9,364 | 790 (-4%) | 16,000 | :warning: 6,384 (**+34%**) | 131,072 | 433 | Token:privately_mint_private_note | 1,280 | 9,364 | 530 (-1%) | N/A | N/A | N/A | N/A | FPC:fee_entrypoint_public | 1,344 | 9,364 | 94.0 (+1%) | 16,000 | 1,750 (+5%) | 65,536 | 433 | Token:transfer | 1,312 | 9,364 | :warning: 1,126 (**-31%**) | 16,000 | :warning: 6,141 (**-47%**) | :warning: 262,144 (**-50%**) | 433 | AuthRegistry:set_authorized (avm) | 19,226 | N/A | N/A | 91,264 | 1,353 (-3%) | N/A | N/A | FPC:prepare_fee (avm) | 26,668 | N/A | N/A | 91,328 | 2,811 (-3%) | N/A | N/A | Token:transfer_public (avm) | 42,918 | N/A | N/A | 91,328 | 3,895 | N/A | N/A | AuthRegistry:consume (avm) | 33,104 | N/A | N/A | 91,264 | 2,777 (-5%) | N/A | N/A | FPC:pay_refund (avm) | 36,833 | N/A | N/A | 91,296 | 23,537 (+1%) | N/A | N/A | Benchmarking:create_note | 1,344 | 9,364 | 464 | N/A | N/A | N/A | N/A | SchnorrAccount:verify_private_authwit | 1,280 | 9,364 | 41.3 | N/A | N/A | N/A | N/A | Token:unshield | 1,376 | 9,364 | 1,311 | N/A | N/A | N/A | N/A | FPC:fee_entrypoint_private | 1,376 | 9,364 | 1,705 (-6%) | N/A | N/A | N/A | N/A | ### AVM Simulation Time to simulate various public functions in the AVM. | Function | time_ms | bytecode_size_in_bytes | | - | - | - | GasToken:_increase_public_balance | 67.5 (-8%) | 13,790 | GasToken:set_portal | 14.0 (-6%) | 3,339 | Token:constructor | 93.3 (+2%) | 23,692 | FPC:constructor | 61.5 | 13,592 | GasToken:mint_public | 48.7 (-3%) | 10,158 | Token:mint_public | :warning: 47.4 (**-39%**) | 19,034 | Token:assert_minter_and_mint | 130 (-12%) | 12,925 | AuthRegistry:set_authorized | 27.4 (-1%) | 7,812 | FPC:prepare_fee | 105 (-3%) | 15,062 | Token:transfer_public | 35.5 (+23%) | 31,218 | FPC:pay_refund | 135 (+5%) | 25,260 | Benchmarking:increment_balance | 1,985 (+1%) | 15,267 | Token:_increase_public_balance | 14.9 | 15,006 | FPC:pay_refund_with_shielded_rebate | 154 (+2%) | 26,347 | ### Public DB Access Time to access various public DBs. | Function | time_ms | | - | - | get-nullifier-index | 0.154 | ### Tree insertion stats The duration to insert a fixed batch of leaves into each tree type. | Metric | 1 leaves | 16 leaves | 64 leaves | 128 leaves | 256 leaves | 512 leaves | 1024 leaves | | - | - | - | - | - | - | - | - | batch_insert_into_append_only_tree_16_depth_ms | 10.4 | 16.7 | N/A | N/A | N/A | N/A | N/A | batch_insert_into_append_only_tree_16_depth_hash_count | 16.8 | 31.7 | N/A | N/A | N/A | N/A | N/A | batch_insert_into_append_only_tree_16_depth_hash_ms | 0.601 | 0.513 | N/A | N/A | N/A | N/A | N/A | batch_insert_into_append_only_tree_32_depth_ms | N/A | N/A | 48.0 | 75.4 | 130 (-2%) | 244 | 469 (-1%) | batch_insert_into_append_only_tree_32_depth_hash_count | N/A | N/A | 95.9 | 159 | 287 | 543 | 1,055 | batch_insert_into_append_only_tree_32_depth_hash_ms | N/A | N/A | 0.491 | 0.464 | 0.448 (-1%) | 0.442 | 0.438 (-1%) | batch_insert_into_indexed_tree_20_depth_ms | N/A | N/A | 59.4 | 111 (-1%) | 181 (-2%) | 352 | 691 (-1%) | batch_insert_into_indexed_tree_20_depth_hash_count | N/A | N/A | 109 | 207 | 355 | 691 | 1,363 | batch_insert_into_indexed_tree_20_depth_hash_ms | N/A | N/A | 0.502 | 0.498 | 0.481 (-1%) | 0.477 | 0.474 (-1%) | batch_insert_into_indexed_tree_40_depth_ms | N/A | N/A | 72.5 | N/A | N/A | N/A | N/A | batch_insert_into_indexed_tree_40_depth_hash_count | N/A | N/A | 133 | N/A | N/A | N/A | N/A | batch_insert_into_indexed_tree_40_depth_hash_ms | N/A | N/A | 0.516 | N/A | N/A | N/A | N/A | ### Miscellaneous Transaction sizes based on how many contract classes are registered in the tx. | Metric | 0 registered classes | 1 registered classes | | - | - | - | tx_size_in_bytes | 74,057 | 667,841 | Transaction size based on fee payment method | Metric | | | - | |
nventuro commented 2 months ago

I'm splitting this into a couple smaller PRs since there's quite a few things going on at the same time. We''ll still do this, but I want to keep the unrelated changes separate if at all possible. Created https://github.com/AztecProtocol/aztec-packages/pull/7621 for zero note retrieval (which by itself has a bunch of repercussions).

nventuro commented 2 months ago

Superceded by https://github.com/AztecProtocol/aztec-packages/pull/7730