AztecProtocol / aztec-packages

Apache License 2.0
refactor: Optimize private call stack item hash for gate count #7285

Closed LHerskind closed 14 hours ago

LHerskind commented 3 days ago

Fixes #7092.

In particular, note the costs around entrypoints 👀: a 50% reduction, cutting away 1 million constraints.


Tests passed in CI earlier, but I then updated a big stack, which seems to have killed CI because it could not run 4 branches for me at the same time :)

AztecBot commented 3 days ago

Benchmark results

Metrics with a significant change:

Detailed results

All benchmarks are run on txs on the `Benchmarking` contract on the repository. Each tx consists of a batch call to `create_note` and `increment_balance`, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and public storage read and write.

This benchmark source data is available in JSON format on S3 [here](https://aztec-ci-artifacts.s3.us-east-2.amazonaws.com/benchmarks-v1/pulls/7285.json).

### Proof generation

Each column represents the number of threads used in proof generation.

| Metric | 1 threads | 4 threads | 16 threads | 32 threads | 64 threads |
| - | - | - | - | - | - |
| proof_construction_time_sha256_ms | 5,532 (-3%) | 1,509 (-2%) | 702 | 729 (-3%) | 765 (-1%) |
| proof_construction_time_sha256_30_ms | 11,374 (-3%) | 3,297 (+5%) | 1,400 | 1,432 | 1,465 |
| proof_construction_time_sha256_100_ms | 42,393 (-3%) | 11,474 (-2%) | 5,397 | 5,410 (+1%) | 5,328 |
| proof_construction_time_poseidon_hash_ms | 78.0 | 34.0 | 34.0 | 57.0 | 88.0 |
| proof_construction_time_poseidon_hash_30_ms | 1,465 (-3%) | 415 | 199 | 228 (+3%) | 264 |
| proof_construction_time_poseidon_hash_100_ms | 5,552 (-3%) | 1,514 (-3%) | 718 | 755 (-2%) | 792 |

### L2 block published to L1

Each column represents the number of txs on an L2 block published to L1.

| Metric | 4 txs | 8 txs | 16 txs |
| - | - | - | - |
| l1_rollup_calldata_size_in_bytes | 1,412 | 1,412 | 1,412 |
| l1_rollup_calldata_gas | 9,464 | 9,472 | 9,464 |
| l1_rollup_execution_gas | 611,203 | 611,362 | 611,505 |
| l2_block_processing_time_in_ms | 746 (-2%) | 1,410 | 2,649 (-3%) |
| l2_block_building_time_in_ms | 20,153 (-3%) | 41,218 (-1%) | 78,575 (-4%) |
| l2_block_rollup_simulation_time_in_ms | 20,152 (-3%) | 41,218 (-1%) | 78,574 (-4%) |
| l2_block_public_tx_process_time_in_ms | 17,203 (-3%) | 38,036 (-2%) | 75,496 (-4%) |

### L2 chain processing

Each column represents the number of blocks on the L2 chain where each block has 8 txs.

| Metric | 3 blocks | 5 blocks |
| - | - | - |
| node_history_sync_time_in_ms | 7,036 | 9,915 (+1%) |
| node_database_size_in_bytes | 12,488,784 | 16,175,184 |
| pxe_database_size_in_bytes | 16,254 | 26,813 |

### Circuits stats

Stats on running time and I/O sizes collected for every kernel circuit run across all benchmarks.

| Circuit | simulation_time_in_ms | witness_generation_time_in_ms | proving_time_in_ms | input_size_in_bytes | output_size_in_bytes | proof_size_in_bytes | num_public_inputs | size_in_gates |
| - | - | - | - | - | - | - | - | - |
| private-kernel-init | 108 (+8%) | 411 (+9%) | 12,494 (-4%) | 19,990 (+3%) | 55,022 (+2%) | 74,432 (+4%) | 2,259 (+4%) | 524,288 |
| private-kernel-inner | :warning: 219 (**-28%**) | 790 (+7%) | :warning: 27,808 (**-43%**) | 82,090 (+2%) | 55,022 (+2%) | 74,432 (+4%) | 2,259 (+4%) | :warning: 1,048,576 (**-50%**) |
| private-kernel-tail | 1,035 (-1%) | 2,559 (+5%) | 55,768 (+15%) | 62,345 (+2%) | 62,057 | 14,912 | 399 | 2,097,152 |
| base-parity | 6.14 (-3%) | 1,549 (+3%) | 2,653 | 128 | 64.0 | 2,208 | 2.00 | 131,072 |
| root-parity | 48.2 (-2%) | 67.8 | 41,357 (-2%) | 27,100 | 64.0 | 2,720 | 18.0 | 2,097,152 |
| base-rollup | 6,427 (-2%) | 4,977 (+4%) | 93,966 (+5%) | 170,330 | 728 | 3,648 | 47.0 | 4,194,304 |
| root-rollup | 112 (+2%) | 85.1 (+3%) | 24,401 (+2%) | 25,253 | 620 | 3,456 | 41.0 | 1,048,576 |
| public-kernel-setup | 534 (-2%) | 2,447 (+3%) | 42,306 (-1%) | 102,121 | 80,278 | 106,912 | 3,274 | 2,097,152 |
| public-kernel-app-logic | 494 (-2%) | 3,355 (+3%) | 46,162 (+7%) | 102,121 | 80,278 | 106,912 | 3,274 | 2,097,152 |
| public-kernel-tail | 1,121 (-2%) | 27,129 (+1%) | 181,650 (+2%) | 399,014 | 10,014 | 14,912 | 399 | 8,388,608 |
| private-kernel-reset-small | 289 (+1%) | 1,185 (+10%) | 30,780 (+1%) | 79,209 (+1%) | 55,022 (+2%) | 74,432 (+4%) | 2,259 (+4%) | 1,048,576 |
| public-kernel-teardown | 485 (-2%) | 3,428 (+5%) | 44,944 (+5%) | 102,121 | 80,278 | 106,912 | 3,274 | 2,097,152 |
| merge-rollup | 29.0 (-1%) | N/A | N/A | 16,486 | 728 | N/A | N/A | N/A |
| private-kernel-tail-to-public | N/A | 8,993 (+7%) | 58,163 (+14%) | N/A | N/A | 106,912 | 3,274 | 2,097,152 |

Stats on running time collected for app circuits

| Function | input_size_in_bytes | output_size_in_bytes | witness_generation_time_in_ms | proof_size_in_bytes | proving_time_in_ms | size_in_gates | num_public_inputs |
| - | - | - | - | - | - | - | - |
| ContractClassRegisterer:register | 1,344 | 9,364 (+7%) | 390 (-1%) | N/A | N/A | N/A | N/A |
| ContractInstanceDeployer:deploy | 1,408 | 9,364 (+7%) | 24.3 (+1%) | N/A | N/A | N/A | N/A |
| MultiCallEntrypoint:entrypoint | 1,920 | 9,364 (+7%) | :warning: 920 (**-20%**) | N/A | N/A | N/A | N/A |
| GasToken:deploy | 1,376 | 9,364 (+7%) | :warning: 689 (**-16%**) | N/A | N/A | N/A | N/A |
| SchnorrAccount:constructor | 1,312 | 9,364 (+7%) | 469 (-1%) | N/A | N/A | N/A | N/A |
| SchnorrAccount:entrypoint | 2,304 | 9,364 (+7%) | :warning: 1,275 (**-21%**) | 16,000 (+9%) | :warning: 26,760 (**-49%**) | :warning: 1,048,576 (**-50%**) | 433 (+10%) |
| Token:privately_mint_private_note | 1,280 | 9,364 (+7%) | 654 (+8%) | N/A | N/A | N/A | N/A |
| FPC:fee_entrypoint_public | 1,344 | 9,364 (+7%) | 232 (-7%) | 16,000 (+9%) | 10,957 (-2%) | 524,288 | 433 (+10%) |
| Token:transfer | 1,312 | 9,364 (+7%) | 1,700 (-5%) | 16,000 (+9%) | 12,587 | 524,288 | 433 (+10%) |
| AuthRegistry:set_authorized (avm) | 19,226 | N/A | N/A | 91,264 | 1,266 (-9%) | N/A | N/A |
| FPC:prepare_fee (avm) | 26,668 | N/A | N/A | 91,328 | 2,670 (-7%) | N/A | N/A |
| Token:transfer_public (avm) | 42,918 | N/A | N/A | 91,328 | 3,722 (-5%) | N/A | N/A |
| AuthRegistry:consume (avm) | 33,104 | N/A | N/A | 91,264 | 2,773 (-6%) | N/A | N/A |
| FPC:pay_refund (avm) | 36,833 | N/A | N/A | 91,296 | 23,721 (+2%) | N/A | N/A |
| Benchmarking:create_note | 1,344 | 9,364 (+7%) | 479 (+3%) | N/A | N/A | N/A | N/A |
| SchnorrAccount:verify_private_authwit | 1,280 | 9,364 (+7%) | 41.2 (+2%) | N/A | N/A | N/A | N/A |
| Token:unshield | 1,376 | 9,364 (+7%) | 1,392 (-5%) | N/A | N/A | N/A | N/A |
| FPC:fee_entrypoint_private | 1,376 | 9,364 (+7%) | 1,854 (-7%) | N/A | N/A | N/A | N/A |

### AVM Simulation

Time to simulate various public functions in the AVM.

| Function | time_ms | bytecode_size_in_bytes |
| - | - | - |
| GasToken:_increase_public_balance | 66.3 (-8%) | 13,790 |
| GasToken:set_portal | 13.0 (+6%) | 3,339 |
| Token:constructor | 91.1 (-3%) | 23,692 |
| FPC:constructor | 63.2 (+23%) | 13,592 |
| GasToken:mint_public | 49.5 (-1%) | 10,158 |
| Token:mint_public | :warning: 487 (**+953%**) | 19,034 |
| Token:assert_minter_and_mint | :warning: 54.7 (**-70%**) | 12,925 |
| AuthRegistry:set_authorized | 31.5 (+3%) | 7,812 |
| FPC:prepare_fee | 110 (-4%) | 15,062 |
| Token:transfer_public | 31.0 (-13%) | 31,218 |
| FPC:pay_refund | 132 (-3%) | 25,260 |
| Benchmarking:increment_balance | 2,152 (-2%) | 15,267 |
| Token:_increase_public_balance | 62.5 (+8%) | 15,006 |
| FPC:pay_refund_with_shielded_rebate | 124 (+2%) | 26,347 |

### Public DB Access

Time to access various public DBs.

| Function | time_ms |
| - | - |
| get-nullifier-index | 0.155 (-3%) |

### Tree insertion stats

The duration to insert a fixed batch of leaves into each tree type.

| Metric | 1 leaves | 16 leaves | 64 leaves | 128 leaves | 256 leaves | 512 leaves | 1024 leaves |
| - | - | - | - | - | - | - | - |
| batch_insert_into_append_only_tree_16_depth_ms | 10.3 (-1%) | 16.6 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_count | 16.8 | 31.7 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_ms | 0.596 | 0.510 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_32_depth_ms | N/A | N/A | 47.7 (-1%) | 75.1 (-1%) | 128 (-4%) | 244 | 458 (-3%) |
| batch_insert_into_append_only_tree_32_depth_hash_count | N/A | N/A | 95.9 | 159 | 287 | 543 | 1,055 |
| batch_insert_into_append_only_tree_32_depth_hash_ms | N/A | N/A | 0.487 (-1%) | 0.462 (-1%) | 0.440 (-3%) | 0.443 | 0.427 (-3%) |
| batch_insert_into_indexed_tree_20_depth_ms | N/A | N/A | 58.6 (-1%) | 110 (-2%) | 177 (-4%) | 353 | 673 (-3%) |
| batch_insert_into_indexed_tree_20_depth_hash_count | N/A | N/A | 109 | 207 | 355 | 691 | 1,363 |
| batch_insert_into_indexed_tree_20_depth_hash_ms | N/A | N/A | 0.495 (-1%) | 0.493 (-1%) | 0.470 (-4%) | 0.477 | 0.461 (-3%) |
| batch_insert_into_indexed_tree_40_depth_ms | N/A | N/A | 72.0 (-1%) | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_count | N/A | N/A | 133 | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_ms | N/A | N/A | 0.512 (-1%) | N/A | N/A | N/A | N/A |

### Miscellaneous

Transaction sizes based on how many contract classes are registered in the tx.

| Metric | 0 registered classes | 1 registered classes |
| - | - | - |
| tx_size_in_bytes | 74,057 | 667,841 |

Transaction size based on fee payment method

| Metric | |
| - | - |

github-actions[bot] commented 2 days ago

Changes to circuit sizes

Generated at commit: fe6c86b653647037061ae4b5018eaa700027fd16, compared to commit: e431b6f64937e8fba357f890bdc3042adf17cb44

🧾 Summary (100% most significant diffs)

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|:-|-:|-:|-:|-:|
| private_kernel_init_simulated | 0 ➖ | 0.00% | +80 ❌ | +3.69% |
| private_kernel_inner_simulated | 0 ➖ | 0.00% | +80 ❌ | +3.69% |
| private_kernel_reset_simulated | 0 ➖ | 0.00% | +80 ❌ | +3.69% |
| private_kernel_reset_simulated_big | 0 ➖ | 0.00% | +80 ❌ | +3.69% |
| private_kernel_reset_simulated_medium | 0 ➖ | 0.00% | +80 ❌ | +3.69% |
| private_kernel_reset_simulated_small | 0 ➖ | 0.00% | +80 ❌ | +3.69% |
| private_kernel_init | +1,240 ❌ | +5.04% | +7,845 ❌ | +2.10% |
| private_kernel_reset_small | +128 ❌ | +0.21% | +12,887 ❌ | +1.58% |
| private_kernel_reset_medium | +128 ❌ | +0.18% | +12,887 ❌ | +1.37% |
| private_kernel_reset_big | +128 ❌ | +0.14% | +12,887 ❌ | +1.09% |
| private_kernel_tail | +197 ❌ | +0.93% | +13,057 ❌ | +1.06% |
| private_kernel_tail_to_public | +197 ❌ | +0.04% | +13,057 ❌ | +0.78% |
| private_kernel_reset | +128 ❌ | +0.10% | +12,887 ❌ | +0.77% |
| private_kernel_inner | +1,840 ❌ | +4.30% | -101,731 ✅ | -8.86% |

Full diff report 👇

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|:-|-:|-:|-:|-:|
| **private_kernel_init_simulated** | 1 (0) | **0.00%** | 2,248 (+80) | **+3.69%** |
| **private_kernel_inner_simulated** | 1 (0) | **0.00%** | 2,248 (+80) | **+3.69%** |
| **private_kernel_reset_simulated** | 1 (0) | **0.00%** | 2,248 (+80) | **+3.69%** |
| **private_kernel_reset_simulated_big** | 1 (0) | **0.00%** | 2,248 (+80) | **+3.69%** |
| **private_kernel_reset_simulated_medium** | 1 (0) | **0.00%** | 2,248 (+80) | **+3.69%** |
| **private_kernel_reset_simulated_small** | 1 (0) | **0.00%** | 2,248 (+80) | **+3.69%** |
| **private_kernel_init** | 25,831 (+1,240) | **+5.04%** | 380,757 (+7,845) | **+2.10%** |
| **private_kernel_reset_small** | 62,343 (+128) | **+0.21%** | 828,946 (+12,887) | **+1.58%** |
| **private_kernel_reset_medium** | 72,338 (+128) | **+0.18%** | 952,765 (+12,887) | **+1.37%** |
| **private_kernel_reset_big** | 92,327 (+128) | **+0.14%** | 1,200,482 (+12,887) | **+1.09%** |
| **private_kernel_tail** | 21,377 (+197) | **+0.93%** | 1,246,443 (+13,057) | **+1.06%** |
| **private_kernel_tail_to_public** | 444,595 (+197) | **+0.04%** | 1,679,771 (+13,057) | **+0.78%** |
| **private_kernel_reset** | 132,303 (+128) | **+0.10%** | 1,695,914 (+12,887) | **+0.77%** |
| **private_kernel_inner** | 44,605 (+1,840) | **+4.30%** | 1,045,990 (-101,731) | **-8.86%** |
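
The diff report shows `private_kernel_inner` shrinking by 8.86%, while the benchmark table reports a 50% drop in `size_in_gates` for `private-kernel-inner`. One way to reconcile the two, sketched below, is to assume that `size_in_gates` is the circuit size rounded up to the next power of two. That is an assumption: it is consistent with every `size_in_gates` value in the benchmark being a power of two, but it is not stated anywhere in this thread. The helper names `percent_change` and `padded_size` are hypothetical and exist only for this illustration.

```python
def percent_change(current: int, delta: int) -> float:
    """Relative change in percent, measured against the size before the change."""
    previous = current - delta
    return 100.0 * delta / previous


def padded_size(gates: int) -> int:
    """Smallest power of two that is >= gates (ASSUMED padding rule)."""
    return 1 << (gates - 1).bit_length()


# Figures for private_kernel_inner, taken from the diff report above.
current, delta = 1_045_990, -101_731
previous = current - delta  # circuit size before this PR

print(previous)                                   # 1147721
print(round(percent_change(current, delta), 2))   # -8.86
print(padded_size(previous), padded_size(current))  # 2097152 1048576
```

Under that assumption, the previous circuit size sat just above 2^20 and was padded to 2,097,152, whereas the new size fits just under 2^20 and pads to 1,048,576. That would explain how an 8.86% reduction in raw circuit size shows up as the 50% reduction highlighted in the benchmark.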