Parallel synthesis of sub-circuits

kunxian-xia commented 1 year ago

Describe the feature you would like

We are currently parallelizing the witness assignment of the following sub-circuits:

- [x] bytecode https://github.com/scroll-tech/zkevm-circuits/pull/530
- [x] poseidon https://github.com/scroll-tech/zkevm-circuits/pull/867
  - [x] https://github.com/scroll-tech/poseidon-circuit/pull/27
  - [x] https://github.com/scroll-tech/poseidon-circuit/pull/24
- [x] state https://github.com/scroll-tech/zkevm-circuits/pull/545
- [ ] mpt
- [ ] evm
- [ ] halo2-lib related
  - [ ] sig
  - [ ] ecc
  - [ ] compression
  - [ ] aggregation
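
For readers unfamiliar with the approach, here is a minimal sketch of what parallelizing witness assignment can look like: the rows of a column are split into disjoint chunks and each chunk is filled by its own thread. This is not the code from the linked PRs. `assign_row`, `assign_parallel` and `assign_sequential` are hypothetical names, and plain `u64` values stand in for field elements.

```rust
use std::thread;

/// Hypothetical stand-in for computing the witness value of one row.
/// A real sub-circuit would compute field elements from the trace here.
fn assign_row(row: usize) -> u64 {
    (row as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(17)
}

/// Baseline: fill the column one row after another.
fn assign_sequential(num_rows: usize) -> Vec<u64> {
    (0..num_rows).map(assign_row).collect()
}

/// Fill the column with up to `num_threads` workers. Each worker owns a
/// disjoint, contiguous chunk of the output, so no locking is needed.
fn assign_parallel(num_rows: usize, num_threads: usize) -> Vec<u64> {
    let num_threads = num_threads.max(1);
    let mut column = vec![0u64; num_rows];
    // Ceiling division; at least 1 because `chunks_mut(0)` would panic.
    let chunk_len = ((num_rows + num_threads - 1) / num_threads).max(1);
    thread::scope(|s| {
        for (chunk_idx, chunk) in column.chunks_mut(chunk_len).enumerate() {
            s.spawn(move || {
                let offset = chunk_idx * chunk_len;
                for (i, cell) in chunk.iter_mut().enumerate() {
                    *cell = assign_row(offset + i);
                }
            });
        }
    });
    column
}

fn main() {
    let seq = assign_sequential(10_000);
    let par = assign_parallel(10_000, 8);
    assert_eq!(seq, par);
    println!("parallel and sequential assignment agree on {} rows", par.len());
}
```

The sketch only works because each row can be computed independently of the others. Sub-circuits whose rows depend on a running state would first need that state precomputed at the chunk boundaries.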

Additional context

No response

kunxian-xia commented 1 year ago

We will maintain a new branch in which all these optimizations are enabled.

kunxian-xia commented 1 year ago

| sub-circuit | sequential time | parallelized time |
| --- | --- | --- |
| bytecode | 60s | 3s |
| poseidon | 158s | 1.8s |
| state | 170s | 7s |
| ecdsa | | |
| evm | | |

Velaciela commented 1 year ago

machine: AWS g5.12xlarge | 48 vCPUs | 192GB RAM
benchmark branch:
- scroll-zkevm/realtrace0505 (commit: 570fa72)
- zkevm-circuits/parallel-syn-dev (commit: 8313a68)

multi-phase witness generation: 285s (previous: ~700s)

    Initialization: 15s
    phase_timer[0]: 121.979618499s
        advice vec init: 16.976s
        witness assignment: 48.383s
        batch invert witness assignment: 40.595s
        MSM: 11.926s
        NTT: 2.851s
        split & update: 4s
    phase_timer[1]: 74.267558757s
        advice vec init: 17.046s
        witness assignment: 51.739s
        batch invert witness assignment: 4.510s
        misc: 2s
    phase_timer[2]: 73.66961104s
        advice vec init: 16.826s
        witness assignment: 52.085s
        batch invert witness assignment: 3.605s
        misc: 1s

After optimizing the most time-consuming parts in https://github.com/scroll-tech/halo2-gpu/pull/66:

- parallel initialization of advice columns
- direct batch-invert assignment
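
As background for the batch-invert item, here is a minimal sketch of batch inversion (Montgomery's trick), which replaces n field inversions with one inversion plus about 3n multiplications. It does not reproduce the change in the PR above. The toy modulus `P` (2^61 - 1), the helper names, and the chunked `batch_invert_parallel` variant are all assumptions made for illustration; the real prover works over a much larger field.

```rust
use std::thread;

/// Toy prime modulus (2^61 - 1), chosen only so the sketch fits in u64/u128.
const P: u64 = (1 << 61) - 1;

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Single inversion via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    let (mut base, mut exp, mut acc) = (a, P - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Invert every nonzero element in place using one field inversion.
/// Inputs must already be reduced mod P; zeros are left as zeros.
fn batch_invert(values: &mut [u64]) {
    // prefix[i] = product of the nonzero values strictly before index i.
    let mut prefix = Vec::with_capacity(values.len());
    let mut acc = 1u64;
    for &v in values.iter() {
        prefix.push(acc);
        if v != 0 {
            acc = mul(acc, v);
        }
    }
    // Invert the full product once, then peel off one factor per step.
    let mut inv_acc = inv(acc);
    for (v, p) in values.iter_mut().zip(prefix.iter()).rev() {
        if *v != 0 {
            let original = *v;
            *v = mul(inv_acc, *p);
            inv_acc = mul(inv_acc, original);
        }
    }
}

/// Run an independent batch inversion on each chunk in its own thread.
fn batch_invert_parallel(values: &mut [u64], num_threads: usize) {
    let n = num_threads.max(1);
    let chunk_len = ((values.len() + n - 1) / n).max(1);
    thread::scope(|s| {
        for chunk in values.chunks_mut(chunk_len) {
            s.spawn(move || batch_invert(chunk));
        }
    });
}

fn main() {
    let mut values: Vec<u64> = vec![3, 0, 7, 123_456_789, 1];
    let originals = values.clone();
    batch_invert_parallel(&mut values, 2);
    for (v, o) in values.iter().zip(&originals) {
        if *o == 0 {
            assert_eq!(*v, 0);
        } else {
            assert_eq!(mul(*v, *o), 1);
        }
    }
    println!("all inverses verified");
}
```

Chunking costs one extra inversion per chunk, which is negligible when each chunk holds many elements.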

multi-phase witness generation: 285s >>> ~210s

    Initialization: 15s
    phase_timer[0]: 121.98s >>> 77.98s
        advice vec init: 16.976s >>> 1.5s
        batch invert witness assignment: 40.595s >>> 11s
        MSM: 11.926s
        NTT: 2.851s
        split & update: 4s
    phase_timer[1]: 74.27s >>> 59.26s
        advice vec init: 17.046s >>> 1.5s
        witness assignment: 51.739s
        batch invert witness assignment: 4.510s
        misc: 2s
    phase_timer[2]: 73.67s >>> 58.67s
        advice vec init: 16.826s >>> 1.5s
        witness assignment: 52.085s
        batch invert witness assignment: 3.605s
        misc: 1s
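
As a quick sanity check on the headline numbers, the initialization time plus the three phase timers should add up to the reported totals. The sketch below only re-adds figures quoted in this thread, rounded to two decimals; the helper name `total` is made up for the example.

```rust
/// Total wall time: initialization plus the sum of all phase timers (seconds).
fn total(init: f64, phases: &[f64]) -> f64 {
    init + phases.iter().sum::<f64>()
}

fn main() {
    // Phase timers before and after the halo2-gpu optimizations.
    let before = total(15.0, &[121.98, 74.27, 73.67]);
    let after = total(15.0, &[77.98, 59.26, 58.67]);
    println!(
        "before: {before:.2}s, after: {after:.2}s, saved: {:.2}s",
        before - after
    );
}
```

The sums come out at about 284.9s and 210.9s, matching the quoted 285s and ~210s.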

Breakdown of a single phase's witness assignment (~50s):

    evm_circuit(synthesize_sub): 27.457635586s
    mpt_circuit(synthesize_sub): 7.616027465s
    tx_circuit(synthesize_sub): 4.116668735s
    copy_circuit(synthesize_sub): 3.657522537s
    state_circuit(synthesize_sub): 2.739736719s
    keccak_circuit(synthesize_sub): 2.381289647s
    bytecode_circuit(synthesize_sub): 2.024037224s
    poseidon_circuit(synthesize_sub): 1.002937647s
    rlp_circuit(synthesize_sub): 349.832865ms
    exp_circuit(synthesize_sub): 199.796709ms
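
The ten timings above add up to roughly 51.5s, close to the ~50s witness assignment per phase. That is consistent with the sub-circuits being synthesized one after another, though this is an inference from the numbers and not something stated in the thread. Under that assumption, the sketch below compares the sequential total with the best case for running all sub-circuits concurrently, which is bounded below by the slowest one. The function names are hypothetical.

```rust
/// Per-sub-circuit `synthesize_sub` times in seconds, copied from the log above.
const SUB_CIRCUIT_SECONDS: [(&str, f64); 10] = [
    ("evm", 27.457635586),
    ("mpt", 7.616027465),
    ("tx", 4.116668735),
    ("copy", 3.657522537),
    ("state", 2.739736719),
    ("keccak", 2.381289647),
    ("bytecode", 2.024037224),
    ("poseidon", 1.002937647),
    ("rlp", 0.349832865),
    ("exp", 0.199796709),
];

/// Time if the sub-circuits run one after another.
fn sequential_total(times: &[(&str, f64)]) -> f64 {
    times.iter().map(|(_, t)| t).sum()
}

/// Best case if all sub-circuits run concurrently: the slowest one dominates.
fn parallel_lower_bound(times: &[(&str, f64)]) -> f64 {
    times.iter().map(|(_, t)| *t).fold(0.0, f64::max)
}

fn main() {
    let seq = sequential_total(&SUB_CIRCUIT_SECONDS);
    let par = parallel_lower_bound(&SUB_CIRCUIT_SECONDS);
    println!("sequential: {seq:.2}s, parallel lower bound: {par:.2}s");
    println!("ideal speedup from sub-circuit parallelism alone: {:.2}x", seq / par);
}
```

Since evm_circuit alone accounts for more than half of the total, running sub-circuits side by side cannot give much more than a 2x gain, so parallelizing the work inside evm_circuit looks like the larger win.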

scroll-tech / zkevm-circuits

Parallel synthesis of sub-circuits #552