scroll-tech / zkevm-circuits

MIT License

Parallel synthesis of sub-circuits #552

Open kunxian-xia opened 1 year ago

kunxian-xia commented 1 year ago

Describe the feature you would like

We are currently parallelizing the witness assignment of the following sub-circuits:


kunxian-xia commented 1 year ago

We will maintain a new branch in which all of these optimizations are enabled.
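For readers new to the idea, here is a minimal sketch of what running sub-circuit witness assignment in parallel could look like. It is not the code from the branch: the `SubCircuit` trait, the `Dummy` type and `synthesize_parallel` are hypothetical names made up for illustration, and the "work" is a placeholder loop. It only assumes that sub-circuits can be assigned independently of each other, so each one gets its own scoped thread from the standard library.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a sub-circuit (NOT the zkevm-circuits trait).
trait SubCircuit: Sync {
    fn name(&self) -> &'static str;
    /// Pretend to assign witness cells; returns the number of rows assigned.
    fn synthesize_sub(&self) -> usize;
}

/// Hypothetical sub-circuit whose cost is proportional to `rows`.
struct Dummy {
    name: &'static str,
    rows: usize,
}

impl SubCircuit for Dummy {
    fn name(&self) -> &'static str {
        self.name
    }

    fn synthesize_sub(&self) -> usize {
        // Placeholder work standing in for real witness assignment.
        let mut acc = 0u64;
        for i in 0..self.rows as u64 {
            acc = acc.wrapping_mul(31).wrapping_add(i);
        }
        std::hint::black_box(acc);
        self.rows
    }
}

/// Run every sub-circuit on its own scoped thread and return
/// (name, rows assigned, elapsed time) in the input order.
fn synthesize_parallel(
    circuits: &[Box<dyn SubCircuit>],
) -> Vec<(&'static str, usize, Duration)> {
    thread::scope(|s| {
        // Spawn all threads first so they run concurrently.
        let handles: Vec<_> = circuits
            .iter()
            .map(|c| {
                s.spawn(move || {
                    let start = Instant::now();
                    let rows = c.synthesize_sub();
                    (c.name(), rows, start.elapsed())
                })
            })
            .collect();
        // Join in spawn order, so results line up with `circuits`.
        handles
            .into_iter()
            .map(|h| h.join().expect("sub-circuit thread panicked"))
            .collect()
    })
}
```

With one thread per sub-circuit, the wall-clock time is bounded below by the slowest sub-circuit, so a real implementation would probably also want to split the largest one into chunks.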

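kunxian-xia commented 1 year ago

| sub-circuit | sequential time | parallelized running time |
| ----------- | --------------- | ------------------------- |
| bytecode    | 60s             | 3s                        |
| poseidon    | 158s            | 1.8s                      |
| state       | 170s            | 7s                        |
| ecdsa       |                 |                           |
| evm         |                 |                           |

Velaciela commented 1 year ago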


multi-phase witness generation: 285s (previous: ~700s)

    Initialization: 15s
    phase_timer[0]: 121.979618499s
        advice vec init: 16.976s
        witness assignment: 48.383s
        batch invert witness assignment: 40.595s
        MSM: 11.926s
        NTT: 2.851s
        split & update: 4s
    phase_timer[1]: 74.267558757s
        advice vec init: 17.046s
        witness assignment: 51.739s
        batch invert witness assignment: 4.510s
        misc: 2s
    phase_timer[2]: 73.66961104s
        advice vec init: 16.826s
        witness assignment: 52.085s
        batch invert witness assignment: 3.605s
        misc: 1s
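For context on the "batch invert witness assignment" step: batch inversion is commonly done with Montgomery's trick, which replaces n field inversions with a single inversion plus a linear number of multiplications. The sketch below shows that trick over a toy prime field. The modulus `P` and the helper names are assumptions for illustration only; this is not the halo2 code and uses none of its types.

```rust
/// Toy prime modulus (an assumption for illustration; not the field halo2 uses).
const P: u64 = 1_000_000_007;

/// Multiply modulo P. Inputs must already be reduced (< P), so the
/// product fits in a u64.
fn mul(a: u64, b: u64) -> u64 {
    a * b % P
}

/// Modular exponentiation by repeated squaring.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Single inversion via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    pow(a, P - 2)
}

/// Invert every non-zero element in place with ONE call to `inv`.
/// Zeros are left unchanged. All inputs must be reduced (< P).
fn batch_invert(values: &mut [u64]) {
    // prefix[i] = product of all non-zero values before index i.
    let mut prefix = Vec::with_capacity(values.len());
    let mut acc = 1u64;
    for &v in values.iter() {
        prefix.push(acc);
        if v != 0 {
            acc = mul(acc, v);
        }
    }
    // The only expensive operation: invert the total product.
    let mut acc_inv = inv(acc);
    // Walk backwards, peeling one factor off the running inverse each step.
    for (v, p) in values.iter_mut().zip(prefix).rev() {
        if *v != 0 {
            let original = *v;
            *v = mul(acc_inv, p); // inverse of `original`
            acc_inv = mul(acc_inv, original); // drop `original` from the product
        }
    }
}
```

The forward pass and the backward pass are both sequential over one slice, so a parallel version would typically split the column into chunks and run this per chunk, at the cost of one inversion per chunk.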

After optimizing the time-consuming parts in https://github.com/scroll-tech/halo2-gpu/pull/66:

multi-phase witness generation: 285s >>> ~210s

    Initialization: 15s
    phase_timer[0]: 121.98s >>> 77.98s
        advice vec init: 16.976s >>> 1.5s
        batch invert witness assignment: 40.595s >>> 11s
        MSM: 11.926s
        NTT: 2.851s
        split & update: 4s
    phase_timer[1]: 74.27s >>> 59.26s
        advice vec init: 17.046s >>> 1.5s
        witness assignment: 51.739s
        batch invert witness assignment: 4.510s
        misc: 2s
    phase_timer[2]: 73.67s >>> 58.67s
        advice vec init: 16.826s >>> 1.5s
        witness assignment: 52.085s
        batch invert witness assignment: 3.605s
        misc: 1s
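As a quick sanity check, the initialization and per-phase timings reported in this comment (rounded to two decimals) add up to the quoted totals:

```math
\begin{aligned}
\text{before: } & 15 + 121.98 + 74.27 + 73.67 = 284.92 \approx 285\ \text{s} \\
\text{after: }  & 15 + 77.98 + 59.26 + 58.67 = 210.91 \approx 210\ \text{s}
\end{aligned}
```

That is roughly a 1.36x improvement over the 285s run, and about 3.3x over the earlier ~700s figure.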

Breakdown of a single phase's witness assignment (~50s):

    evm_circuit(synthesize_sub): 27.457635586s
    mpt_circuit(synthesize_sub): 7.616027465s
    tx_circuit(synthesize_sub): 4.116668735s
    copy_circuit(synthesize_sub): 3.657522537s
    state_circuit(synthesize_sub): 2.739736719s
    keccak_circuit(synthesize_sub): 2.381289647s
    bytecode_circuit(synthesize_sub): 2.024037224s
    poseidon_circuit(synthesize_sub): 1.002937647s
    rlp_circuit(synthesize_sub): 349.832865ms
    exp_circuit(synthesize_sub): 199.796709ms
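To make the breakdown easier to reason about, the small sketch below sums the timings listed above and picks out the slowest sub-circuit. The numbers are copied from this comment; the constant and function names are made up for illustration. The sum comes to about 51.5s, in line with the ~50s quoted for one phase, and `evm_circuit` alone accounts for a little over half of it.

```rust
/// Per-sub-circuit timings in seconds, copied from the list above.
const SUB_CIRCUIT_SECS: [(&str, f64); 10] = [
    ("evm_circuit", 27.457635586),
    ("mpt_circuit", 7.616027465),
    ("tx_circuit", 4.116668735),
    ("copy_circuit", 3.657522537),
    ("state_circuit", 2.739736719),
    ("keccak_circuit", 2.381289647),
    ("bytecode_circuit", 2.024037224),
    ("poseidon_circuit", 1.002937647),
    ("rlp_circuit", 0.349832865),
    ("exp_circuit", 0.199796709),
];

/// Sum of all per-sub-circuit timings, in seconds.
fn total_secs() -> f64 {
    SUB_CIRCUIT_SECS.iter().map(|(_, t)| t).sum()
}

/// The sub-circuit with the largest timing, as (name, seconds).
fn slowest() -> (&'static str, f64) {
    SUB_CIRCUIT_SECS
        .iter()
        .copied()
        .fold(("", 0.0), |best, cur| if cur.1 > best.1 { cur } else { best })
}
```

If these assignments ran fully concurrently, the slowest entry would set the floor for wall-clock time, which suggests `evm_circuit` is the natural next target. The comment does not say whether the timings were taken sequentially or concurrently, so treat that as an inference rather than a measurement.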