matter-labs / era-compiler-llvm

ZKsync fork of the LLVM framework.

[EraVM] Disable splitting loop phi live ranges by default #709

Closed vladimirradosavljevic closed 1 month ago

vladimirradosavljevic commented 1 month ago

This optimization should only be used for EVMInterpreter.

github-actions[bot] commented 1 month ago

╔═╡ Size (-%) ╞════════════════╡ All M3B3 ╞═╗
║ Best                                0.000 ║
║ Worst                              -1.061 ║
║ Total                              -0.003 ║
╠═╡ Cycles (-%) ╞══════════════╡ All M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -7.235 ║
║ Total                              -0.031 ║
╠═╡ Ergs (-%) ╞════════════════╡ All M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -2.688 ║
║ Total                              -0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════════════╡ All MzB3 ╞═╗
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════════════╡ All MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -7.235 ║
║ Total                              -0.028 ║
╠═╡ Ergs (-%) ╞════════════════╡ All MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -2.688 ║
║ Total                              -0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞═════╡ EVMInterpreter M3B3 ╞═╗
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                                 NaN ║
╠═╡ Cycles (-%) ╞═══╡ EVMInterpreter M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -7.235 ║
║ Total                              -4.525 ║
╠═╡ Ergs (-%) ╞═════╡ EVMInterpreter M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -2.688 ║
║ Total                              -0.642 ║
╠═╡ Ergs/gas ╞══════╡ EVMInterpreter M3B3 ╞═╣
║ ADD                                40.750 ║
║ MUL                                24.450 ║
║ SUB                                40.750 ║
║ DIV                                26.850 ║
║ SDIV                               42.450 ║
║ MOD                                26.850 ║
║ SMOD                               40.050 ║
║ ADDMOD                             22.156 ║
║ MULMOD                             25.156 ║
║ EXP                                 7.237 ║
║ SIGNEXTEND                         25.650 ║
║ LT                                 44.750 ║
║ GT                                 44.750 ║
║ SLT                                66.750 ║
║ SGT                                66.750 ║
║ EQ                                 44.750 ║
║ ISZERO                             38.417 ║
║ AND                                40.750 ║
║ OR                                 40.750 ║
║ XOR                                40.750 ║
║ NOT                                34.417 ║
║ BYTE                               48.750 ║
║ SHL                                44.750 ║
║ SHR                                44.750 ║
║ SAR                                62.750 ║
║ SGT                                66.750 ║
║ SHA3                               26.922 ║
║ ADDRESS                            50.812 ║
║ BALANCE                            39.481 ║
║ ORIGIN                           1357.750 ║
║ CALLER                             50.812 ║
║ CALLVALUE                          50.812 ║
║ CALLDATALOAD                       36.750 ║
║ CALLDATASIZE                       51.125 ║
║ CALLDATACOPY                       51.462 ║
║ CODESIZE                           51.625 ║
║ CODECOPY                           62.456 ║
║ GASPRICE                         1354.562 ║
║ EXTCODESIZE                         3.732 ║
║ EXTCODECOPY                         3.797 ║
║ RETURNDATASIZE                     49.500 ║
║ RETURNDATACOPY                     44.556 ║
║ EXTCODEHASH                         4.926 ║
║ BLOCKHASH                         240.719 ║
║ COINBASE                         1354.750 ║
║ TIMESTAMP                        1348.750 ║
║ NUMBER                           1348.750 ║
║ PREVRANDAO                       1348.750 ║
║ GASLIMIT                         1354.750 ║
║ CHAINID                          1342.750 ║
║ SELFBALANCE                       641.700 ║
║ BASEFEE                          1348.750 ║
║ POP                                41.625 ║
║ MLOAD                              53.419 ║
║ MSTORE                             57.076 ║
║ MSTORE8                            66.598 ║
║ SLOAD                              20.782 ║
║ SSTORE                              4.667 ║
║ JUMP                               17.000 ║
║ JUMPI                              16.727 ║
║ PC                                 51.312 ║
║ MSIZE                              57.812 ║
║ GAS                                48.312 ║
║ JUMPDEST                           71.625 ║
║ PUSH0                              48.312 ║
║ PUSH1                              43.958 ║
║ PUSH2                              47.375 ║
║ PUSH4                              50.208 ║
║ PUSH5                              51.625 ║
║ PUSH6                              53.042 ║
║ PUSH7                              54.458 ║
║ PUSH8                              55.875 ║
║ PUSH9                              57.292 ║
║ PUSH10                             58.708 ║
║ PUSH11                             60.125 ║
║ PUSH12                             61.542 ║
║ PUSH13                             62.958 ║
║ PUSH14                             64.375 ║
║ PUSH15                             65.792 ║
║ PUSH16                             67.208 ║
║ PUSH17                             68.625 ║
║ PUSH18                             70.042 ║
║ PUSH19                             71.458 ║
║ PUSH20                             72.875 ║
║ PUSH21                             74.292 ║
║ PUSH22                             75.708 ║
║ PUSH23                             77.125 ║
║ PUSH24                             78.542 ║
║ PUSH25                             79.958 ║
║ PUSH26                             81.375 ║
║ PUSH27                             82.792 ║
║ PUSH28                             84.208 ║
║ PUSH29                             85.625 ║
║ PUSH30                             87.042 ║
║ PUSH31                             88.458 ║
║ PUSH32                             87.875 ║
║ DUP1                               34.417 ║
║ DUP2                               40.417 ║
║ DUP3                               40.417 ║
║ DUP4                               40.417 ║
║ DUP5                               40.417 ║
║ DUP6                               40.417 ║
║ DUP7                               40.417 ║
║ DUP8                               40.417 ║
║ DUP9                               40.417 ║
║ DUP10                              40.417 ║
║ DUP11                              40.417 ║
║ DUP12                              40.417 ║
║ DUP13                              40.417 ║
║ DUP14                              40.417 ║
║ DUP15                              40.417 ║
║ DUP16                              40.417 ║
║ SWAP1                              43.083 ║
║ SWAP2                              43.083 ║
║ SWAP3                              43.083 ║
║ SWAP4                              43.083 ║
║ SWAP5                              43.083 ║
║ SWAP6                              43.083 ║
║ SWAP7                              43.083 ║
║ SWAP8                              43.083 ║
║ SWAP9                              43.083 ║
║ SWAP10                             43.083 ║
║ SWAP11                             43.083 ║
║ SWAP12                             43.083 ║
║ SWAP13                             43.083 ║
║ SWAP14                             43.083 ║
║ SWAP15                             43.083 ║
║ SWAP16                             43.083 ║
║ CALL                               36.470 ║
║ STATICCALL                         35.506 ║
║ DELEGATECALL                       34.552 ║
║ CREATE                              4.089 ║
║ CREATE2                             5.568 ║
║ RETURN                              1.000 ║
║ REVERT                              1.000 ║
╠═╡ Ergs/gas (-%) ╞═╡ EVMInterpreter M3B3 ╞═╣
║ ADD                                -5.161 ║
║ MUL                                -5.161 ║
║ SUB                                -5.161 ║
║ DIV                                -4.678 ║
║ SDIV                               -2.909 ║
║ MOD                                -4.678 ║
║ SMOD                               -3.089 ║
║ ADDMOD                             -7.262 ║
║ MULMOD                             -6.341 ║
║ EXP                                -1.401 ║
║ SIGNEXTEND                         -4.908 ║
║ LT                                 -4.678 ║
║ GT                                 -4.678 ║
║ SLT                                -3.089 ║
║ SGT                                -3.089 ║
║ EQ                                 -4.678 ║
║ ISZERO                             -5.492 ║
║ AND                                -5.161 ║
║ OR                                 -5.161 ║
║ XOR                                -5.161 ║
║ NOT                                -6.170 ║
║ BYTE                               -4.278 ║
║ SHL                                -4.678 ║
║ SHR                                -4.678 ║
║ SAR                                -3.292 ║
║ SGT                                -3.089 ║
║ SHA3                               -2.265 ║
║ ADDRESS                            -6.275 ║
║ BALANCE                            -0.612 ║
║ ORIGIN                             -0.221 ║
║ CALLER                             -6.275 ║
║ CALLVALUE                          -6.275 ║
║ CALLDATALOAD                       -5.755 ║
║ CALLDATASIZE                       -6.234 ║
║ CALLDATACOPY                       -3.979 ║
║ CODESIZE                           -6.170 ║
║ CODECOPY                           -3.256 ║
║ GASPRICE                           -0.222 ║
║ EXTCODESIZE                        -0.186 ║
║ EXTCODECOPY                        -0.122 ║
║ RETURNDATASIZE                     -6.452 ║
║ RETURNDATACOPY                     -3.085 ║
║ EXTCODEHASH                        -0.235 ║
║ BLOCKHASH                          -0.125 ║
║ COINBASE                           -0.222 ║
║ TIMESTAMP                          -0.223 ║
║ NUMBER                             -0.223 ║
║ PREVRANDAO                         -0.223 ║
║ GASLIMIT                           -0.222 ║
║ CHAINID                            -0.224 ║
║ SELFBALANCE                        -0.187 ║
║ BASEFEE                            -0.223 ║
║ POP                                -7.767 ║
║ MLOAD                              -3.544 ║
║ MSTORE                             -3.310 ║
║ MSTORE8                            -2.909 ║
║ SLOAD                              -0.357 ║
║ SSTORE                             -0.164 ║
║ PC                                 -6.210 ║
║ MSIZE                              -5.473 ║
║ GAS                                -6.621 ║
║ JUMPDEST                           -9.143 ║
║ PUSH0                              -6.621 ║
║ PUSH1                              -4.767 ║
║ DUP2                              -10.984 ║
║ DUP3                              -10.984 ║
║ DUP4                              -10.984 ║
║ DUP5                              -10.984 ║
║ DUP6                              -10.984 ║
║ DUP7                              -10.984 ║
║ DUP8                              -10.984 ║
║ DUP9                              -10.984 ║
║ DUP10                             -10.984 ║
║ DUP11                             -10.984 ║
║ DUP12                             -10.984 ║
║ DUP13                             -10.984 ║
║ DUP14                             -10.984 ║
║ DUP15                             -10.984 ║
║ DUP16                             -10.984 ║
║ SWAP1                              -4.868 ║
║ SWAP2                              -4.868 ║
║ SWAP3                              -4.868 ║
║ SWAP4                              -4.868 ║
║ SWAP5                              -4.868 ║
║ SWAP6                              -4.868 ║
║ SWAP7                              -4.868 ║
║ SWAP8                              -4.868 ║
║ SWAP9                              -4.868 ║
║ SWAP10                             -4.868 ║
║ SWAP11                             -4.868 ║
║ SWAP12                             -4.868 ║
║ SWAP13                             -4.868 ║
║ SWAP14                             -4.868 ║
║ SWAP15                             -4.868 ║
║ SWAP16                             -4.868 ║
║ CALL                               -0.770 ║
║ STATICCALL                         -0.768 ║
║ DELEGATECALL                       -0.807 ║
║ CREATE                             -0.163 ║
║ CREATE2                            -0.173 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞═════╡ EVMInterpreter MzB3 ╞═╗
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                                 NaN ║
╠═╡ Cycles (-%) ╞═══╡ EVMInterpreter MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -7.235 ║
║ Total                              -4.525 ║
╠═╡ Ergs (-%) ╞═════╡ EVMInterpreter MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -2.688 ║
║ Total                              -0.642 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════╡ Precompiles M3B3 ╞═╗
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════╡ Precompiles MzB3 ╞═╗
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞══════════╡ Real life M3B3 ╞═╗
║ Best                                0.000 ║
║ Worst                              -1.061 ║
║ Total                              -0.069 ║
╠═╡ Cycles (-%) ╞════════╡ Real life M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -0.419 ║
║ Total                              -0.201 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life M3B3 ╞═╣
║ Best                                0.000 ║
║ Worst                              -0.348 ║
║ Total                              -0.050 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞══════════╡ Real life MzB3 ╞═╗
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞════════╡ Real life MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life MzB3 ╞═╣
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝
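
A note on reading the tables above: the sketch below assumes that a `(-%)` column reports the percentage reduction of the candidate build relative to the reference build, so a negative value means the metric grew. That reading is consistent with the discussion in this thread, but it is an assumption. The function name and the sample numbers are hypothetical and are not taken from the benchmark tooling.

```python
def reduction_percent(reference: float, candidate: float) -> float:
    """Percentage by which `candidate` is smaller than `reference`.

    Assumed convention: positive means the metric shrank,
    negative means it grew.
    """
    if reference == 0:
        # A zero baseline has no meaningful ratio.
        return float("nan")
    return (1.0 - candidate / reference) * 100.0


# Hypothetical cycle counts, chosen only to illustrate the sign convention:
# the candidate uses more cycles, so the result is negative.
print(round(reduction_percent(100_000, 107_235), 3))
```

Under this reading, a "Worst" entry is the most negative per-test value in a group, and a value of 0.000 means the metric did not change.
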

akiramenai commented 1 month ago

Based on the benchmarks, shouldn't we keep it enabled by default?

vladimirradosavljevic commented 1 month ago

> Based on the benchmarks, shouldn't we keep it enabled by default?

Currently, the improvements come from the https://github.com/matter-labs/era-compiler-llvm/issues/710 optimization: by splitting loop phi live ranges we happen to generate add instructions, from which we then generate indexed load and store instructions. Since this is only intended for EVMInterpreter, I think it is safer to disable it by default and enable it only for EVMInterpreter. I expect #710 to bring more improvements for other contracts than we are seeing here.

akiramenai commented 1 month ago

It's still -0.2% cycles in real-life m3b3, in an unintended but understandable manner. Indeed, the change we made for EVMInterpreter could result in better use of incremented loads and stores, though I agree that implementing #410 should have a greater effect. So I'd suggest attaching this patch to #410 and closing it for now. When #410 is done, it's worth rechecking the effect of the default. wdyt?

vladimirradosavljevic commented 1 month ago

> It's still -0.2% cycles in real-life m3b3, in an unintended but understandable manner. Indeed, the change we made for EVMInterpreter could result in better use of incremented loads and stores, though I agree that implementing #410 should have a greater effect. So I'd suggest attaching this patch to #410 and closing it for now. When #410 is done, it's worth rechecking the effect of the default. wdyt?

Even though it looks like a nice improvement for real-life m3b3, it comes from just one test:

Group 'Real life M3B3' cycles (-%) worst 100 out of 1:
 -0.419   : E+M3B3 0.6.12            tests/solidity/complex/defi/starkex-verifier/test.json::dydx_proof_verification[#fallback:47]

I don't think it is worth enabling by default, given that the improvement shows up in only one test, on 0.6.12, and there is a possibility of breaking indexed loads and stores in other real-world contracts. Does this make sense?
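
The point that a single test can account for a whole group's total can be illustrated with a small sketch. It assumes the group total is computed from summed cycle counts, which may not match the benchmark tool's actual aggregation. The function name, test names, and cycle counts are all hypothetical.

```python
def group_total_reduction(reference: dict[str, int],
                          candidate: dict[str, int]) -> float:
    """Percentage reduction of the summed metric over a group of tests.

    Assumed aggregation: compare the sums, not the per-test averages.
    """
    ref_total = sum(reference.values())
    cand_total = sum(candidate[name] for name in reference)
    return (1.0 - cand_total / ref_total) * 100.0


# Hypothetical group: only the heaviest test changes (by about 0.419%).
reference = {"test_a": 50_000, "test_b": 50_000, "heavy_test": 100_000}
candidate = dict(reference, heavy_test=100_419)

# List the tests whose cycle count differs, then the group-level figure.
changed = [name for name in reference if candidate[name] != reference[name]]
print(changed, round(group_total_reduction(reference, candidate), 4))
```

With these made-up numbers the group total is nonzero even though two of the three tests are unchanged, which is why a per-test breakdown is more informative than the total alone.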

akiramenai commented 1 month ago

Ok, merging.