Closed vladimirradosavljevic closed 1 month ago
╔═╡ Size (-%) ╞════════════════╡ All M3B3 ╞═╗
║ Best 0.000 ║
║ Worst -1.061 ║
║ Total -0.003 ║
╠═╡ Cycles (-%) ╞══════════════╡ All M3B3 ╞═╣
║ Best 0.000 ║
║ Worst -7.235 ║
║ Total -0.031 ║
╠═╡ Ergs (-%) ╞════════════════╡ All M3B3 ╞═╣
║ Best 0.000 ║
║ Worst -2.688 ║
║ Total -0.000 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞════════════════╡ All MzB3 ╞═╗
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Cycles (-%) ╞══════════════╡ All MzB3 ╞═╣
║ Best 0.000 ║
║ Worst -7.235 ║
║ Total -0.028 ║
╠═╡ Ergs (-%) ╞════════════════╡ All MzB3 ╞═╣
║ Best 0.000 ║
║ Worst -2.688 ║
║ Total -0.000 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞═════╡ EVMInterpreter M3B3 ╞═╗
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total NaN ║
╠═╡ Cycles (-%) ╞═══╡ EVMInterpreter M3B3 ╞═╣
║ Best 0.000 ║
║ Worst -7.235 ║
║ Total -4.525 ║
╠═╡ Ergs (-%) ╞═════╡ EVMInterpreter M3B3 ╞═╣
║ Best 0.000 ║
║ Worst -2.688 ║
║ Total -0.642 ║
╠═╡ Ergs/gas ╞══════╡ EVMInterpreter M3B3 ╞═╣
║ ADD 40.750 ║
║ MUL 24.450 ║
║ SUB 40.750 ║
║ DIV 26.850 ║
║ SDIV 42.450 ║
║ MOD 26.850 ║
║ SMOD 40.050 ║
║ ADDMOD 22.156 ║
║ MULMOD 25.156 ║
║ EXP 7.237 ║
║ SIGNEXTEND 25.650 ║
║ LT 44.750 ║
║ GT 44.750 ║
║ SLT 66.750 ║
║ SGT 66.750 ║
║ EQ 44.750 ║
║ ISZERO 38.417 ║
║ AND 40.750 ║
║ OR 40.750 ║
║ XOR 40.750 ║
║ NOT 34.417 ║
║ BYTE 48.750 ║
║ SHL 44.750 ║
║ SHR 44.750 ║
║ SAR 62.750 ║
║ SGT 66.750 ║
║ SHA3 26.922 ║
║ ADDRESS 50.812 ║
║ BALANCE 39.481 ║
║ ORIGIN 1357.750 ║
║ CALLER 50.812 ║
║ CALLVALUE 50.812 ║
║ CALLDATALOAD 36.750 ║
║ CALLDATASIZE 51.125 ║
║ CALLDATACOPY 51.462 ║
║ CODESIZE 51.625 ║
║ CODECOPY 62.456 ║
║ GASPRICE 1354.562 ║
║ EXTCODESIZE 3.732 ║
║ EXTCODECOPY 3.797 ║
║ RETURNDATASIZE 49.500 ║
║ RETURNDATACOPY 44.556 ║
║ EXTCODEHASH 4.926 ║
║ BLOCKHASH 240.719 ║
║ COINBASE 1354.750 ║
║ TIMESTAMP 1348.750 ║
║ NUMBER 1348.750 ║
║ PREVRANDAO 1348.750 ║
║ GASLIMIT 1354.750 ║
║ CHAINID 1342.750 ║
║ SELFBALANCE 641.700 ║
║ BASEFEE 1348.750 ║
║ POP 41.625 ║
║ MLOAD 53.419 ║
║ MSTORE 57.076 ║
║ MSTORE8 66.598 ║
║ SLOAD 20.782 ║
║ SSTORE 4.667 ║
║ JUMP 17.000 ║
║ JUMPI 16.727 ║
║ PC 51.312 ║
║ MSIZE 57.812 ║
║ GAS 48.312 ║
║ JUMPDEST 71.625 ║
║ PUSH0 48.312 ║
║ PUSH1 43.958 ║
║ PUSH2 47.375 ║
║ PUSH4 50.208 ║
║ PUSH5 51.625 ║
║ PUSH6 53.042 ║
║ PUSH7 54.458 ║
║ PUSH8 55.875 ║
║ PUSH9 57.292 ║
║ PUSH10 58.708 ║
║ PUSH11 60.125 ║
║ PUSH12 61.542 ║
║ PUSH13 62.958 ║
║ PUSH14 64.375 ║
║ PUSH15 65.792 ║
║ PUSH16 67.208 ║
║ PUSH17 68.625 ║
║ PUSH18 70.042 ║
║ PUSH19 71.458 ║
║ PUSH20 72.875 ║
║ PUSH21 74.292 ║
║ PUSH22 75.708 ║
║ PUSH23 77.125 ║
║ PUSH24 78.542 ║
║ PUSH25 79.958 ║
║ PUSH26 81.375 ║
║ PUSH27 82.792 ║
║ PUSH28 84.208 ║
║ PUSH29 85.625 ║
║ PUSH30 87.042 ║
║ PUSH31 88.458 ║
║ PUSH32 87.875 ║
║ DUP1 34.417 ║
║ DUP2 40.417 ║
║ DUP3 40.417 ║
║ DUP4 40.417 ║
║ DUP5 40.417 ║
║ DUP6 40.417 ║
║ DUP7 40.417 ║
║ DUP8 40.417 ║
║ DUP9 40.417 ║
║ DUP10 40.417 ║
║ DUP11 40.417 ║
║ DUP12 40.417 ║
║ DUP13 40.417 ║
║ DUP14 40.417 ║
║ DUP15 40.417 ║
║ DUP16 40.417 ║
║ SWAP1 43.083 ║
║ SWAP2 43.083 ║
║ SWAP3 43.083 ║
║ SWAP4 43.083 ║
║ SWAP5 43.083 ║
║ SWAP6 43.083 ║
║ SWAP7 43.083 ║
║ SWAP8 43.083 ║
║ SWAP9 43.083 ║
║ SWAP10 43.083 ║
║ SWAP11 43.083 ║
║ SWAP12 43.083 ║
║ SWAP13 43.083 ║
║ SWAP14 43.083 ║
║ SWAP15 43.083 ║
║ SWAP16 43.083 ║
║ CALL 36.470 ║
║ STATICCALL 35.506 ║
║ DELEGATECALL 34.552 ║
║ CREATE 4.089 ║
║ CREATE2 5.568 ║
║ RETURN 1.000 ║
║ REVERT 1.000 ║
╠═╡ Ergs/gas (-%) ╞═╡ EVMInterpreter M3B3 ╞═╣
║ ADD -5.161 ║
║ MUL -5.161 ║
║ SUB -5.161 ║
║ DIV -4.678 ║
║ SDIV -2.909 ║
║ MOD -4.678 ║
║ SMOD -3.089 ║
║ ADDMOD -7.262 ║
║ MULMOD -6.341 ║
║ EXP -1.401 ║
║ SIGNEXTEND -4.908 ║
║ LT -4.678 ║
║ GT -4.678 ║
║ SLT -3.089 ║
║ SGT -3.089 ║
║ EQ -4.678 ║
║ ISZERO -5.492 ║
║ AND -5.161 ║
║ OR -5.161 ║
║ XOR -5.161 ║
║ NOT -6.170 ║
║ BYTE -4.278 ║
║ SHL -4.678 ║
║ SHR -4.678 ║
║ SAR -3.292 ║
║ SGT -3.089 ║
║ SHA3 -2.265 ║
║ ADDRESS -6.275 ║
║ BALANCE -0.612 ║
║ ORIGIN -0.221 ║
║ CALLER -6.275 ║
║ CALLVALUE -6.275 ║
║ CALLDATALOAD -5.755 ║
║ CALLDATASIZE -6.234 ║
║ CALLDATACOPY -3.979 ║
║ CODESIZE -6.170 ║
║ CODECOPY -3.256 ║
║ GASPRICE -0.222 ║
║ EXTCODESIZE -0.186 ║
║ EXTCODECOPY -0.122 ║
║ RETURNDATASIZE -6.452 ║
║ RETURNDATACOPY -3.085 ║
║ EXTCODEHASH -0.235 ║
║ BLOCKHASH -0.125 ║
║ COINBASE -0.222 ║
║ TIMESTAMP -0.223 ║
║ NUMBER -0.223 ║
║ PREVRANDAO -0.223 ║
║ GASLIMIT -0.222 ║
║ CHAINID -0.224 ║
║ SELFBALANCE -0.187 ║
║ BASEFEE -0.223 ║
║ POP -7.767 ║
║ MLOAD -3.544 ║
║ MSTORE -3.310 ║
║ MSTORE8 -2.909 ║
║ SLOAD -0.357 ║
║ SSTORE -0.164 ║
║ PC -6.210 ║
║ MSIZE -5.473 ║
║ GAS -6.621 ║
║ JUMPDEST -9.143 ║
║ PUSH0 -6.621 ║
║ PUSH1 -4.767 ║
║ DUP2 -10.984 ║
║ DUP3 -10.984 ║
║ DUP4 -10.984 ║
║ DUP5 -10.984 ║
║ DUP6 -10.984 ║
║ DUP7 -10.984 ║
║ DUP8 -10.984 ║
║ DUP9 -10.984 ║
║ DUP10 -10.984 ║
║ DUP11 -10.984 ║
║ DUP12 -10.984 ║
║ DUP13 -10.984 ║
║ DUP14 -10.984 ║
║ DUP15 -10.984 ║
║ DUP16 -10.984 ║
║ SWAP1 -4.868 ║
║ SWAP2 -4.868 ║
║ SWAP3 -4.868 ║
║ SWAP4 -4.868 ║
║ SWAP5 -4.868 ║
║ SWAP6 -4.868 ║
║ SWAP7 -4.868 ║
║ SWAP8 -4.868 ║
║ SWAP9 -4.868 ║
║ SWAP10 -4.868 ║
║ SWAP11 -4.868 ║
║ SWAP12 -4.868 ║
║ SWAP13 -4.868 ║
║ SWAP14 -4.868 ║
║ SWAP15 -4.868 ║
║ SWAP16 -4.868 ║
║ CALL -0.770 ║
║ STATICCALL -0.768 ║
║ DELEGATECALL -0.807 ║
║ CREATE -0.163 ║
║ CREATE2 -0.173 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞═════╡ EVMInterpreter MzB3 ╞═╗
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total NaN ║
╠═╡ Cycles (-%) ╞═══╡ EVMInterpreter MzB3 ╞═╣
║ Best 0.000 ║
║ Worst -7.235 ║
║ Total -4.525 ║
╠═╡ Ergs (-%) ╞═════╡ EVMInterpreter MzB3 ╞═╣
║ Best 0.000 ║
║ Worst -2.688 ║
║ Total -0.642 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞════════╡ Precompiles M3B3 ╞═╗
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles M3B3 ╞═╣
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles M3B3 ╞═╣
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞════════╡ Precompiles MzB3 ╞═╗
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles MzB3 ╞═╣
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles MzB3 ╞═╣
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞══════════╡ Real life M3B3 ╞═╗
║ Best 0.000 ║
║ Worst -1.061 ║
║ Total -0.069 ║
╠═╡ Cycles (-%) ╞════════╡ Real life M3B3 ╞═╣
║ Best 0.000 ║
║ Worst -0.419 ║
║ Total -0.201 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life M3B3 ╞═╣
║ Best 0.000 ║
║ Worst -0.348 ║
║ Total -0.050 ║
╚═══════════════════════════════════════════╝
╔═╡ Size (-%) ╞══════════╡ Real life MzB3 ╞═╗
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Cycles (-%) ╞════════╡ Real life MzB3 ╞═╣
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life MzB3 ╞═╣
║ Best 0.000 ║
║ Worst 0.000 ║
║ Total 0.000 ║
╚═══════════════════════════════════════════╝
Based on the benchmarks, shouldn't we keep it enabled by default?
Currently, the improvements come from the https://github.com/matter-labs/era-compiler-llvm/issues/710 optimization: it just so happens that by splitting loop phi live ranges we generate add instructions, from which indexed load and store instructions are then generated. Since this is only intended for EVMInterpreter, I think it is safer to disable it by default and enable it only for EVMInterpreter. I expect #710 to bring more improvements for other contracts than we are seeing here.
It's still -0.2% cycles in real-life m3b3, in an unintended but understandable way. Indeed, the change we made for EVMInterpreter could result in better use of incremented loads and stores, though I agree that once #410 is implemented the expected effect should be greater. So I'd suggest attaching this patch to #410 and closing it for now. When #410 is done, it's worth rechecking the effect of the default. wdyt?
Even though it looks like a nice improvement for real-life m3b3, it is just for 1 test:
Group 'Real life M3B3' cycles (-%) worst 100 out of 1:
-0.419 : E+M3B3 0.6.12 tests/solidity/complex/defi/starkex-verifier/test.json::dydx_proof_verification[#fallback:47]
I don't think it is worth enabling it by default, given that the improvement shows up in only one test (for 0.6.12) and that it could break indexed loads and stores in other real-world contracts. Does this make sense?
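To illustrate the point about a group total being driven by a single test, here is a minimal Python sketch. It assumes the group total is the relative change of the summed metric and that per-test figures are relative changes of each test's own value; that is an assumption about how such a summary could be computed, not the benchmark tool's actual code. The test names and cycle counts are hypothetical.

```python
def percent_change(before, after):
    """Relative change in percent; negative means the metric went down."""
    return (after - before) / before * 100.0


def summarize(before, after):
    """Summarize a benchmark group given per-test metric values.

    `before` and `after` map test name -> metric (e.g. cycles).
    Assumption: the group total is the change of the summed metric.
    """
    per_test = {name: percent_change(before[name], after[name]) for name in before}
    changed = sorted(name for name, delta in per_test.items() if delta != 0.0)
    total = percent_change(sum(before.values()), sum(after.values()))
    return {"per_test": per_test, "changed": changed, "total": total}


# Hypothetical data: only one of the three tests is affected by the change.
before = {"test_a": 1000, "test_b": 2000, "test_c": 5000}
after = {"test_a": 1000, "test_b": 2000, "test_c": 4950}

result = summarize(before, after)
print(result["changed"], round(result["total"], 3))
```

With data shaped like this, the group total is non-zero even though a single test accounts for all of it, which is why the list of changed tests is worth checking alongside the total.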
Ok, merging.
This optimization should only be used for EVMInterpreter.