Conversion of EVM stack to big integers

varasev commented 5 years ago

I guess it's early to test this from my side, am I right? So, I think, it would be worth to launch the benchmarks after the optimization is complete. Or is there something else I can help with this PR?

vkomenda commented 5 years ago

Maybe you could double-check that cargo bench in ethcore/evm returns better overall timings in commit 967b29e72f6a88ab309d913b8d73b8730ee78562 compared to the latest commit 0d3693aea3d0621722d01900a6894032167457f1.

What remains to be done in this PR are at least two things:

refactor gasometer.rs to have gas measured in Integer instead of U256 so that we save time on type conversion;
refactor the external interface of the EVM: ethcore/vm/src/ext.rs.

I think these two are related. It could be that doing these separately is not possible. Also the external interface is tied to other components which may need to be updated as well.

The effect of the above should be that U256 is completely gone from ethcore/evm/src/interpreter/mod.rs.

phahulin commented 5 years ago

I'll try to fully resync POA Sokol with new binary and see if any errors come up

phahulin commented 5 years ago

POA Core and xDai resynced without errors. In Sokol there was an error on block https://blockscout.com/poa/sokol/blocks/5021292/transactions (there's a single tx in that block).

2019-05-28 13:15:06 UTC Stage 5 block verification failed for #5021292 (0xff1a…258d)
Error: Error(Block(InvalidStateRoot(Mismatch { expected: 0x100be2e8b4d067add33a208b619801c7ba9a6427927612324214a15489b0cc3d, found: 0xf29cd92f062278f2149e8bba4e683b271f4bc73bfba3280ef9a84288a5b60613 })), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })
2019-05-28 13:15:06 UTC 
Bad block detected: Error(Block(InvalidStateRoot(Mismatch { expected: 0x100be2e8b4d067add33a208b619801c7ba9a6427927612324214a15489b0cc3d, found: 0xf29cd92f062278f2149e8bba4e683b271f4bc73bfba3280ef9a84288a5b60613 })), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })
RLP: f902eef9023da014542f792942c71956a9ff20500eeefc8b5dcacc541fa111882a09e22bf9e65ca01dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d4934794d764da536506bd1af6e494a8cdd37959a7387acba0100be2e8b4d067add33a208b619801c7ba9a6427927612324214a15489b0cc3da029ee61e69790b8e34a4b40c00033d6fc7130b6349f2da66f1f79c091885e147fa027ad54967532e5e4c70407115a045b79491e1a1ddfe6c796182324cbc271a90fb901000000000000000000000000000000000000000000010000000000000000000000000000000000000000000400000020000000000000000000000000000000000000000000000000000000000800000000000000000000000000000000000000000000000000000000000000001000000000000000200000000000001000000000000000000000000000000000000000000000000000000008000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000000000000000000000080080000000100000000000000000000000000000090fffffffffffffffffffffffffffffffe834c9e6c837a120082d4d7845bc0afac96d583010b088650617269747986312e32372e32826c69841259bcbcb841480895df40eaba0592bed5005f1645310036bde02aa07125c69ca83f6da68ff6189a2e04cdbe1fe089b390a99c0ccf35ae42fb027f4ef29749c26c0824d20bb501f8abf8a944843b9aca0082d620944f9895cfe49b8e648f8bae68519f059dd997a28280b844497d7551000000000000000000000000201406e518abe5e351a8717dd0f419616ca02fbc000000000000000000000000000000000000000000000000000000000000006481bda0151207ea78ce245710f085fa42ecfdc8000ae023f2e05c0598e23cc6bebdd20fa02dd54b69140cac8edd589764a1701410a014c30aa589a5cbae046e0eac7c111cc0
Header: Header { parent_hash: 0x14542f792942c71956a9ff20500eeefc8b5dcacc541fa111882a09e22bf9e65c, timestamp: 1539354540, number: 5021292, author: 0xd764da536506bd1af6e494a8cdd37959a7387acb, transactions_root: 0x29ee61e69790b8e34a4b40c00033d6fc7130b6349f2da66f1f79c091885e147f, uncles_hash: 0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347, extra_data: [213, 131, 1, 11, 8, 134, 80, 97, 114, 105, 116, 121, 134, 49, 46, 50, 55, 46, 50, 130, 108, 105], state_root: 0x100be2e8b4d067add33a208b619801c7ba9a6427927612324214a15489b0cc3d, receipts_root: 0x27ad54967532e5e4c70407115a045b79491e1a1ddfe6c796182324cbc271a90f, log_bloom: 0x00000000000000000000000000000000000000000100000000000000000000000000000000000000000004000000200000000000000000000000000000000000000000000000000000000008000000000000000000000000000000000000000000000000000000000000000010000000000000002000000000000010000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000000000000000000800800000001000000000000000000000000000000, gas_used: 54487, gas_limit: 8000000, difficulty: 340282366920938463463374607431768211454, seal: [[132, 18, 89, 188, 188], [184, 65, 72, 8, 149, 223, 64, 234, 186, 5, 146, 190, 213, 0, 95, 22, 69, 49, 0, 54, 189, 224, 42, 160, 113, 37, 198, 156, 168, 63, 109, 166, 143, 246, 24, 154, 46, 4, 205, 190, 31, 224, 137, 179, 144, 169, 156, 12, 207, 53, 174, 66, 251, 2, 127, 78, 242, 151, 73, 194, 108, 8, 36, 210, 11, 181, 1]], hash: Some(0xff1acc51dab93d33969dc3160c777f6645f997a649b265046830282b4bb0258d) }
Uncles: 
Transactions:[Tx 0] UnverifiedTransaction { unsigned: Transaction { nonce: 68, gas_price: 1000000000, gas: 54816, action: Call(0x4f9895cfe49b8e648f8bae68519f059dd997a282), value: 0, data: [73, 125, 117, 81, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 32, 20, 6, 229, 24, 171, 229, 227, 81, 168, 113, 125, 208, 244, 25, 97, 108, 160, 47, 188, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100] }, v: 189, r: 9530427700988829602190235950132050505565850277403879439041744793130193768975, s: 20730937074926652319264621547119453737374562879435505647005372538798273728796, hash: 0x5121958fca2cc47adf8ce92b7d3646a9062c6c4b7903c1a38c77a7d4ed5a6dac }

Not sure if it's a real issue or just a bad block, I'll resync one more time.

vkomenda commented 5 years ago

So is it this block on which resync failed? https://blockscout.com/poa/sokol/blocks/5021292/transactions

phahulin commented 5 years ago

Yes, it output the error message and stopped syncing.

varasev commented 5 years ago

@phahulin This version is only for POSDAO (not for POA Networks) because it is based on aura-pos branch, so it is not compatible with POA Core, POA Sokol, Kovan, and xDai. Am I right, @vkomenda ?

phahulin commented 5 years ago

@varasev ah yes, you're right! @vkomenda could you please create a branch with changes of this PR applied on top of upstream stable? Otherwise we won't have any running network to test them.

varasev commented 5 years ago

could you please create a branch with changes of this PR applied on top of upstream stable?

I think, testing these changes makes no sense because they slow down the node's work: https://github.com/poanetwork/parity-ethereum/pull/137#pullrequestreview-239926576

We should test them when we see speed improvements in comparison with the stable branch of upstream.

phahulin commented 5 years ago

Ok, please ping me when it's ready

vkomenda commented 5 years ago

I refactored the gasometer. Benchmarks report performance drop of 2-2.5 times with respect to the current aura-pos. This is in part because I had to move Integer to U256 conversion to the step function of the EVM interpreter when it reports remaining gas. To remove that and similar conversions it is required to refactor the VM submodule of ethcore. Whether that would increase performance back by more than 2.5 times, I'm not sure.

varasev commented 5 years ago

As far as I understand, the main reason for slowness is U256 to Integer and Integer to U256 conversion operations, right?

varasev commented 5 years ago

What if we just do the same for MUL opcode as for the MULMOD in the vk-mulmod branch? I mean instead

instructions::MUL => {
    let a = self.stack.pop_back();
    let b = self.stack.pop_back();
    self.stack.push(a.overflowing_mul(b).0);
},

try

instructions::MUL => {
    let a = self.stack.pop_back();
    let b = self.stack.pop_back();
    let a0 = Integer::from_digits(&a, Order::LsfLe);
    let b0 = Integer::from_digits(&b, Order::LsfLe);
    let r = a0 * b0;
    self.stack.push(
        U256::from_little_endian(r.to_digits::<u8>(Order::LsfLe).as_slice())
    );
},

Did you try to do this for the vk-mulmod branch and launch benchmarks for the MUL operation for different size of the operands a and b? What if it would make sense for the big operands?

varasev commented 5 years ago

Another thought: what if we leave the rug for MULMOD and ADDMOD (because the rug works faster than U256 for these opcodes), but try to use the num_bigint crate for other math opcodes as Parity team did for the MULMOD here: https://github.com/paritytech/parity-ethereum/pull/10642/files#diff-f24406c9371a7a621c14519c77ce4bdd

For all these tries we need to launch benchmarks to see whether it makes sense and for which size of operands.

vkomenda commented 5 years ago

As far as I understand, the main reason for slowness is U256 to Integer and Integer to U256 conversion operations, right?

Yes. The places where the conversions happen matter. If any appear in the interpreter, it will run slower than if there weren't any conversions there. The place where conversions happen make difference to performance. Having conversions inside simple arithmetic opcode handlers (not complex like MULMOD) is enough to make these opcodes slower than the old U256 version. Currently though there are no conversions inside opcode handlers but there are some when the interpreter interacts with the VM. This interaction, as it happens, when the interpreter steps through an opcode. So, essentially we still do have conversions at every opcode. When the VM is refactored, we should see less conversions and performance should then increase compared to aura-pos and the upstream version.

vkomenda commented 5 years ago

I converted the VM in 967b29e72f6a88ab309d913b8d73b8730ee78562 and performance regressed even more. Now it's 2.1-3.1 times slower than in aura-pos. Using Integer everywhere turns out to be a naive solution. The upstream version has a better integrated solution with no dependency on rug and only MULMOD and ADDMOD being up to 2 times slower in uncommon cases. If we don't need further optimisation, I vote to adopt the upstream solution. In the other case, we need u64-valued counterparts of some methods that are now Integer-valued. Also we should have an EVM interpreter instance that performs internal gas computations using u64 instead of big integers. Instead of more instances we could try defining and using a type that wraps either an Integer or u64 and copy if the wrapped value is a u64.

varasev commented 5 years ago

The upstream version has a better integrated solution with no dependency on rug and only MULMOD and ADDMOD being up to 2 times slower in uncommon cases.

I'd propose to leave our approach for MULMOD/ADDMOD with the rug, but try to use the upstream's num_bigint solution with other math opcodes as I suggested in the https://github.com/poanetwork/parity-ethereum/pull/137#issuecomment-497756931. Let's first do it for one opcode (e.g., MUL) and compare its performance with the U256. And then try to do that for the rest opcodes.

Also, I'd propose to benchmark this one https://github.com/poanetwork/parity-ethereum/pull/137#issuecomment-497754187 and see if it is really slower than the standard U256.

vkomenda commented 5 years ago

I'd propose to leave our approach for MULMOD/ADDMOD with the rug, but try to use the upstream's num_bigint solution with other math opcodes as I suggested in the #137 (comment)

I suggested not using rug because the dependency on it introduces problems with the Windows build. MULMOD and ADDMOD are only 1.2 times slower with num_bigint compared to rug in the case of small moduli. With large moduli that ratio increases to 2 but is still better than 10 compared to the old version.

Also, I'd propose to benchmark this one #137 (comment) and see if it is really slower than the standard U256.

Isn't this what I did in earlier commits? I used to convert from U256 inside opcode handlers. That was slower than the upstream version.

varasev commented 5 years ago

Ok, that would be interesting to compare num_bigint approach for all the math opcodes with an old upstream version.

Also, the solution with improving the ethereum-types as @afck suggested could be implemented. I mean, we should try and benchmark different scenarios and choose the fastest one.

Isn't this what I did in earlier commits? I used to convert from U256 inside opcode handlers. That was slower than the upstream version.

Yeah, I just didn't see the benchmark results for that comparison (or don't remember where they are in this repo). Maybe I'll try to compare that myself when I have time.

vkomenda commented 5 years ago

I just didn't see the benchmark results for that comparison (or don't remember where they are in this repo)

I think you did benchmark that commit, 967b29e72f6a88ab309d913b8d73b8730ee78562. I used a function apply_rug_truncated_2 which did the conversion and computation.

ordian commented 5 years ago

FYI, the division algorithm in uint was improved (https://github.com/paritytech/parity-common/pull/126) and integrated into parity-ethereum 2.6 (now in beta).

varasev commented 5 years ago

@ordian Thank you for the info!

vkomenda commented 5 years ago

Closing in favour of #179.

poanetwork / parity-ethereum

Conversion of EVM stack to big integers #137