Create a benchmark suite

ethereum / fe

Emerging smart contract language for the Ethereum blockchain.

https://fe-lang.org

Create a benchmark suite #158

Open cburgdorf opened 3 years ago

cburgdorf commented 3 years ago

What is wrong?

We currently don't have a way to measure the efficiency of Fe code. It would be nice to be able to create contracts with certain use cases and measure:

Deployment gas cost
Runtime gas cost of various actions
Code size
?

The numbers above could be measured against raw YUL and optimized YUL code.

We can then also compare Fe against other established languages to see where we stand in terms of efficiency.

How can it be fixed

This can start dead simple: create the tests, figure out how to measure the gas costs, and print them out. We can take the Py-EVM benchmark suite as a source of inspiration, even though it measures different things (namely the speed of Py-EVM against a fixed set of gas costs, whereas we want to measure and improve the gas costs themselves).

satyamakgec commented 3 years ago

It will be a nice start. I have some small questions about the outcomes.

When we say deployment gas cost, isn't it based on the no. of opcodes we call during the deployment of the contract and the contract size itself? Then maybe the real difference is how much raw YUL we generated for it in comparison to the solc Yul generation for the same? Or am I totally wrong here?

For the gas cost part, couldn't we calculate it using an existing Solidity-based gas estimation library that takes bytecode or YUL as input?

cburgdorf commented 3 years ago

> isn't it based on the no. of opcodes we call during the deployment of the contract and the contract size itself

That's right.
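To make that breakdown concrete, here is a rough sketch of how the deployment cost adds up. The constants are the standard gas schedule values (21,000 base, 32,000 for contract creation, 4/16 gas per zero/non-zero calldata byte, 200 gas per stored code byte). The function name and its parameters are made up for illustration, and `init_execution_gas` has to come from actually running the init code in an EVM. Later additions such as the EIP-3860 initcode charge are deliberately left out.

```python
# Standard gas schedule values (post-Istanbul; later additions such as the
# EIP-3860 initcode word charge are intentionally ignored in this sketch).
TX_BASE_GAS = 21_000               # base cost of any transaction
TX_CREATE_GAS = 32_000             # extra cost of a contract-creation transaction
CALLDATA_ZERO_BYTE_GAS = 4         # per zero byte of transaction data
CALLDATA_NONZERO_BYTE_GAS = 16     # per non-zero byte of transaction data
CODE_DEPOSIT_GAS_PER_BYTE = 200    # per byte of runtime code that gets stored


def deployment_gas(init_code: bytes, runtime_code_size: int, init_execution_gas: int) -> int:
    """Estimate the gas for a deployment (hypothetical helper, not part of Fe).

    init_code           -- the bytecode sent in the creation transaction
    runtime_code_size   -- size in bytes of the code the init code returns
    init_execution_gas  -- gas burned by the opcodes of the init code,
                           which must be measured by running an EVM
    """
    zero_bytes = init_code.count(0)
    nonzero_bytes = len(init_code) - zero_bytes
    calldata_gas = (
        zero_bytes * CALLDATA_ZERO_BYTE_GAS
        + nonzero_bytes * CALLDATA_NONZERO_BYTE_GAS
    )
    deposit_gas = runtime_code_size * CODE_DEPOSIT_GAS_PER_BYTE
    return TX_BASE_GAS + TX_CREATE_GAS + calldata_gas + init_execution_gas + deposit_gas
```

Only the calldata, execution, and deposit terms depend on what the compiler emits, so those are the parts a benchmark would actually compare between languages.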

> Then maybe the real difference is how much raw YUL we generated for it in comparison to the solc Yul generation for the same?

I think that, because it all boils down to the size of the bytecode and the called opcodes, we should probably not worry too much about the generated YUL. I think the fairest comparison would be to have some code in Solidity (and Vyper) that serves as a reference, with the implementations as close to each other as possible, and then just measure and compare the gas cost for deployment and operations.

We currently use rust-evm to run the tests and I think we can continue to use that for the benchmarks, too. As long as we get the receipt for the transaction we should be able to read gasUsed from the receipt. This should work for contract deployment as well as any other operations that we perform on the contract.
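As a minimal sketch of the "read gasUsed and print it" idea: the snippet below assumes each receipt is available as a dictionary with a `gasUsed` entry, either as an integer or as a hex string the way JSON-RPC returns it. How rust-evm actually exposes this is not shown here; the helper names and the receipt shape are assumptions for illustration.

```python
def gas_used(receipt: dict) -> int:
    """Return gasUsed from a receipt-like dict (assumed shape).

    Accepts an int or a hex string such as "0x5208".
    """
    value = receipt["gasUsed"]
    if isinstance(value, str):
        return int(value, 16)
    return int(value)


def gas_report(receipts: dict) -> list:
    """Build one aligned text line per labelled receipt, in insertion order."""
    lines = []
    for label, receipt in receipts.items():
        lines.append(f"{label:<24}{gas_used(receipt):>12}")
    return lines


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    sample = {
        "deploy": {"gasUsed": 53000},
        "transfer": {"gasUsed": "0x5208"},
    }
    print("\n".join(gas_report(sample)))
```

The same two helpers would cover deployment and ordinary calls, since both yield a receipt.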

g-r-a-n-t commented 3 years ago

I think we could do something as simple as adding a map field to ContractHarness for this. When the contract is deployed or whenever a function is called, we would add an entry to record the gas used.

The contents of these maps could be written out to tracked files (bench/erc20_token.yaml for example) and would contain the function called, what parameters were provided, and the gas usage. It would be nice to see the diffs on these files in each PR.
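A rough sketch of what such a recorder and its output file could look like. This is not the real ContractHarness (which lives in the Rust test code); `GasRecorder`, its methods, and the YAML layout are all hypothetical. Entries are written in call order with a fixed layout so that diffs between PRs stay small.

```python
import json
from dataclasses import dataclass, field


@dataclass
class GasRecorder:
    """Hypothetical stand-in for a gas-recording field on a test harness."""

    entries: list = field(default_factory=list)

    def record(self, function: str, params, gas_used: int) -> None:
        """Remember one deployment or function call and the gas it used."""
        self.entries.append(
            {"function": function, "params": list(params), "gas_used": gas_used}
        )

    def to_yaml(self) -> str:
        """Render the entries as a simple YAML list (layout is an assumption).

        Parameters are emitted with json.dumps, which produces values that
        are also valid inside a YAML flow sequence.
        """
        lines = []
        for entry in self.entries:
            params = ", ".join(json.dumps(p) for p in entry["params"])
            lines.append(f"- function: {entry['function']}")
            lines.append(f"  params: [{params}]")
            lines.append(f"  gas_used: {entry['gas_used']}")
        return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    recorder = GasRecorder()
    # Illustrative numbers only, not real measurements.
    recorder.record("deploy", [], 53000)
    recorder.record("transfer", ["0xabc", 100], 35000)
    print(recorder.to_yaml(), end="")
```

Writing the result of `to_yaml()` to a tracked file after each test run would give the per-PR diff described above.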

cburgdorf commented 3 years ago

When we do the contract differential fuzzing, we could also record the gas used for Solidity / Fe. We could then configure a threshold for the maximum gas premium that the Fe code is allowed to consume (e.g. --max-percentage-gas-premium 20), and every time the fuzzer finds a call that takes too much gas (above the threshold) we would report that, too. This seems to me to be an efficient double benefit for the fuzzer. :upside_down_face:
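The threshold check itself could be as small as the sketch below. It assumes the fuzzer hands over, for each call, the gas used by the Fe contract and by the Solidity reference; the function names and the tuple layout are invented for illustration, and "premium" is taken to mean the percentage by which Fe exceeds the reference.

```python
def gas_premium_percent(fe_gas: int, reference_gas: int) -> float:
    """Percentage by which fe_gas exceeds reference_gas (negative if cheaper)."""
    if reference_gas <= 0:
        raise ValueError("reference_gas must be positive")
    return (fe_gas - reference_gas) * 100.0 / reference_gas


def calls_over_threshold(calls, max_premium_percent: float) -> list:
    """Return (name, premium) for every call strictly above the threshold.

    calls -- iterable of (name, fe_gas, reference_gas) tuples (assumed shape)
    """
    offenders = []
    for name, fe_gas, reference_gas in calls:
        premium = gas_premium_percent(fe_gas, reference_gas)
        if premium > max_premium_percent:
            offenders.append((name, premium))
    return offenders


if __name__ == "__main__":
    # Illustrative numbers only, not real measurements.
    observed = [("transfer", 130, 100), ("approve", 110, 100)]
    for name, premium in calls_over_threshold(observed, 20):
        print(f"{name}: {premium:.1f}% over the reference")
```

A call exactly at the threshold is not reported here; whether the limit should be inclusive is a design choice left open.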