ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.35k stars 5.77k forks source link

Extract test cases (from SoldityEndToEndTest) #3486

Closed chriseth closed 3 years ago

chriseth commented 6 years ago

Often a big source of annoyance is that recompiling the gigantic test source files we have takes quite a bit of time. Most of the tests, though, have a very similar structure:

In the EndToEnd-tests, a contract is given, it is compiled, deployed and then multiple functions are called on the contract with multiple arguments, yielding certain expected outputs.

In the NameAndType tests, a contract is given, is compiled until the type checking phase and then an expected list of warnings and errors (or none) is checked.

Both types do not really need to be .cpp files with their own logic. Most of the tests can be specified by just a list of strings. If this list is an external file, no recompilation is needed if just the test expectations are adjusted.

We can even go further than that: We might have an interactive test runner, that just asks the user to automatically correct test expectations if they fail, on a test-by-test basis: It displays the source, the inptus (for the case of EndToEnd tests), the actual values and the expected values and waits for a y/n response to adjust the values.

The only problem here might be the encoding of the inputs and outputs of the EndToEnd tests. For readability reasons, we do not want them to be fully hex encoded, so the file format has to be able to support some kind of flexibility there. We might start with an easy version, just supporting decimals and hex numbers (if auto-generated, we might want to check if the hex version ends with many zeros or fs and only then choose hex) and extract all test cases that have such simple inputs and outputs.

I would propose to use a simple separator-based expectation file format (not yaml or json because it could create problems with escaping and also indentation is always weird):

TestName
contract {
 // source until separator
}
=====
f(uint,bytes32): 0x123000, 456 -> 123, true
g(string): "abc" -> X
=====
NextTest
// ...

The X signifies that a revert is expected.

federicobond commented 6 years ago

This is a great idea. I did some research on the topic and I saw that LLVM uses a variant of literate programming in their test suite that looks pretty good.

They embed special comments within the source that tell the compiler how they should test the file and which assertions they should enforce. While we could combine arbitrary RUN pragmas with a tool like seth to execute them, a good middle-of-the-road solution is to start with a CALL assertion for what you describe above and progressively add support for things like EXPECT_WARN, EXPECT_ERROR, etc.

// test_name.sol
// CALL: f(uint,bytes32): 0x123000, 456 -> 123, true
// CALL: g(string): "abc" -> X

contract test {
 // ...
}

This way test sources remain compilable standalone and the format has more flexibility to grow.

Update: the second CALL could use a different pragma to be more clear, like REVERT.

chriseth commented 6 years ago

Hm, I actually did not think that far, but of course we could use one file per test. This would also make it easier to parallelize tests with tools like circleci.

If the expectations and "commands" are in comments, each single file can be compiled easily as a first check, even without a test runner.

We could even define things like

// DEPLOY: C(constructorArg1, constructoArg2)
// EXPECT_LOG: ...

The 99% use case of just deploying the first (or last?) contract in the file and running function on that address should not need an explicit DEPLOY command, though.

I'm wondering if we get too many files that way or if it would be too hard to properly find test cases.

federicobond commented 6 years ago

Yes, agree on implicitly deploying the last contract if it has a zero-argument constructor (usually when defining an inheritance hierarchy it's the last one that you want to test).

I think the fuzzy matching built into many tools makes it easier to find test cases spread across multiple files than what we have right now. You can see an example of how it looks in these tests I extracted a while ago.

chriseth commented 6 years ago

Are you just interested in this or would you even like to work on it?

aarlt commented 6 years ago

I really like that idea! But I think that I probably didn't fully understand what you really mean with "extraction" here. After reading I thought about integrated unit tests. So additionally to the contract code, executable test-code can be defined, that is able to interact with the corresponding contract code. Probably you just mean that somehow current tests could be extracted to such testing environment - so we will save compilation time.

I would somehow try to create a test execution framework that is able to parse the specific test comments, execute and compare the actual results with expectations. I could think of a special test mode in solc that is able to execute such integrated tests and interact with test-nodes via rpc.

What do you think? Did you mean such thing?

aarlt commented 6 years ago

@chriseth now I understood the extraction part - I just saw test/libsolidity/SolidityEndToEndTest.cpp with over 10.000 lines of code :D - I think that I have a nice idea for that - I will let you know.

federicobond commented 6 years ago

@chriseth I'm interested, but I won't have time to work on it at the moment.

axic commented 6 years ago

@ehildenb uses pandoc to extract code from Markdown formatted files in the k-framework (kevm, kwasm). It may be a good option to use that since we could have more detailed descriptions around the test cases.

@ehildenb do you have a look to the pandoc script?

chriseth commented 6 years ago

I would really be careful not to overcomplicate this. Do we really need detailed descriptions formatted in html? I agree that we need a way to show descriptions, but just non-formatted comments should be more than fine. We don't need latex formulas or other stuff like kevm might need.

ehildenb commented 6 years ago

@chriseth the descriptions are just plain text (markdown, or any format supported by pandoc).

I don't think I have enough experience with the test-case filling to speak about whether pandoc-tangle is the correct fit (which is what we use for KEVM/KWASM), but I can at least explain what's possible with pandoc-tangle which might help out here.

First, it allows having text-explanation around the code (which then renders nicely in Github by default, Github understands Markdown). This is nice for repository presentation (which is one of the main reasons I like it for the K* repos which are supposed to be human-readable semantics).

Second, it allows "tangling" the source document to produce the target document. For example, there may be a chunk of code you want to show up as a prefix to all the generated files, and then have other specific code-blocks be included after the generic part. This is a fairly straight-forward one-liner with pandoc-tangle.

To clarify confusion, pandoc is the program that does most of the heavy lifting, but it accepts Lua-script configurations as the "target" writer which direct how to generate the output. pandoc-tangle is a Lua-script that does the "tangling" and lets you specify which code-blocks to include in the output of each document.

Anyway, let me know if you want more information on it/think it might be a good fit, I can help to get it setup.

federicobond commented 6 years ago

I'm with @chriseth on this. Non-formatted comments should be more than enough for this use case, and makes it easier to point solc or any other tool to the file.

aarlt commented 6 years ago

I currently implementing a generic test framework for solidity. A first functional prototype will be ready during the next days. Basically I wrote a small interpreter for a subset of solidity. It's a small scripting language that take advantage of all the stuff already in-place: code checking, parsing & creation of the AST. At first I thought it would be nice to have a test-mode inside solc - but after writing reporting classes I realised it would make more sense to just use boost::test. So we can nicely generate standard xunit-test xml reports.

So basically it is possible to define for a contract Contract, a file that is named Contract.soltest. The structure is like that:

{test case name}
    <SOLTEST-CODE>
...
{test case name}
    <SOLTEST-CODE>

Currently it supports simple assignments & arithmetic expressions, and assertions with assert.

{simple test}
    uint32 one = 1;
    uint32 two = 2;
    uint32 three = two + one;
    assert(1 + three == 1 + one + three - 1);
    assert(true);

{another test}
   assert(true);

Hopefully during the next days, it will be possible to really interact with running nodes: create contract instances and interact with contract functions:

{simple test}
    Contract contract = new Contract(2);
    assert(contract.doSomething(1) == true);

End-to-end tests can be executed with test/soltest --run_test=EndToEnd -- --ipcpath /tmp/ipc test/scripting/contracts/SimpleAssignment.sol. soltest will basically iterate over all defined .sol files, extract the defined contracts, and will search for files that have the name of the <contract name>+ ".soltest", and will finally execute all the defined test-cases.

Internally it will generate a test contract, just for the AST creation. The interpreter will then just walk through the AST, where it will take care of the interaction with the real test node - also that code seem to be already in place (test/libsolidity/SolidityExecutionFramework.h).

Per soltest-testcase a boost::test test-case will be generated (at runtime) during the code analysis of the soltest file. Additionally to that boost::test test-cases are generated for the parse, analyze and compile steps of the contracts that need to be tested. If assertions fail, the framework is able to find out the correct location in the soltest file.

I could imagine, that it would also be very nice to add a new natspec tag for testing:

pragma solidity ^0.4.0;
/// @test {setup}
///      SimpleAssignment s = new SimpleAssignment(42);
/// @test {check add}
///     assert(s.add(1, 2) == 3);
///     assert(s.add(2, 3) == 5);
contract SimpleAssignment {
    function SimpleAssignment(uint a) public {
    }
    function add(uint a, uint b) public pure returns (uint sum)  {
        return a + b;
    }
}

There are also two special "test-cases", setup & teardown: setup code will be just prepended to the test-case functions, where teardown will be appended.

Additionally to that, the framework will provide a soltest object, that can be used to setup specific test scenarios.

pragma solidity ^0.4.0;
/// @test {setup}
///      soltest.setChainParams(..);
///      soltest.mineBlocks(..);
///      soltest.modifyTimestamp(..);
///      soltest.addBlock(..);
///      soltest.rewindToBlock(..);
/// @test {check add}
///     SimpleAssignment s = new SimpleAssignment(42)
///     assert(s.add(1, 2) == 3);
///     assert(s.add(2, 3) == 5);
contract SimpleAssignment {
    function SimpleAssignment(uint a) public {
    }
    function add(uint a, uint b) public pure returns (uint sum)  {
        return a + b;
    }
}

As far as I know, setChainParams, mineBlocks, modifyTimestamp, addBlock and rewindToBlock are only implemented in eth, where geth and parity providing different functions for testing. I could imagine, that the soltest object could provide a standardised interface to enable access to specific testing functions of different node implementations.

First code can be found here: https://github.com/aarlt/solidity/tree/soltest-interpreter.

Its the first code that I wrote that really interacts with internal solidity apis - so there is a lot that need to be improved - so just ignore all the hacky hacks :)

What are your opinions?

aarlt commented 6 years ago

No opinions?

I just finished a very basic prototype. It can now interact with contracts on chain - currently i use geth v1.8.1-stable in development mode with geth --dev --rpcapi=personal. I also added a new soltest method setAccount(string) that can be used to simulate contract calls from different users. These accounts will automatically preloaded with 1 ether from the "master" account during the first call of soltest.setAccount(..). The "master" account is eth.accounts[0].

I recorded a small demo that is somehow trying to capture the features currently implemented.

Alt Text

The next thing that I want to do is to implement @test natspec support - so only one file is needed - the solidity file with @test natspecs.

It would be nice to get some feedback.

@chriseth @axic What do you think?

Current code can be found here: https://github.com/aarlt/solidity/tree/soltest-interpreter

I guess the SimpleOwner.sol file was not shown long enough, so here it is:

pragma solidity ^0.4.0;

contract SimpleOwner {
    address owner;
    function SimpleOwner() public {
        owner = msg.sender;
    }
    function add(uint16 a, uint16 b) public returns (uint16 sum)  {
        if (msg.sender == owner) {
            return a + b;
        } else {
            return 0;
        }
    }
}
federicobond commented 6 years ago

I think it's an interesting tool for some use cases but I would try to avoid overengineering this at all costs.

Test cases should be readable with a plain Solidity parser without any preprocessing. NatSpec was not designed for this use case and would probably add unnecessary clutter to the main codebase. Instead, I propose a minimally invasive test vocabulary using prefixed comments that would are only expected to be understood by the test executor and can be easily ignored by the rest of the tools.

chriseth commented 6 years ago

@aarlt what you are working on in certainly very useful, but I would say it is separate from this issue here. One idea behind extracting the test cases and expectations is that you can update the expectations automatically. Also, the test framework should be as non-invasive as possible to be sure that the correct piece of software is tested.

aarlt commented 6 years ago

@federicobond @chriseth Thank you very much for your feedback!

@chriseth Ok, looks like that I didn't understood the idea, especially not the part of updating expectations automatically. I thought somehow it's possible to cover these tests with such framework. However, the whole testing thing is really the perfect vehicle to learn some stuff in more detail. So thank you for the inspiration :)

axic commented 3 years ago

I forgot this issue was still open, because of the two previous big steps (#3644, #4223). There are still certain features missing: #10426 and #10427, at least.

axic commented 3 years ago

Tracking in #12253.