Closed chriseth closed 3 years ago
This is a great idea. I did some research on the topic and I saw that LLVM uses a variant of literate programming in their test suite that looks pretty good.
They embed special comments within the source that tell the compiler how they should test the file and which assertions they should enforce. While we could combine arbitrary RUN pragmas with a tool like seth to execute them, a good middle-of-the-road solution is to start with a CALL assertion for what you describe above and progressively add support for things like EXPECT_WARN, EXPECT_ERROR, etc.
// test_name.sol
// CALL: f(uint,bytes32): 0x123000, 456 -> 123, true
// CALL: g(string): "abc" -> X
contract test {
// ...
}
This way test sources remain compilable standalone and the format has more flexibility to grow.
Update: the second CALL could use a different pragma to be more clear, like REVERT.
Hm, I actually did not think that far, but of course we could use one file per test. This would also make it easier to parallelize tests with tools like circleci.
If the expectations and "commands" are in comments, each single file can be compiled easily as a first check, even without a test runner.
We could even define things like
// DEPLOY: C(constructorArg1, constructoArg2)
// EXPECT_LOG: ...
The 99% use case of just deploying the first (or last?) contract in the file and running function on that address should not need an explicit DEPLOY command, though.
I'm wondering if we get too many files that way or if it would be too hard to properly find test cases.
Yes, agree on implicitly deploying the last contract if it has a zero-argument constructor (usually when defining an inheritance hierarchy it's the last one that you want to test).
I think the fuzzy matching built into many tools makes it easier to find test cases spread across multiple files than what we have right now. You can see an example of how it looks in these tests I extracted a while ago.
Are you just interested in this or would you even like to work on it?
I really like that idea! But I think that I probably didn't fully understand what you really mean with "extraction" here. After reading I thought about integrated unit tests. So additionally to the contract code, executable test-code can be defined, that is able to interact with the corresponding contract code. Probably you just mean that somehow current tests could be extracted to such testing environment - so we will save compilation time.
I would somehow try to create a test execution framework that is able to parse the specific test comments, execute and compare the actual results with expectations. I could think of a special test mode in solc
that is able to execute such integrated tests and interact with test-nodes via rpc.
What do you think? Did you mean such thing?
@chriseth now I understood the extraction part - I just saw test/libsolidity/SolidityEndToEndTest.cpp
with over 10.000 lines of code :D - I think that I have a nice idea for that - I will let you know.
@chriseth I'm interested, but I won't have time to work on it at the moment.
@ehildenb uses pandoc to extract code from Markdown formatted files in the k-framework (kevm, kwasm). It may be a good option to use that since we could have more detailed descriptions around the test cases.
@ehildenb do you have a look to the pandoc script?
I would really be careful not to overcomplicate this. Do we really need detailed descriptions formatted in html? I agree that we need a way to show descriptions, but just non-formatted comments should be more than fine. We don't need latex formulas or other stuff like kevm might need.
@chriseth the descriptions are just plain text (markdown, or any format supported by pandoc
).
I don't think I have enough experience with the test-case filling to speak about whether pandoc-tangle
is the correct fit (which is what we use for KEVM/KWASM), but I can at least explain what's possible with pandoc-tangle
which might help out here.
First, it allows having text-explanation around the code (which then renders nicely in Github by default, Github understands Markdown). This is nice for repository presentation (which is one of the main reasons I like it for the K* repos which are supposed to be human-readable semantics).
Second, it allows "tangling" the source document to produce the target document. For example, there may be a chunk of code you want to show up as a prefix to all the generated files, and then have other specific code-blocks be included after the generic part. This is a fairly straight-forward one-liner with pandoc-tangle
.
To clarify confusion, pandoc
is the program that does most of the heavy lifting, but it accepts Lua-script configurations as the "target" writer which direct how to generate the output. pandoc-tangle
is a Lua-script that does the "tangling" and lets you specify which code-blocks to include in the output of each document.
Anyway, let me know if you want more information on it/think it might be a good fit, I can help to get it setup.
I'm with @chriseth on this. Non-formatted comments should be more than enough for this use case, and makes it easier to point solc
or any other tool to the file.
I currently implementing a generic test framework for solidity. A first functional prototype will be ready during the next days. Basically I wrote a small interpreter for a subset of solidity
. It's a small scripting language that take advantage of all the stuff already in-place: code checking, parsing & creation of the AST. At first I thought it would be nice to have a test-mode inside solc
- but after writing reporting classes I realised it would make more sense to just use boost::test
. So we can nicely generate standard xunit-test xml
reports.
So basically it is possible to define for a contract Contract
, a file that is named Contract.soltest
. The structure is like that:
{test case name}
<SOLTEST-CODE>
...
{test case name}
<SOLTEST-CODE>
Currently it supports simple assignments & arithmetic expressions, and assertions with assert
.
{simple test}
uint32 one = 1;
uint32 two = 2;
uint32 three = two + one;
assert(1 + three == 1 + one + three - 1);
assert(true);
{another test}
assert(true);
Hopefully during the next days, it will be possible to really interact with running nodes: create contract instances and interact with contract functions:
{simple test}
Contract contract = new Contract(2);
assert(contract.doSomething(1) == true);
End-to-end tests can be executed with test/soltest --run_test=EndToEnd -- --ipcpath /tmp/ipc test/scripting/contracts/SimpleAssignment.sol
. soltest
will basically iterate over all defined .sol
files, extract the defined contracts, and will search for files that have the name of the <contract name>+ ".soltest"
, and will finally execute all the defined test-cases.
Internally it will generate a test contract, just for the AST creation. The interpreter will then just walk through the AST, where it will take care of the interaction with the real test node - also that code seem to be already in place (test/libsolidity/SolidityExecutionFramework.h
).
Per soltest-testcase
a boost::test
test-case will be generated (at runtime) during the code analysis of the soltest
file. Additionally to that boost::test
test-cases are generated for the parse
, analyze
and compile
steps of the contracts that need to be tested. If assertions fail, the framework is able to find out the correct location in the soltest
file.
I could imagine, that it would also be very nice to add a new natspec tag
for testing:
pragma solidity ^0.4.0;
/// @test {setup}
/// SimpleAssignment s = new SimpleAssignment(42);
/// @test {check add}
/// assert(s.add(1, 2) == 3);
/// assert(s.add(2, 3) == 5);
contract SimpleAssignment {
function SimpleAssignment(uint a) public {
}
function add(uint a, uint b) public pure returns (uint sum) {
return a + b;
}
}
There are also two special "test-cases", setup
& teardown
: setup
code will be just prepended to the test-case functions, where teardown
will be appended.
Additionally to that, the framework will provide a soltest
object, that can be used to setup specific test scenarios.
pragma solidity ^0.4.0;
/// @test {setup}
/// soltest.setChainParams(..);
/// soltest.mineBlocks(..);
/// soltest.modifyTimestamp(..);
/// soltest.addBlock(..);
/// soltest.rewindToBlock(..);
/// @test {check add}
/// SimpleAssignment s = new SimpleAssignment(42)
/// assert(s.add(1, 2) == 3);
/// assert(s.add(2, 3) == 5);
contract SimpleAssignment {
function SimpleAssignment(uint a) public {
}
function add(uint a, uint b) public pure returns (uint sum) {
return a + b;
}
}
As far as I know, setChainParams
, mineBlocks
, modifyTimestamp
, addBlock
and rewindToBlock
are only implemented in eth
, where geth
and parity
providing different functions for testing. I could imagine, that the soltest
object could provide a standardised interface to enable access to specific testing functions of different node implementations.
First code can be found here: https://github.com/aarlt/solidity/tree/soltest-interpreter.
Its the first code that I wrote that really interacts with internal solidity apis - so there is a lot that need to be improved - so just ignore all the hacky hacks :)
What are your opinions?
No opinions?
I just finished a very basic prototype. It can now interact with contracts on chain - currently i use geth
v1.8.1-stable in development mode with geth --dev --rpcapi=personal
. I also added a new soltest
method setAccount(string)
that can be used to simulate contract calls from different users. These accounts will automatically preloaded with 1 ether from the "master" account during the first call of soltest.setAccount(..)
. The "master" account is eth.accounts[0]
.
I recorded a small demo that is somehow trying to capture the features currently implemented.
The next thing that I want to do is to implement @test
natspec support - so only one file is needed - the solidity file with @test
natspecs.
It would be nice to get some feedback.
@chriseth @axic What do you think?
Current code can be found here: https://github.com/aarlt/solidity/tree/soltest-interpreter
I guess the SimpleOwner.sol
file was not shown long enough, so here it is:
pragma solidity ^0.4.0;
contract SimpleOwner {
address owner;
function SimpleOwner() public {
owner = msg.sender;
}
function add(uint16 a, uint16 b) public returns (uint16 sum) {
if (msg.sender == owner) {
return a + b;
} else {
return 0;
}
}
}
I think it's an interesting tool for some use cases but I would try to avoid overengineering this at all costs.
Test cases should be readable with a plain Solidity parser without any preprocessing. NatSpec was not designed for this use case and would probably add unnecessary clutter to the main codebase. Instead, I propose a minimally invasive test vocabulary using prefixed comments that would are only expected to be understood by the test executor and can be easily ignored by the rest of the tools.
@aarlt what you are working on in certainly very useful, but I would say it is separate from this issue here. One idea behind extracting the test cases and expectations is that you can update the expectations automatically. Also, the test framework should be as non-invasive as possible to be sure that the correct piece of software is tested.
@federicobond @chriseth Thank you very much for your feedback!
@chriseth Ok, looks like that I didn't understood the idea, especially not the part of updating expectations automatically. I thought somehow it's possible to cover these tests with such framework. However, the whole testing thing is really the perfect vehicle to learn some stuff in more detail. So thank you for the inspiration :)
I forgot this issue was still open, because of the two previous big steps (#3644, #4223). There are still certain features missing: #10426 and #10427, at least.
Tracking in #12253.
Often a big source of annoyance is that recompiling the gigantic test source files we have takes quite a bit of time. Most of the tests, though, have a very similar structure:
In the EndToEnd-tests, a contract is given, it is compiled, deployed and then multiple functions are called on the contract with multiple arguments, yielding certain expected outputs.
In the NameAndType tests, a contract is given, is compiled until the type checking phase and then an expected list of warnings and errors (or none) is checked.
Both types do not really need to be .cpp files with their own logic. Most of the tests can be specified by just a list of strings. If this list is an external file, no recompilation is needed if just the test expectations are adjusted.
We can even go further than that: We might have an interactive test runner, that just asks the user to automatically correct test expectations if they fail, on a test-by-test basis: It displays the source, the inptus (for the case of EndToEnd tests), the actual values and the expected values and waits for a y/n response to adjust the values.
The only problem here might be the encoding of the inputs and outputs of the EndToEnd tests. For readability reasons, we do not want them to be fully hex encoded, so the file format has to be able to support some kind of flexibility there. We might start with an easy version, just supporting decimals and hex numbers (if auto-generated, we might want to check if the hex version ends with many zeros or
f
s and only then choose hex) and extract all test cases that have such simple inputs and outputs.I would propose to use a simple separator-based expectation file format (not yaml or json because it could create problems with escaping and also indentation is always weird):
The
X
signifies that arevert
is expected.