feature request: validate that test architecture matches what's expected

mds1 commented 10 months ago

Currently bulloak lets you go from spec (.tree file) to tests, and a really valuable feature would be to verify that the test architecture matches what's expected based on the tests. This is important to ensure that the spec stays up to date as tests change, for example by running bulloak check in CI which would exit with an error if they don't match.

Initial thoughts on what's considered a match:

The tree file and test file should be in the same directory
Their filenames can be anything as long as they match, e.g. MyContract.foo.tree and MyContract.foo.t.sol are the spec and test for the foo method.
All modifier names and test names must be present
The ordering of all modifiers and tests must match what's in the tree file
Modifier and test function bodies do not need to match what bulloak generates (solidity verifies that modifier's have the _;)
Any number of helper functions (functions that don't start with test, invariant, or statefulFuzz) are allowed in any order

One open question is how to handle fuzz/invariant tests here. Bulloak only generates concrete tests which I agree with. Then what I like to do is generate fuzz tests when applicable. For example given this spec:

foo.sol
 └── when stuff called
    └── it should revert

bulloak would generate:

pragma solidity [VERSION];

contract FooTest {
  modifier whenStuffCalled() {
    _;
  }

  function test_RevertWhen_StuffCalled() external whenStuffCalled {
    // it should revert
  }
}

And it's useful to be able to modify this to:

pragma solidity [VERSION];

contract FooTest {
  modifier whenStuffCalled() {
    _;
  }

  function test_RevertWhen_StuffCalled() external whenStuffCalled
  {
    // The concrete test calls the fuzz test with a hardcoded value
    test_RevertWhen_StuffCalled(100);
  }

  function testFuzz_RevertWhen_StuffCalled(uint256 x) external whenStuffCalled
  {
    // then the test logic goes here
  }
}

So perhaps the rule here should be that for fuzz tests, we require that the fuzz test name matches one of the bulloak generated names with the exception of the added Fuzz prefix, per the foundry best practices.

Stateful fuzz (invariant) tests are a bit more complex, and perhaps should have their own spec file? Curious to get @PaulRBerg thoughts here too

alexfertel commented 10 months ago

and a really valuable feature would be to verify that the test architecture matches what's expected based on the tests.

Yes! I agree with you. I noticed that there was a test with no corresponding .tree branch the other day (https://github.com/alexfertel/bulloak/discussions/7#discussioncomment-6804614), which made me think that bulloak might help here as well.

Their filenames can be anything as long as they match, e.g. MyContract.foo.tree and MyContract.foo.t.sol are the spec and test for the foo method.

I think this is a bit more nuanced. Currently bulloak emits to the filename that's at the top of the tree. For this tree

foo.sol
 └── when stuff called
    └── it should revert

The contract would be written to a file foo.sol in the same directory. I think we could probably be more strict here, wdyt?

All modifier names and test names must be present

This might be too strict. Consider the following tree, which is an excerpt from a tree from Sablier:

deep.sol
├── when stuff called
│  └── it should revert
└── when not stuff called
   ├── when the deposit amount is zero
   │  └── it should revert
   └── when the deposit amount is not zero
      ├── when the number count is zero
      │  └── it should revert
      ├── when the asset is not a contract
      │  └── it should revert
      └── given the asset is a contract
          ├── when the asset misses the ERC-20 return value
          │  ├── it should create the child
          │  ├── it should perform the ERC-20 transfers
          │  └── it should emit a {MultipleChildren} event
          └── when the asset does not miss the ERC-20 return value
              ├── it should create the child
              └── it should emit a {MultipleChildren} event

For the when the asset does not miss the ERC-20 return value branch, the emitted test name is rather verbose. A solution might be to introduce new syntax for test names, like what #3 proposes, however I wonder if the .tree file won't get too dense (Since the purpose of the tree in the first place is to clearly convey the spec.

Another problem with this is that we would be constraining what people can write in the condition. Since they have to write characters that can be converted to valid solidity identifiers, a branch like when it emitted a {MultipleChildren} event cannot exist in the spec.

Otherwise this looks good to me! I'm happy to add this, and indeed bulloak check seems like a good name 👍🏻 .

One open question is how to handle fuzz/invariant tests here. Bulloak only generates concrete tests which I agree with. Then what I like to do is generate fuzz tests when applicable.

Hmmm, this might be a good topic for discussion, so perhaps we should open one for it?

mds1 commented 10 months ago

I think this is a bit more nuanced. Currently bulloak emits to the filename that's at the top of the tree. For this tree ...snip... The contract would be written to a file foo.sol in the same directory. I think we could probably be more strict here, wdyt?

Hmm, I do think implementing #3 first is probably the way to go, as that might inform the rules here a bit (e.g. around whether you have one or multiple specs in a single .tree file)

Also, tangential, but -w feature seems to not work for me? I create test/deep.tree with those contents and got

bulloak -w ./**/*.tree     
WARN: Skipped emitting to "./test/deep.tree".
      The file "./test/deep.sol" already exists.

but ./test/deep.sol does not exist.

For the when the asset does not miss the ERC-20 return value branch, the emitted test name is rather verbose.

Personally I think this is ok and I don't mind the long test names, and this one doesn't seem too bad:

function test_WhenTheAssetDoesNotMissTheERC_20ReturnValue()
  external
  whenNotStuffCalled
  whenTheDepositAmountIsNotZero
  givenTheAssetIsAContract
  whenTheAssetDoesNotMissTheERC_20ReturnValue
{
  // it should create the child
  // it should emit a {MultipleChildren} event
}

I agree with your concern about adding to much info to the tre would hinder readability, so perhaps worth leaving the long test names for now?

Another problem with this is that we would be constraining what people can write in the condition. Since they have to write characters that can be converted to valid solidity identifiers, a branch like when it emitted a {MultipleChildren} event cannot exist in the spec.

This seems ok to me, as unsupported characters can just be stripped or converted, e.g. I think stripping out brackets is reasonable.

alexfertel commented 10 months ago

Also, tangential, but -w feature seems to not work for me? I create test/deep.tree with those contents and got

Oh? That sounds like a bug. I'll check it out.

Personally I think this is ok and I don't mind the long test names, and this one doesn't seem too bad:

I agree and I think we can be opinionated here.

This seems ok to me, as unsupported characters can just be stripped or converted, e.g. I think stripping out brackets is reasonable.

Yeap, I think it's reasonable 👍🏻

alexfertel commented 10 months ago

Yeah, there was a bug in v0.4.0, it should be fixed in v0.4.1.

PaulRBerg commented 10 months ago

All good suggestions for the matching tool, @mds1.

I agree with @alexfertel's feedback on the strictness of matching the modifier and test function names.
Happy to drop unsupported characters like { and } to make the implementation simpler
I would prefer the tool to be less restrictive and allow any additional tests on top of what is defined in the tree file. It should still be possible to look at the order of the tree-related tests while ignoring the added-on stuff. The reason for this is general developer UX. It is often the case that I wrote a dummy test function just to check if something works as expected.

Stateful fuzz (invariant) tests are a bit more complex, and perhaps should have their own spec file?

Indeed. It is an open-ended question if BTT can be used for invariants. We have never applied BTT in the Sablier invariants.

alexfertel commented 9 months ago

Ok, implemented bulloak check 🥳 . Please, @mds1 let me know what you think of it and if you have any immediate feedback.

mds1 commented 9 months ago

Amazing! I should be able to test this in the next day or two

alexfertel commented 4 months ago

Closing since this has been in production for quite a while!

alexfertel / bulloak

feature request: validate that test architecture matches what's expected #19