czan / stateful-check

Stateful generative testing in clojure
MIT License
117 stars 11 forks source link

[Feature request] Add exhaustive mode #18

Closed gabrielgiussi closed 6 months ago

gabrielgiussi commented 1 year ago

The problem

I recently did a pretty large refactor in a codebase that was using already ~20 commands and I was running the specification-correct? assertion while I was refactoring to see if I had all the commands passing the postconditions as expected. At some point the assertion was passing and returning the list of a bunch of commands with their corresponding frequency so everything seemed fine, however the issue was that some commands were not being executed at all, because the require function wasn't being satisfied, but this was not one of the main commands and only one among 20 was unnoticed at first.

Even without a large refactor, a command could stop running and go unnoticed because we usually don't check the cicd jobs output when everything is passing.

A proposal

I think it would be useful to add an exhaustive option to the specification-correct? assertion that will make it fail if some of the commands was not executed. This flag could be turned on locally during refactors to check that every command is being executed or even use it in continuos integration.

Besides that, printing the commands that were not executed (aka frequency 0) even when the test passes and exhaustive is false, could be useful. So you can have at least some sort of "warning" that you are not running all the commands that you've specified but you might be fine with that (and that's why you don't want to turn on the exhaustive flag).

What do you think about this proposal? Does it make sense to you? I could implement it if you think is useful, I've already did a local implementation and helped me to discover broken require functions during a refactor.

Note: My implementation only modified the report-result but if might be misleading to have result being true and then make the assertion fail. Also it is probably hardest to test at this level.

czan commented 1 year ago

Hi Gabriel,

On Wed, Jul 05 2023, Gabriel Giussi wrote:

At some point the assertion was passing and returning the list of a bunch of commands with their corresponding frequency so everything seemed fine, however the issue was that some commands were not being executed at all, because the require function wasn't being satisfied, but this was not one of the main commands and only one among 20 was unnoticed at first.

Ouch! This is exactly the reason I added the command frequencies report option, but it's obvious that it's not well suited to automated tests running in CI.

I think it would be useful to add an exhaustive option to the specification-correct? assertion that will make it fail if some of the commands was not executed.

I agree! That would provide an immediate solution to this problem, but I wonder if adding another boolean option is the best way to go here.

The other option would be to provide a "command-counts-assertion" option (maybe with a better name) which gets called at the end of a test run. We could provide an "all-commands-ran" assertion, to provide the check you're looking for, while also leaving it open to having other assertions in future.

That said, I'm not sure what other assertions would be useful, so maybe that's just over-engineering. 🤷

Besides that, printing the commands that were not executed (aka frequency 0) even when the test passes and exhaustive is false, could be useful.

Yeah, I can see the value in this. I'm wary of adding too much noise to the output, but a note at the end with the names of the commands would probably be helpful.

I could implement it if you think is useful, I've already did a local implementation and helped me to discover broken require functions during a refactor.

If you're happy to implement it and raise a PR, I definitely think it would be useful.

czan commented 6 months ago

Hi @gabrielgiussi,

I've just merged a change from @r0man to add zero counts into the command frequency table. This isn't quite the same as what you want, but it's the bare minimum that you'd need to be able to implement some check in CI. In the worst case, a failing the build if grep finds | *0 *| might help with the initial problem you mentioned.

I've just cut a new release including this change as 0.4.4, so you should be able to use it without much trouble.

I'm going to close this issue, but I would still welcome a PR to add a way to assert on the command frequencies.