Property Based Testing & Specs

This is partly a suggestion for how we could handle a large part of the validation task. We may not be able to express every validation easily this way, so it is not necessarily a complete approach. However, it would unlock many other benefits too.

The proposal is that we add clojure.spec specs, which would serve the following purposes:

- Provide an executable specification of what valid input looks like.
- Input validation (as described above) with error reporting. The raw spec errors may be unclear to end users of the tool, but they are very precise, and the output could be humanized somewhat with expound, which could be enabled in the app (but not the library).
- Property Based Testing (widely regarded as a gold standard of testing). Essentially, the specs can be run in reverse to generate millions of “randomized” but specification-compliant examples, exercising far more code paths than anyone could hope to cover by hand through TDD or BDD approaches.
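To make the three purposes concrete, here is a minimal sketch of one spec doing all three jobs. The row shape (`:label`, `:sort-priority`) and the namespace name are made up for illustration and are not table2qb's actual schema. Generation and the property check assume org.clojure/test.check is on the classpath.

```clojure
(ns example.codelist-spec
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.gen.alpha :as gen]
            [clojure.test.check :as tc]
            [clojure.test.check.properties :as prop]))

;; Hypothetical shape of a single codelist row (illustrative keys only).
(s/def ::label (s/and string? not-empty))
(s/def ::sort-priority pos-int?)
(s/def ::codelist-row
  (s/keys :req-un [::label]
          :opt-un [::sort-priority]))

;; 1. Executable specification: is this value valid?
(s/valid? ::codelist-row {:label "Male" :sort-priority 1}) ;; => true
(s/valid? ::codelist-row {:label ""})                      ;; => false

;; 2. Validation with error reporting: a precise, if terse, explanation.
(println (s/explain-str ::codelist-row {:label ""}))

;; 3. Generation: run the spec "in reverse" to produce valid examples.
(gen/sample (s/gen ::codelist-row) 5)

;; A trivial property: every generated row satisfies the spec.
;; A real property would pass the row through the code under test instead.
(def generated-rows-are-valid
  (prop/for-all [row (s/gen ::codelist-row)]
    (s/valid? ::codelist-row row)))

(tc/quick-check 100 generated-rows-are-valid)
```

The same `::codelist-row` definition drives validation, error messages and test data, so the three cannot drift apart.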

This will also be a step towards being able to generate randomized but valid cube data for testing all aspects of an RDF/cube stack.

This approach may need to go deeper than just table2qb; for example, we may want the RDF specs to live in grafter.

I think this should be straightforward enough for the columns-config and for the codelist and component pipelines where table2qb sets the schema. How do you think it might work with the cube-pipeline, where columns-config defines a set of permissible columns (which are essentially all optional for any given table)? Indeed, would the generated data involve a random set of component-properties taken from the columns config?
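One possible way to approach the cube-pipeline question, purely as a sketch: treat a table's header as any non-empty subset of the columns the config permits, and let the generator pick that subset at random. The column titles below are invented placeholders rather than values from a real columns config, and this ignores any rule about which columns a cube must contain.

```clojure
(ns example.cube-header-spec
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.gen.alpha :as gen]))

;; Hypothetical titles standing in for the rows of a columns config.
(s/def ::column-title #{"Geography" "Date" "Measure" "Value" "Unit"})

;; A header is a non-empty set of permitted titles; every column is optional.
(s/def ::header
  (s/coll-of ::column-title :kind set? :min-count 1))

(s/valid? ::header #{"Date" "Value"})  ;; => true
(s/valid? ::header #{"Date" "Colour"}) ;; => false, "Colour" is not permitted

;; Each sample is a random subset of the permitted columns
;; (generation requires org.clojure/test.check on the classpath).
(gen/sample (s/gen ::header) 3)
```

If the set of titles were read from the columns config at load time, the spec would have to be registered dynamically, which is part of what would need exploring.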

I'd also be curious to know if we could build the specs from the existing config/ conventions or whether the user would need to provide more information to support this.

I'd suggest we start exploring this with a spec for the columns-config, codelist-csv and components-csv then take it from there...

Swirrl / table2qb

Property Based Testing & Specs #91