Open RickMoynihan opened 5 years ago
I think this should be straightforward enough for the columns-config and for the codelist and component pipelines where table2qb sets the schema. How do you think it might work with the cube-pipeline, where columns-config defines a set of permissible columns (which are essentially all optional for any given table)? Indeed, would the generated data involve a random set of component-properties taken from the columns config?
I'd also be curious to know if we could build the specs from the existing config/ conventions or whether the user would need to provide more information to support this.
I'd suggest we start exploring this with a spec for the columns-config, codelist-csv and components-csv then take it from there...
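As a starting point for that exploration, here is a minimal sketch of what a spec for a single columns-config row might look like. The field names (`:title`, `:name`, `:component-attachment`), the allowed attachment values, and the `invalid-rows` helper are assumptions made for illustration, not table2qb's actual schema or API.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Assumption: every row needs a non-blank title and name.
(s/def ::non-blank-string (s/and string? (complement str/blank?)))
(s/def ::title ::non-blank-string)
(s/def ::name ::non-blank-string)

;; Assumption: a column attaches to the cube as one of these component kinds.
(s/def ::component-attachment
  #{"qb:dimension" "qb:measure" "qb:attribute"})

;; One row of the columns-config, already parsed into a map with keyword keys.
(s/def ::column
  (s/keys :req-un [::title ::name]
          :opt-un [::component-attachment]))

;; Hypothetical helper: returns the rows that fail the spec.
(defn invalid-rows [rows]
  (remove #(s/valid? ::column %) rows))
```

Calling `(s/explain-data ::column row)` on a failing row would give a machine-readable description of the problem, which could feed validation messages. With `org.clojure/test.check` on the classpath, the same spec could in principle also drive `s/gen` to produce sample rows.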
This is partly a suggestion for how we should do a large part of the validation task. We may not be able to express every validation easily this way, so it is not necessarily a complete approach. However, it will unlock many other benefits too.
The proposal is that we should add clojure.spec specs, which will serve the following purposes:
This will also be a step towards being able to generate randomized but valid cube data for testing all aspects of an RDF/cube stack.
This approach may need to run deeper than just table2qb; for example, we may want the RDF specs to live in grafter.