yulqen / datamaps

MOVED: Command line interface to yulqen/bcompiler-engine. Used by DfT to collect and clean data using Excel spreadsheets.
https://git.sr.ht/~yulqen/datamaps
MIT License

Rules-based validation in datamaps #15

Closed yulqen closed 1 year ago

yulqen commented 3 years ago

I'm very pleased to see that you're totally revving up to the idea of this stuff, @banillie. It's where the real value of this software is going to come to the fore, I think.

Here are the rules I am tentatively considering at this stage:

More than happy to hear your thoughts and ideas.

Originally posted by @yulqen in https://github.com/yulqen/datamaps/issues/14#issuecomment-769150545

yulqen commented 3 years ago

Comment from @banillie from now-closed issue:

Hi Matt, I think the above pretty much summarises where we got to on the data validation side of things.

Only thing I can add now is regarding the EMPTY type and whether this should generate a FAIL if the value returned is in fact empty.

I think a good way to handle this is via the specification of NOTEMPTY. The user should have to place this in addition to the actual type of value expected, e.g. TEXT. The absence of a return value will then generate a FAIL. However, if NOTEMPTY is not specified then the assumption is that it's OK for the value to be empty. Maybe at this point the validation can flag the value as empty rather than fail it? This will be a useful way for the user to separate keys that really need to be returned from those that are nice to have but aren't going to cause major problems.

Relevant in this issue.

yulqen commented 3 years ago

Make the NOTEMPTY validation the priority. NOTEMPTY takes precedence; only then test for the other type, such as DATE.
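
To make the ordering concrete, here is a rough sketch of how the check could look. It is not the datamaps implementation: the names `Result`, `is_empty`, `matches_type` and `validate` are all hypothetical, the EMPTY outcome follows @banillie's "flag rather than fail" suggestion, and the TEXT and DATE checks are simplified assumptions about what those rules would test.

```python
from datetime import date, datetime
from enum import Enum


class Result(Enum):
    """Hypothetical validation outcomes."""
    PASS = "PASS"
    FAIL = "FAIL"
    EMPTY = "EMPTY"  # flagged for attention, but not a failure


def is_empty(value):
    """Treat None and blank strings as empty (an assumption)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def matches_type(value, type_rule):
    """Simplified stand-ins for the TEXT and DATE rules."""
    if type_rule == "TEXT":
        return isinstance(value, str)
    if type_rule == "DATE":
        return isinstance(value, (date, datetime))
    raise ValueError(f"Unknown type rule: {type_rule}")


def validate(value, type_rule, notempty=False):
    # NOTEMPTY takes precedence: emptiness is checked before any type rule.
    if is_empty(value):
        return Result.FAIL if notempty else Result.EMPTY
    # Only non-empty values reach the type check.
    return Result.PASS if matches_type(value, type_rule) else Result.FAIL
```

With this sketch, `validate("", "TEXT", notempty=True)` gives a FAIL, `validate(None, "DATE")` is only flagged as EMPTY, and a non-empty value that is not a date fails the DATE rule whether or not NOTEMPTY is set.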