Swirrl / table2qb

A generic pipeline for converting tabular data into RDF data cubes
Eclipse Public License 1.0

Validate input CSV files when reading. #102

Closed lkitching closed 4 years ago

lkitching commented 5 years ago

Issue #37 - Create a CSV parser that validates and transforms data rows according to a declarative specification. Use this parser for the input files of the codelist and components pipelines, and for the columns configuration file in the cube pipeline. The parser checks that all required columns are present in the input and that no unknown columns are specified.
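
For readers unfamiliar with the approach, here is a minimal sketch of a header check driven by a declarative column specification. It is written in Python purely for illustration and is not the table2qb implementation (which is Clojure). The spec layout, the column names, the function names `check_header` and `read_records`, and the header error messages are all assumptions made for this example.

```python
import csv
import io

# Hypothetical declarative spec: column name -> rules for that column.
# "required" says whether the column must appear in the header;
# "transform" is applied to every cell in that column.
SPEC = {
    "Label": {"required": True, "transform": str.strip},
    "Notation": {"required": True, "transform": str.strip},
    "Parent Notation": {"required": False, "transform": str.strip},
}


def check_header(header, spec):
    """Return a list of header problems (empty when the header is valid)."""
    problems = []
    # Every required column in the spec must be present in the input.
    for name, rules in spec.items():
        if rules["required"] and name not in header:
            problems.append(f"Missing required column: {name}")
    # Every column in the input must be known to the spec.
    for name in header:
        if name not in spec:
            problems.append(f"Unknown column: {name}")
    return problems


def read_records(text, spec):
    """Parse CSV text, validate the header, then transform each row."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = check_header(header, spec)
    if problems:
        raise ValueError("; ".join(problems))
    # Missing trailing cells come back as None, so treat them as empty.
    return [
        {name: spec[name]["transform"](row[name] or "") for name in header}
        for row in reader
    ]
```

Checking the header before any row is read means a misnamed or missing column is reported once, instead of surfacing as a confusing failure on every row.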

Rename csv/read-csv-records to read-csv-maps and name the new parser read-csv-records.

RickMoynihan commented 5 years ago

Out of interest @lkitching what do the validation errors look like?

lkitching commented 5 years ago

@RickMoynihan - Cell validation errors have the form "Invalid cell in column [column name] row [row number]: [message]". There are other message formats for invalid columns in the header.
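
To make the message shape concrete, here is a small Python sketch that wraps a cell validator and produces an error in the form quoted above. It is illustrative only and not taken from the table2qb source. The names `CellError`, `validate_cell` and `non_blank` are hypothetical, the bracketed parts of the message are assumed to be placeholders and not literal brackets, and the row-numbering convention is left to the caller.

```python
class CellError(ValueError):
    """Raised when a single cell fails validation (hypothetical type)."""


def validate_cell(column, row_number, value, validator):
    """Run validator on value, adding the cell's location to any failure."""
    try:
        return validator(value)
    except ValueError as exc:
        # Message shape follows the format described in the comment above.
        raise CellError(
            f"Invalid cell in column {column} row {row_number}: {exc}"
        ) from exc


def non_blank(value):
    """Example validator: reject blank cells, otherwise trim whitespace."""
    trimmed = value.strip()
    if not trimmed:
        raise ValueError("Value cannot be blank")
    return trimmed
```

Keeping the validator unaware of its position and attaching the column and row in one wrapper keeps every cell error in the same format.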