Continuous Data Integration!
It goes like this:
scripts/test_data.py
.travis.yml
Here are some examples of actual builds:
We also list the sample outputs below. You can configure error output to be in JSON or CSV rather than TXT.
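As a rough illustration of what switching the output format means, here is a minimal sketch that serializes a list of validation results to JSON or CSV. It is not the tool's own configuration mechanism: `render_results` is a hypothetical helper, and the result keys are simply assumed to match the column names shown in the tables in this section.

```python
import csv
import io
import json


def render_results(results, fmt="json"):
    """Serialize a list of result dicts (hypothetical helper, not a library API)."""
    if fmt == "json":
        return json.dumps(results, indent=2)
    if fmt == "csv":
        if not results:
            return ""
        buffer = io.StringIO()
        # Column order follows the keys of the first result.
        writer = csv.DictWriter(buffer, fieldnames=list(results[0].keys()))
        writer.writeheader()
        writer.writerows(results)
        return buffer.getvalue()
    raise ValueError(f"Unsupported format: {fmt}")


# One result row, with keys assumed from the report tables in this section.
sample = [
    {
        "result_name": "Empty Row",
        "result_id": "structure_005",
        "row_index": 3,
        "result_message": "Row is empty.",
    }
]

print(render_results(sample, "json"))
print(render_results(sample, "csv"))
```

Keeping results as plain dicts means the same list can feed a table printer, a JSON dump, or a CSV writer without any conversion step.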
This "bad" CSV (won't get any dinner):
Date,Country,Number,
2015-01-01,3,20.3
2015-02-01,United States,23.5,,
2015-02,United States,x23.5,,,
,,
Results in:
+----------------+---------------+-------------+-------------------------------------------------------+
| result_name    | result_id     |   row_index | result_message                                        |
+================+===============+=============+=======================================================+
| Missing Header | structure_001 |           0 | Headers column is empty.                              |
+----------------+---------------+-------------+-------------------------------------------------------+
| Defective Row  | structure_003 |           0 | The row dimensions are incorrect compared to headers. |
+----------------+---------------+-------------+-------------------------------------------------------+
| Defective Row  | structure_003 |           1 | The row dimensions are incorrect compared to headers. |
+----------------+---------------+-------------+-------------------------------------------------------+
| Defective Row  | structure_003 |           2 | The row dimensions are incorrect compared to headers. |
+----------------+---------------+-------------+-------------------------------------------------------+
| Empty Row      | structure_005 |           3 | Row is empty.                                         |
+----------------+---------------+-------------+-------------------------------------------------------+
| Defective Row  | structure_003 |           3 | The row dimensions are incorrect compared to headers. |
+----------------+---------------+-------------+-------------------------------------------------------+
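To make the structural report above easier to follow, here is a minimal sketch of the three checks it contains, written with only the standard library. It is an approximation for illustration, not the validator's actual implementation: `check_structure` is a hypothetical function, and the rule that `row_index` counts data rows from zero is inferred from the table.

```python
import csv
import io

# The "bad" CSV from this section, reproduced verbatim.
BAD_CSV = (
    "Date,Country,Number,\n"
    "2015-01-01,3,20.3\n"
    "2015-02-01,United States,23.5,,\n"
    "2015-02,United States,x23.5,,,\n"
    ",,\n"
)


def make_result(name, result_id, row_index, message):
    """Build one report entry using the column names from the table."""
    return {
        "result_name": name,
        "result_id": result_id,
        "row_index": row_index,
        "result_message": message,
    }


def check_structure(text):
    """Hypothetical structural checks: empty headers, empty rows, ragged rows."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    headers, data = rows[0], rows[1:]
    results = []
    # A trailing comma in the header line yields an empty header cell.
    if any(cell.strip() == "" for cell in headers):
        results.append(make_result(
            "Missing Header", "structure_001", 0, "Headers column is empty."))
    for index, row in enumerate(data):
        if all(cell.strip() == "" for cell in row):
            results.append(make_result(
                "Empty Row", "structure_005", index, "Row is empty."))
        if len(row) != len(headers):
            results.append(make_result(
                "Defective Row", "structure_003", index,
                "The row dimensions are incorrect compared to headers."))
    return results


for result in check_structure(BAD_CSV):
    print(result["result_id"], result["row_index"], result["result_message"])
```

Note that the last line (`,,`) triggers two results: it is empty, and it also has three cells against four headers.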
This "bad" CSV (must be punished):
Date,Country,Number
2015-01-01,3,20.3
2015-02-01,United States,23.5
2015-02,United States,x23.5
Results in:
+----------------+----------------+-------------+-------------------------------------------------------------+
| result_name    |   column_index |   row_index | result_message                                              |
+================+================+=============+=============================================================+
| Incorrect Type |              0 |           2 | The value "2015-02" in column "Date" is not a valid Date.   |
+----------------+----------------+-------------+-------------------------------------------------------------+
| Incorrect Type |              2 |           2 | The value "x23.5" in column "Number" is not a valid Number. |
+----------------+----------------+-------------+-------------------------------------------------------------+
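The type report above can be approximated with a small per-column check. This sketch assumes a schema that is not shown in this section: `Date` as a `YYYY-MM-DD` date, `Country` as free text, and `Number` as anything Python's `float` accepts. `SCHEMA` and `check_types` are hypothetical names used only for illustration.

```python
import csv
import io
from datetime import datetime

# The second "bad" CSV from this section, reproduced verbatim.
CSV_TEXT = (
    "Date,Country,Number\n"
    "2015-01-01,3,20.3\n"
    "2015-02-01,United States,23.5\n"
    "2015-02,United States,x23.5\n"
)


def is_date(value):
    """True if the value parses as YYYY-MM-DD (assumed date format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def is_number(value):
    """True if the value parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Assumed schema: (column name, type label, validator) per column.
SCHEMA = [
    ("Date", "Date", is_date),
    ("Country", "String", lambda value: True),
    ("Number", "Number", is_number),
]


def check_types(text, schema):
    """Report every cell whose value fails its column's validator."""
    rows = list(csv.reader(io.StringIO(text)))
    results = []
    for row_index, row in enumerate(rows[1:]):
        for column_index, (name, label, is_valid) in enumerate(schema):
            if column_index >= len(row):
                continue  # Ragged rows are a structural problem, not a type one.
            value = row[column_index]
            if not is_valid(value):
                results.append({
                    "result_name": "Incorrect Type",
                    "column_index": column_index,
                    "row_index": row_index,
                    "result_message": (
                        f'The value "{value}" in column "{name}" '
                        f"is not a valid {label}."
                    ),
                })
    return results


for result in check_types(CSV_TEXT, SCHEMA):
    print(result["column_index"], result["row_index"], result["result_message"])
```

Under this assumed schema the `3` in the `Country` column passes, since any text is a valid string; only the third data row fails, once for each mistyped cell.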