frictionlessdata / frictionless-ci

Data management service that brings continuous data validation to tabular data in your repository via a GitHub Action
https://repository.frictionlessdata.io
MIT License

Duplicated primary key entries are reported as errors in framework but validate in repository #44

Closed augusto-herrmann closed 1 year ago

augusto-herrmann commented 1 year ago

Overview

Some checks that fail validation in Frictionless Framework pass validation in Frictionless Repository.

Example

I recently added primary keys to a schema. Since then, Frictionless Framework v5.8.1 correctly reports the duplicated entries:

validation report

```bash
frictionless validate data/valid/datapackage.json
# -------
# invalid: brazilian-transparency-and-open-data-portals.csv
# -------

## Summary

+------------------+--------------------------------------------------+
| Name             | Value                                            |
+==================+==================================================+
| File Place       | brazilian-transparency-and-open-data-portals.csv |
+------------------+--------------------------------------------------+
| File Size        | 20.0 kB                                          |
+------------------+--------------------------------------------------+
| Total Time       | 0.04 Seconds                                     |
+------------------+--------------------------------------------------+
| Rows Checked     | 183                                              |
+------------------+--------------------------------------------------+
| Total Errors     | 1                                                |
+------------------+--------------------------------------------------+
| PrimaryKey Error | 1                                                |
+------------------+--------------------------------------------------+

## Errors

+-------+---------+-------------+------------------------------------------------------------------------------------+
|   Row | Field   | Type        | Message                                                                            |
+=======+=========+=============+====================================================================================+
|     3 |         | primary-key | Row at position "3" violates the primary key: the same as in the row at position 2 |
+-------+---------+-------------+------------------------------------------------------------------------------------+

# -------
# invalid: brazilian-municipality-and-state-websites.csv
# -------

## Summary

+------------------+-----------------------------------------------+
| Name             | Value                                         |
+==================+===============================================+
| File Place       | brazilian-municipality-and-state-websites.csv |
+------------------+-----------------------------------------------+
| File Size        | 292.2 kB                                      |
+------------------+-----------------------------------------------+
| Total Time       | 0.207 Seconds                                 |
+------------------+-----------------------------------------------+
| Rows Checked     | 2880                                          |
+------------------+-----------------------------------------------+
| Total Errors     | 40                                            |
+------------------+-----------------------------------------------+
| PrimaryKey Error | 40                                            |
+------------------+-----------------------------------------------+

## Errors

+-------+---------+-------------+------------------------------------------------------------------------------------------+
|   Row | Field   | Type        | Message                                                                                  |
+=======+=========+=============+==========================================================================================+
|  1030 |         | primary-key | Row at position "1030" violates the primary key: the same as in the row at position 1029 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1182 |         | primary-key | Row at position "1182" violates the primary key: the same as in the row at position 1181 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1208 |         | primary-key | Row at position "1208" violates the primary key: the same as in the row at position 1207 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1256 |         | primary-key | Row at position "1256" violates the primary key: the same as in the row at position 1255 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1267 |         | primary-key | Row at position "1267" violates the primary key: the same as in the row at position 1266 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1310 |         | primary-key | Row at position "1310" violates the primary key: the same as in the row at position 1309 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1375 |         | primary-key | Row at position "1375" violates the primary key: the same as in the row at position 1374 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1479 |         | primary-key | Row at position "1479" violates the primary key: the same as in the row at position 1478 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1501 |         | primary-key | Row at position "1501" violates the primary key: the same as in the row at position 1500 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1519 |         | primary-key | Row at position "1519" violates the primary key: the same as in the row at position 1518 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1534 |         | primary-key | Row at position "1534" violates the primary key: the same as in the row at position 1533 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1607 |         | primary-key | Row at position "1607" violates the primary key: the same as in the row at position 1606 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1612 |         | primary-key | Row at position "1612" violates the primary key: the same as in the row at position 1611 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1622 |         | primary-key | Row at position "1622" violates the primary key: the same as in the row at position 1621 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1639 |         | primary-key | Row at position "1639" violates the primary key: the same as in the row at position 1638 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1702 |         | primary-key | Row at position "1702" violates the primary key: the same as in the row at position 1701 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1741 |         | primary-key | Row at position "1741" violates the primary key: the same as in the row at position 1740 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1750 |         | primary-key | Row at position "1750" violates the primary key: the same as in the row at position 1749 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1760 |         | primary-key | Row at position "1760" violates the primary key: the same as in the row at position 1759 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1787 |         | primary-key | Row at position "1787" violates the primary key: the same as in the row at position 1786 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1940 |         | primary-key | Row at position "1940" violates the primary key: the same as in the row at position 1939 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1941 |         | primary-key | Row at position "1941" violates the primary key: the same as in the row at position 1940 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  1963 |         | primary-key | Row at position "1963" violates the primary key: the same as in the row at position 1962 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2022 |         | primary-key | Row at position "2022" violates the primary key: the same as in the row at position 2021 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2023 |         | primary-key | Row at position "2023" violates the primary key: the same as in the row at position 2022 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2048 |         | primary-key | Row at position "2048" violates the primary key: the same as in the row at position 2047 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2081 |         | primary-key | Row at position "2081" violates the primary key: the same as in the row at position 2080 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2194 |         | primary-key | Row at position "2194" violates the primary key: the same as in the row at position 2193 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2224 |         | primary-key | Row at position "2224" violates the primary key: the same as in the row at position 2223 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2299 |         | primary-key | Row at position "2299" violates the primary key: the same as in the row at position 2298 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2339 |         | primary-key | Row at position "2339" violates the primary key: the same as in the row at position 2338 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2346 |         | primary-key | Row at position "2346" violates the primary key: the same as in the row at position 2345 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2347 |         | primary-key | Row at position "2347" violates the primary key: the same as in the row at position 2346 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2396 |         | primary-key | Row at position "2396" violates the primary key: the same as in the row at position 2395 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2491 |         | primary-key | Row at position "2491" violates the primary key: the same as in the row at position 2490 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2561 |         | primary-key | Row at position "2561" violates the primary key: the same as in the row at position 2560 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2672 |         | primary-key | Row at position "2672" violates the primary key: the same as in the row at position 2671 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2673 |         | primary-key | Row at position "2673" violates the primary key: the same as in the row at position 2672 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2723 |         | primary-key | Row at position "2723" violates the primary key: the same as in the row at position 2722 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
|  2846 |         | primary-key | Row at position "2846" violates the primary key: the same as in the row at position 2845 |
+-------+---------+-------------+------------------------------------------------------------------------------------------+
```

However, the Frictionless Repository v2 checks in this workflow pass validation for some reason. Aren't the two supposed to give the same result?

Expected behaviour

Frictionless Repository should fail validation because of the duplicated primary keys in two of the CSV files.
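
For anyone trying to reproduce this without either tool, here is a minimal standard-library sketch of the kind of uniqueness check I would expect both to perform. It is not the Framework's actual implementation: the function name `find_duplicate_keys`, the sample data and the column names are all made up for illustration. It assumes the header occupies position 1, and it reports the nearest earlier row with the same key, which appears consistent with the report above (for example, row 1941 is reported against row 1940).

```python
import csv
import io


def find_duplicate_keys(rows, primary_key):
    """Yield (position, previous_position) for each row whose primary key repeats.

    Hypothetical helper for illustration only, not part of Frictionless.
    """
    last_seen = {}
    # Assumption: the header is position 1, so data rows start at position 2.
    for position, row in enumerate(rows, start=2):
        key = tuple(row[name] for name in primary_key)
        if key in last_seen:
            yield position, last_seen[key]
        # Remember the most recent row with this key, so a third duplicate
        # is reported against the second one rather than the first.
        last_seen[key] = position


# Illustrative sample data; these column names are not from the real dataset.
sample = io.StringIO("id,name\n1,a\n1,a\n2,b\n")
duplicates = list(find_duplicate_keys(csv.DictReader(sample), ["id"]))
print(duplicates)
```

With this sample, the second data row (position 3) repeats the key of the row at position 2, so a validator applying this rule should flag it.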