ckan / ckanext-validation

CKAN extension for validating Data Packages using Table Schema.
MIT License

Replace goodtables with frictionless-py #62

Closed amercader closed 2 years ago

amercader commented 2 years ago

goodtables is unmaintained and Python 2 only, so let's use the great frictionless-py instead.

Depends on #63: this work should be done in a new branch, which will be Python 3 only.

Given that frictionless-py v5 is almost ready to go, perhaps we should target it from the beginning. What do you think, @roll?

The only place where goodtables is called is in the jobs module: https://github.com/frictionlessdata/ckanext-validation/blob/cd06189c3ba6d7c092fcf8c00a8dd961d49c5f22/ckanext/validation/jobs.py#L11

Refactoring this to use frictionless is of course straightforward, but we need to check whether the output report changes significantly and whether that breaks higher-level integrations such as the report rendered in the UI. Just to be clear, I think it's absolutely fine for the report to change (there will be a major version change between them); we just need to document the changes.
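One rough way to start documenting the report changes is to diff the top-level keys of an old report against a new one. This is only a sketch: `diff_report_keys` is a hypothetical helper, and the two sample dicts are hand-written, heavily abbreviated stand-ins for a goodtables-style and a frictionless-style report, not real output from either library.

```python
def diff_report_keys(old_report, new_report):
    """Compare the top-level keys of two validation reports (plain dicts)."""
    old_keys, new_keys = set(old_report), set(new_report)
    return {
        "removed": sorted(old_keys - new_keys),  # only in the old report
        "added": sorted(new_keys - old_keys),    # only in the new report
        "kept": sorted(old_keys & new_keys),     # present in both
    }


# Illustrative stand-ins only (assumed shapes, trimmed to a few keys).
old_style = {"valid": True, "error-count": 0, "tables": []}
new_style = {"valid": True, "stats": {"errors": 0}, "tasks": []}

changes = diff_report_keys(old_style, new_style)
print(changes)
```

Running the same comparison on real reports produced from the same input file would give a starting list of keys that anything consuming the report, such as the UI, needs to be checked against.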

roll commented 2 years ago

Hi, I would start with v4, as this extension also relies (am I right?) on the Report UI component, which currently works only with v4.

The subsequent migration to v5 will be really subtle, like renaming one parameter and updating the Report UI component version.

roll commented 2 years ago

Conceptually, the main difference between goodtables and frictionless is that the latter has a concept of Inquiry (in v5 we will improve the docs).

So you can store an inquiry as a declarative job description:

Inquiry -> Report
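As a minimal sketch of the idea, an inquiry can be kept as a plain JSON-serialisable dict and only turned into a report when it is run. The helper names and the file path below are hypothetical, and the descriptor shape (a `tasks` list with a `path` per task) together with `Inquiry.from_descriptor(...).validate()` assumes the frictionless-py v5 API; v4 may differ.

```python
import json


def build_inquiry_descriptor(paths):
    """Build a declarative inquiry: one validation task per file path."""
    # Assumption: v5-style task descriptors identify the file with "path".
    return {"tasks": [{"path": path} for path in paths]}


def run_inquiry(descriptor):
    """Run a stored inquiry and return the report as a plain dict."""
    # Imported lazily so the descriptor helper works without frictionless.
    from frictionless import Inquiry  # assumes frictionless-py v5

    inquiry = Inquiry.from_descriptor(descriptor)
    report = inquiry.validate()
    return report.to_descriptor()


# "data/table.csv" is a placeholder path used for illustration only.
descriptor = build_inquiry_descriptor(["data/table.csv"])
stored = json.dumps(descriptor, indent=2)  # what could be saved with a job
print(stored)
```

Because the descriptor is just data, it can be stored next to a validation job and replayed later with `run_inquiry(json.loads(stored))`.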

amercader commented 2 years ago

Done and merged to the dev-v2 branch, which uses v5.