ckan / ckanext-validation

CKAN extension for validating Data Packages using Table Schema.
MIT License

upgrading to latest goodtables/frictionless-py #54

Open wardi opened 2 years ago

wardi commented 2 years ago

Overview

We'd like to work on upgrading ckanext-validation's goodtables dependency to a recent frictionless-py release so that we can work on adding l10n support to both this project and frictionless-py. Interested to know if there are any reasons that we can't upgrade this dependency.



amercader commented 2 years ago

That's great news! And something we definitely would like to see but haven't been able to find the time for. My overall feeling is that there would be no major issues upgrading to frictionless-py, but @roll is the expert here and may be able to point out things to consider.

There are two main ways in which this extension leverages frictionless tools:


Not directly related but just FYI, the two other main pieces of work I'd like to get done at some point are finalizing the Python 3 / CKAN 2.9 migration (#55) and, once that is done, refactoring 1) the way validation jobs are invoked from resource creates/updates (which is really ugly) and 2) the way results are written back, leveraging package_revise when available to avoid race conditions when updating the resources with the results.

wardi commented 2 years ago

frictionless-py dropped Python 2 support when it was renamed from goodtables, and sadly we're a ways off from migrating our site to Python 3.

I'm now considering using the last goodtables release and doing the l10n work there for ckanext-validation. I'll need to find a matching goodtables-ui version, of course. @roll would you accept new features on goodtables 2.5.4? It looks like there were major changes to the way error strings are stored, so I don't think I can reasonably work from the main branch and backport to 2.5.4.

roll commented 2 years ago

Hi @amercader @wardi,

I think migration from goodtables to frictionless is reasonably simple. I can help with the mapping of the options (we are going to write this migration guide for the frictionless docs anyway). And for a new report, you can just use https://components.frictionlessdata.io/?path=/story/components-report--invalid (as Adria mentioned), so the migration is already covered client-side.
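To illustrate the Python side of such a switch, here is a minimal sketch, not code from this extension. It assumes only that both libraries expose a top-level `validate` function and that the resulting report carries a `valid` flag (a dict key in goodtables, an attribute in frictionless). The helper names `load_validator` and `is_valid` are hypothetical.

```python
import importlib


def load_validator():
    """Return (library_name, validate_function) for the first library found.

    Prefers frictionless and falls back to goodtables; returns (None, None)
    when neither is installed. Hypothetical helper, for illustration only.
    """
    for name in ("frictionless", "goodtables"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        return name, module.validate
    return None, None


def is_valid(report):
    """Read the overall validity flag from either report style."""
    # goodtables returns a plain dict with a "valid" key.
    if isinstance(report, dict) and "valid" in report:
        return bool(report["valid"])
    # frictionless reports expose the flag as an attribute.
    return bool(report.valid)
```

Calling code could then use `name, validate = load_validator()` and pass the resulting report to `is_valid`, which keeps the library choice in one place. Mapping validation options between the two libraries is a separate task and is not covered here.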

We accept fixes to goodtables, so it's not a problem if you really can't migrate to frictionless yet (which is of course recommended because of a lot of fixes and improvements). Another consideration is that it might be more convenient to create a fork of goodtables (even putting it on PyPI) because, honestly, we're really low on resources for supporting old libraries (our focus next year will be making frictionless fully robust/performant/etc.), so I just don't want to block you.

For example, we have an external team on the Frictionless org working on other old libraries - https://github.com/orgs/frictionlessdata/teams/while-true-industries/repositories. So if you'd like, we can do something similar for goodtables.

roll commented 2 years ago

BTW, we're currently working on a new generation of visual components that will provide a simple way to integrate validation options, like editing the Schema or the validation Inquiry, for users.

wardi commented 2 years ago

@roll Thanks, a fork of goodtables would suit us nicely. I can't access https://github.com/orgs/frictionlessdata/teams/while-true-industries/repositories, but if you would like to create the fork on the frictionlessdata org, I'll make my pull requests against that version.

roll commented 2 years ago

Hi @amercader @wardi,

I've created a fork - https://github.com/frictionlessdata/goodtables-py - and a CKAN team (you're invited too) with the maintain permissions on it.

I can assist going forward with setting up builds and releases.