frictionlessdata / goodtables.io

Data validation as a service. Project retired; go to the current one at frictionlessdata/repository
https://goodtables.io
GNU Affero General Public License v3.0
69 stars 16 forks

Add support for datapackages #17

Closed roll closed 7 years ago

roll commented 8 years ago

Overview

The current task configuration/descriptor doesn't support datapackages because of its structure:

files:
settings:

Ideas

We could introduce an attribute datapackages (singular or plural?) which would be mutually exclusive(?) with files:

datapackages:
settings:

If there is more than one datapackage, it will require several tasks inside one build (based on current goodtables capabilities).
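
A rough sketch of how that could look, assuming the configuration has already been parsed into a plain dict. The helper name `plan_tasks`, the task dict shape, and the treatment of `files` and `datapackages` as lists are all assumptions for illustration, not the actual goodtables.io implementation:

```python
def plan_tasks(config):
    """Turn a parsed task configuration into a list of validation tasks.

    Hypothetical helper: assumes `files` and `datapackages` are mutually
    exclusive, and that each datapackage maps to one task.
    """
    has_files = "files" in config
    has_packages = "datapackages" in config
    if has_files and has_packages:
        raise ValueError("'files' and 'datapackages' are mutually exclusive")

    settings = config.get("settings", {})
    if has_packages:
        # One task per datapackage descriptor, all inside the same build.
        return [
            {"type": "datapackage", "source": path, "settings": settings}
            for path in config["datapackages"]
        ]
    # Assumed existing behaviour: a single task covering the listed files.
    return [{"type": "files", "source": config.get("files", []), "settings": settings}]


config = {
    "datapackages": ["a/datapackage.json", "b/datapackage.json"],
    "settings": {},
}
tasks = plan_tasks(config)
```

With two entries under datapackages, the build ends up with two tasks; a files-only configuration keeps producing a single task.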

Tasks

roll commented 8 years ago

@pwalsh @danfowler Is multiple datapackages per repository our use case?

pwalsh commented 8 years ago

@roll sure, why not?

roll commented 8 years ago

@pwalsh just to be sure (cc @amercader)

pwalsh commented 8 years ago

One or many data packages also ultimately represent some sort of file tree, and can therefore be reduced to the files: type of interface we use here anyway.

It might be good, as you said, to have:

datapackages:
files:
settings:

All possible as part of goodtables.yml (and not mutually exclusive). Whether this is then all just reduced to a single list of target files, or is a set of different jobs, is mostly just an implementation detail, with various pros and cons either way I suppose.
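
To make the "single list of target files" option concrete, here is a minimal sketch. It assumes each datapackage.json has already been loaded into a dict with a `resources` list whose entries carry a `path` (a string or a list of strings), and that `files` is a plain list of paths. The function name `flatten_targets` and the `descriptors` mapping are hypothetical:

```python
import posixpath


def flatten_targets(config, descriptors):
    """Reduce `datapackages` and `files` to one ordered list of file paths.

    `descriptors` maps each datapackage.json path to its parsed contents;
    loading the descriptors is left out of this sketch.
    """
    targets = list(config.get("files", []))
    for dp_path in config.get("datapackages", []):
        # Resource paths are taken as relative to the descriptor's directory.
        base = posixpath.dirname(dp_path)
        for resource in descriptors[dp_path].get("resources", []):
            paths = resource.get("path", [])
            if isinstance(paths, str):
                paths = [paths]
            targets.extend(posixpath.join(base, p) for p in paths)
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(targets))


config = {
    "datapackages": ["data/datapackage.json"],
    "files": ["extra.csv", "data/a.csv"],
}
descriptors = {
    "data/datapackage.json": {
        "resources": [{"path": "a.csv"}, {"path": ["b1.csv", "b2.csv"]}],
    },
}
targets = flatten_targets(config, descriptors)
```

The alternative, one job per datapackage, would keep per-package context such as schemas together, at the cost of more tasks per build.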

amercader commented 7 years ago

MVP if time, if not defer

roll commented 7 years ago

requires https://github.com/frictionlessdata/goodtables-py/issues/153

roll commented 7 years ago

@danfowler @amercader So Dan's preparing pilots with datapackage.json, but this feature is not even planned for Alpha (mid-January). Is that correct?

pwalsh commented 7 years ago

@amercader assigned to you for scoping and estimates.

amercader commented 7 years ago

@roll can you add a mini spec and an estimate? You had a plan for this IIRC

roll commented 7 years ago

@amercader done :+1: