frictionlessdata / datapackage

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.
https://datapackage.org
The Unlicense

Extending table-schema types/formats #817

Closed yohanboniface closed 5 months ago

yohanboniface commented 1 year ago

Hi,

We work in the French administration. As you may know, we already use frictionless and table-schema in our tools; see for example https://validata.fr/ or http://schema.data.gouv.fr/. We are now trying to describe, through table-schema, columns that contain national identifiers (administrative references, national unique company numbers, etc.). We want to take this path to push table-schema one step further as an official national standard.

Our current thinking is about how to describe such a column through table-schema; we are not yet talking about tools implementing such a property (though our plan for now is to have a frictionless plugin, something like frictionless-france, that would deal with those formats). And, to be clear, we are not talking about changing the table-schema standard, only about using its existing extension possibilities.

Before moving forward, we'd like your feedback on what the best solution would be. Here are some of the options we have identified:

Do you have any feedback? Have any other users made this kind of type extension?

Thanks in advance,

Yohan Boniface with @johanricher and @geoffreyaldebert

yohanboniface commented 1 year ago

Another option we may consider, even though its scope is not exactly the same, would be to allow external foreign keys, i.e. foreign keys pointing to a resource in another data package identified by a URI.

E.g., with a new data-package property:

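    fields:
      - name: state-code
    foreignKeys:
      # foreignKeys is a list in table-schema, one entry per key
      - fields: state-code
        reference:
          resource: states-code
          fields: code
          # proposed new property: URI of the external data package
          data-package: uri/to/datapackage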

or, extending the resource property itself:

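    fields:
      - name: state-code
    foreignKeys:
      # foreignKeys is a list in table-schema, one entry per key
      - fields: state-code
        reference:
          # proposed: resource value carries the package URI as a prefix
          resource: uri/to/datapackage/states-code
          fields: code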
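
To compare the two notations, here is a rough Python sketch of how a tool might normalise either form into the same (package, resource, fields) triple. `ForeignKeyTarget` and `resolve_reference` are hypothetical names made up for this illustration, not part of frictionless or table-schema, and the rule "a `/` in `resource` means it carries a package URI" is only an assumption about how the second form could be read.

```python
from typing import NamedTuple, Optional


class ForeignKeyTarget(NamedTuple):
    """Hypothetical normalised view of a foreign key reference."""
    package: Optional[str]  # URI of the external data package, None if local
    resource: str
    fields: str


def resolve_reference(reference: dict) -> ForeignKeyTarget:
    """Reduce both proposed notations to one (package, resource, fields) triple."""
    resource = reference["resource"]
    # Notation 1: a dedicated (proposed) "data-package" property.
    package = reference.get("data-package")
    if package is None and "/" in resource:
        # Notation 2 (assumed reading): everything before the last "/"
        # is the package URI, the last segment is the resource name.
        package, resource = resource.rsplit("/", 1)
    return ForeignKeyTarget(package, resource, reference["fields"])


option_1 = {
    "resource": "states-code",
    "fields": "code",
    "data-package": "uri/to/datapackage",
}
option_2 = {
    "resource": "uri/to/datapackage/states-code",
    "fields": "code",
}

# Both notations describe the same target.
assert resolve_reference(option_1) == resolve_reference(option_2)
```

One thing the sketch makes visible: the second form needs a convention for splitting the URI from the resource name, which may be ambiguous for some URIs, whereas the first form keeps the two apart explicitly.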

Thoughts?

roll commented 9 months ago

I think it might be useful to have a pattern defining best practices for this case.