Closed peterdesmet closed 3 years ago
Thanks @peterdesmet,
I'll investigate
Hi @roll, any news on this issue? The fact that we can't validate our camera trap DP profile as an extension of tabular-data-package is currently blocking its release.
@peterdesmet Thanks for the heads-up, I'll prioritize this issue.
Hi @peterdesmet,
Please try frictionless@4.12
- https://github.com/frictionlessdata/frictionless-py/blob/main/tests/test_package.py#L924-L988
It now supports external profiles. Note that the profile registry is going to be deprecated, so this only works for direct profile links (local or remote).
Great! My first step was testing it out on an existing data package with the typical "profile": "tabular-data-package":
frictionless validate https://raw.githubusercontent.com/tdwg/dwc-for-biologging/master/derived/camtrap-dp/data/raw/datapackage.json
# -----
# valid: deployments.csv
# -----
# -----
# valid: multimedia.csv
# -----
# -----
# valid: observations.csv
# -----
But in frictionless@4.12.1 I get:
frictionless validate https://raw.githubusercontent.com/tdwg/dwc-for-biologging/master/derived/camtrap-dp/data/raw/datapackage.json
# -------
# invalid: https://raw.githubusercontent.com/tdwg/dwc-for-biologging/master/derived/camtrap-dp/data/raw/datapackage.json
# -------
===== ====================================================================================================================
code  message
===== ====================================================================================================================
error cannot extract metadata "tabular-data-package" because "[Errno 2] No such file or directory: 'tabular-data-package'"
===== ====================================================================================================================
It does work with "profile": "https://specs.frictionlessdata.io/schemas/data-package.json", but many existing data packages just have a string (e.g. tabular-data-package) identifying the profile (one from the registry), which should likely be kept for backwards compatibility.
Other than that, "profile": "https://raw.githubusercontent.com/tdwg/camtrap-dp/0.1.3/camtrap-dp-profile.json" works splendidly! 🎉 This is absolutely fantastic!
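To make the backwards-compatibility point concrete, here is a minimal sketch of how a "profile" value could be sorted into a bare registry name, a remote link, or a local path. This is not the frictionless implementation: `classify_profile` and `REGISTRY_NAMES` are hypothetical names, and the set only contains the profile names mentioned in this thread, not the full registry.

```python
from urllib.parse import urlparse

# Assumption: only the names mentioned in this thread; the real registry
# (registry.json) may list more or different identifiers.
REGISTRY_NAMES = {
    "data-package",
    "tabular-data-package",
    "fiscal-data-package",
    "data-resource",
    "tabular-data-resource",
}


def classify_profile(value: str) -> str:
    """Return 'registry', 'remote', or 'local' for a profile string."""
    if value in REGISTRY_NAMES:
        # Bare name such as "tabular-data-package": must not be read as a file path
        return "registry"
    if urlparse(value).scheme in ("http", "https"):
        # Direct link to a JSON Schema hosted somewhere
        return "remote"
    # Anything else is treated as a path to a local JSON Schema file
    return "local"


kind = classify_profile("tabular-data-package")
```

With a check like this, a bare registry name is recognised before any attempt to open it as a file, which is the step that failed in 4.12.1 above.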
It returns errors for both camtrap-dp-profile and data-package, which it is built on:
frictionless validate test/datapackage.json
# -------
# invalid: test/datapackage.json
# -------
============= ==================================================================================================================================================================================================================================================
code          message
============= ==================================================================================================================================================================================================================================================
package-error The data package has an error: "'contribustor' is not one of ['publisher', 'author', 'maintainer', 'wrangler', 'contributor']" at "contributors/0/role" in metadata and at "allOf/0/properties/contributors/items/properties/role/enum" in profile
package-error The data package has an error: "'hello' is not of type 'boolean'" at "multimedia_access/public" in metadata and at "allOf/1/properties/multimedia_access/properties/public/type" in profile
package-error The data package has an error: "'url' is a required property" at "organizations/0" in metadata and at "allOf/1/properties/organizations/items/required" in profile
package-error The data package has an error: "'d' is not of type 'integer'" at "taxonomic/0/count" in metadata and at "allOf/1/properties/taxonomic/items/properties/count/type" in profile
============= ==================================================================================================================================================================================================================================================
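Each error above gives two slash-separated locations: one in the metadata and one in the profile. Judging from the output, `allOf/0` appears to be the base data-package rules and `allOf/1` the camtrap-dp additions. The sketch below shows one way to follow such paths by hand. `resolve_path` is a hypothetical helper, and the `profile` and `metadata` dictionaries are heavily trimmed stand-ins, not the real files.

```python
def resolve_path(document, path):
    """Walk a slash-separated path such as 'contributors/0/role'."""
    node = document
    for part in path.split("/"):
        if isinstance(node, list):
            node = node[int(part)]  # numeric segment indexes into a list
        else:
            node = node[part]  # other segments are dictionary keys
    return node


# Hypothetical, trimmed profile with the same allOf layout as in the errors
profile = {
    "allOf": [
        {"properties": {"contributors": {"items": {"properties": {"role": {
            "enum": ["publisher", "author", "maintainer", "wrangler", "contributor"],
        }}}}}},
        {"properties": {"multimedia_access": {"properties": {
            "public": {"type": "boolean"},
        }}}},
    ]
}

# Hypothetical metadata reproducing the first two errors
metadata = {
    "contributors": [{"role": "contribustor"}],
    "multimedia_access": {"public": "hello"},
}

allowed = resolve_path(profile, "allOf/0/properties/contributors/items/properties/role/enum")
actual = resolve_path(metadata, "contributors/0/role")
```

Here `actual` is the misspelled role and `allowed` is the enum it was checked against, so the two paths together pinpoint both the offending value and the rule that rejected it.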
Thanks @peterdesmet,
I'll fix it tomorrow morning. The problem is that Frictionless is not shipped with the Tabular Data Package profile, as it uses a more sophisticated validation approach (kind of object-based, using Schema/Resource/etc. profiles separately). But now I see that I need to include it.
In general, I think we need to think about slightly reworking the concept of the profile at the spec level, as it leads to some variation problems like in https://github.com/frictionlessdata/specs/issues/743. Currently, it lacks composability in my opinion.
@peterdesmet
I'm releasing frictionless@4.12.2 with a fix.
Generally speaking, I would recommend using tabular-data-package, and having tabular-data-resource on tabular resources is enough to validate it.
@roll do you mean: not having tabular-data-resource at package level, but just indicating your resources as tabular-data-resource? That makes sense to me, since I assume there isn't much more validation happening at package level for tabular data resources (that is different from data-resource)?
@peterdesmet
Yes. E.g. for the data-package profile, it internally just drops the resources JSON Schema rules from the package profile and uses them for every resource individually.
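As a rough illustration of that idea (not the actual frictionless code), a package profile can be split into package-level rules and per-resource rules. `split_package_profile` is a hypothetical helper, and the example profile is a trimmed stand-in that assumes the usual JSON Schema layout where resource rules sit under `properties.resources.items`.

```python
import copy


def split_package_profile(package_profile):
    """Return (package_rules, resource_rules) from a package JSON Schema."""
    # Work on a copy so the caller's profile stays untouched
    package_rules = copy.deepcopy(package_profile)
    # Drop the "resources" rules from the package-level schema
    resources_schema = package_rules.get("properties", {}).pop("resources", {})
    # The per-item schema is what would be applied to every resource individually
    resource_rules = resources_schema.get("items", {})
    return package_rules, resource_rules


# Hypothetical, trimmed package profile
profile = {
    "properties": {
        "name": {"type": "string"},
        "resources": {"type": "array", "items": {"required": ["name"]}},
    }
}

package_rules, resource_rules = split_package_profile(profile)
```

After the split, `package_rules` no longer mentions resources, and `resource_rules` can be checked against each entry in the package's resources list on its own.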
Overview
I want to validate my package against an external profile that is an extension of tabular-data-package. If I read the specs (https://specs.frictionlessdata.io/profiles/#profile-property) correctly:
This can be done with:
However, it seems only very specific values for profile are checked, and otherwise it falls back to tabular-data-package? If I validate the same data package with:
- "profile": "fiscal-data-package" => several validation errors, including:
- "profile": "https://specs.frictionlessdata.io/schemas/fiscal-data-package.json" => valid
Is this expected behaviour? Would it be possible to:
- Raise an error if the value for profile is not a valid JSON Schema or a string value listed on https://specs.frictionlessdata.io/schemas/registry.json?
Please preserve this line to notify @roll (lead of this repository)