Open zaneselvans opened 1 year ago
@zaneselvans thanks for investigating this. We do have an integration test that is supposed to check for valid datapackages, but clearly it needs to be overhauled. We currently only test the Ferc1 datapackage, and it uses the Package property metadata_valid to check for validity, which is maybe insufficient?
IIRC that check will probably only look at the Package-level metadata, and not recurse down into any of the resources which make up the package (which confused the heck out of me when I was first working with the datapackage validations).
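To make the distinction concrete, here is a minimal stdlib-only sketch of a check that looks at the package level and then recurses into each resource. It does not use the frictionless API at all: the function name collect_descriptor_problems, the example descriptor, and the specific rules checked (a non-empty resources list, and each resource having a name plus either path or data) are assumptions for illustration, based on my reading of the Data Package spec.

```python
def collect_descriptor_problems(package: dict) -> list:
    """Return problems found at the package level AND inside every resource."""
    problems = []
    resources = package.get("resources")
    # Package-level check: this is roughly where a shallow check would stop.
    if not isinstance(resources, list) or not resources:
        problems.append("package: 'resources' must be a non-empty list")
        return problems
    # Resource-level checks: recurse into each resource descriptor.
    for i, resource in enumerate(resources):
        label = f"resource[{i}]"
        if not isinstance(resource, dict):
            problems.append(f"{label}: must be an object")
            continue
        if not resource.get("name"):
            problems.append(f"{label}: missing 'name'")
        if "path" not in resource and "data" not in resource:
            problems.append(f"{label}: needs 'path' or 'data'")
    return problems


# Hypothetical descriptor: fine at the package level, broken in two resources.
package = {
    "name": "example",
    "resources": [
        {"name": "ok", "path": "ok.csv"},
        {"path": "x.csv"},
        {"name": "no_data"},
    ],
}
problems = collect_descriptor_problems(package)
```

An integration test could then assert that the returned list is empty for every datapackage we generate, rather than only for Ferc1.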
This came up again in #242 and I tried to fix it with no luck. I've made the paths relative, but I've run into another error for all of the tables: Please provide "dialect.sql.table" for reading. I spent some time messing around with the dialect, but couldn't resolve the issue, so I'm going to leave this open for the time being.
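For reference, the error message suggests that each resource's dialect needs a nested sql.table entry. Below is a minimal sketch of patching a resource descriptor that way. It assumes frictionless v5 expects exactly this nesting; the helper name with_sql_dialect and the example resource are hypothetical, and this change alone may well not be enough to make validation pass.

```python
import copy


def with_sql_dialect(resource: dict, table: str) -> dict:
    """Return a copy of `resource` whose dialect names `table` under the 'sql' key."""
    updated = copy.deepcopy(resource)  # leave the caller's descriptor untouched
    # Assumed v5-style shape, inferred from the "dialect.sql.table" error message.
    updated["dialect"] = {"sql": {"table": table}}
    return updated


# Hypothetical resource that still carries a CSV-style dialect.
resource = {
    "name": "example_table",
    "path": "sqlite:///example.sqlite",
    "dialect": {"delimiter": ","},
}
fixed = with_sql_dialect(resource, table="example_table")
```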
The datapackage descriptors we are currently generating to annotate the SQLite DBs which are derived from XBRL data are not valid. For example, in the ferc-xbrl-extractor environment, running this command:
Results in a bunch of errors like:
Or if we try to validate a single resource and return the errors in JSON form:
The problem?
I think the issue here is that we are using v4 of the frictionless package, and the ability to annotate SQLite DBs was only introduced in v5. Looking at the datapackage.json file, I see that the dialect field is invalid. In frictionless v5, it would need to say sql and then point at the table within the sql dictionary. In previous versions it would describe the CSV dialect that's being used in the file that the path element points at. See this example of a data package annotating an SQLite DB.
Frictionless v4 can't interpret sqlite:// URL as path
As it is, the system sees the sqlite:// URL in the path and has no idea how to interpret it to find the data.
The sqlite:// path must be relative not absolute
In addition to being unable to interpret the sqlite:// URL as a path at all, the URL uses an absolute path rather than a relative path, which is invalid. It is invalid both in that it violates the frictionless data resource specification, which says:
A “url-or-path” is a string with the following additional constraints:
In addition, the absolute path is simply wrong if you download our nightly build outputs, since the path to which the descriptor and databases were written on the build server has no meaning on the user's machine.
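One possible way to deal with the absolute-path problem is to rewrite each URL relative to the directory that holds the descriptor before writing it out. The sketch below is stdlib-only and relies on the SQLAlchemy-style convention that sqlite:/// followed by a relative path is relative, while a fourth slash makes it absolute. The function name relativize_sqlite_url and the /build/output paths are hypothetical, not the actual build server layout.

```python
from pathlib import PurePosixPath

SQLITE_PREFIX = "sqlite:///"


def relativize_sqlite_url(url: str, descriptor_dir: str) -> str:
    """Rewrite an absolute sqlite URL so its path is relative to `descriptor_dir`."""
    if not url.startswith(SQLITE_PREFIX):
        raise ValueError(f"not a sqlite URL: {url!r}")
    db_path = PurePosixPath(url[len(SQLITE_PREFIX):])
    if not db_path.is_absolute():
        return url  # already relative, nothing to do
    # relative_to raises ValueError if the DB is not under descriptor_dir,
    # so we never silently emit a "../" style parent path.
    relative = db_path.relative_to(descriptor_dir)
    return SQLITE_PREFIX + relative.as_posix()


# Hypothetical build-server location for the descriptor and database.
relative_url = relativize_sqlite_url(
    "sqlite:////build/output/example.sqlite", "/build/output"
)
print(relative_url)  # sqlite:///example.sqlite
```

Raising on a database that lives outside the descriptor's directory is a deliberate choice here: such a path could only be expressed with a parent reference, which would still be meaningless on the user's machine.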
What to do?
datapackage.json file.