cldf / csvw

CSV on the web
Apache License 2.0

Support for remote table schemas? #32

Closed stschiff closed 4 years ago

stschiff commented 4 years ago

CSV on the Web officially specifies that a table schema may live in a separate metadata file on the Web, referenced by URL. See the definition from the spec:

tableSchema: An object property that provides a single schema description as described in section 5.5 Schemas, used as the default for all the tables in the group. This may be provided as an embedded object within the JSON metadata or as a URL reference to a separate JSON object that is a schema description.

Currently, this library doesn't offer this feature.

I can see two possible solutions for now: 1) Implement referencing and loading of remote schemas directly within this library. 2) Make a screening step and inject any remotely located table schemas into the JSON before feeding it to this library.

Naturally, I would prefer option 1. I'm aware of the caveat that validation then cannot happen offline, though one could imagine caching. What do you think about this? Can you imagine this feature being added in principle?
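For illustration, option 2 (the screening step) might look roughly like the sketch below. It assumes the metadata has already been parsed into a plain dict and that every string-valued tableSchema is an absolute URL. Relative URLs are not resolved here. The names `fetch_json` and `inline_remote_schemas` are hypothetical and not part of csvw. The `cache` dict is one simple way to avoid fetching the same schema twice.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download and parse a JSON document (hypothetical helper, stdlib only)."""
    with urlopen(url) as response:
        return json.load(response)


def inline_remote_schemas(metadata, fetch=fetch_json, cache=None):
    """Return a copy of `metadata` with URL-valued tableSchema entries inlined.

    Looks at the top-level object and at each entry of its "tables" list.
    """
    cache = {} if cache is None else cache

    def resolve(description):
        description = dict(description)  # shallow copy, the input stays untouched
        schema = description.get("tableSchema")
        if isinstance(schema, str):
            # Assumption: a string value is an absolute URL to a schema document.
            if schema not in cache:
                cache[schema] = fetch(schema)
            description["tableSchema"] = cache[schema]
        return description

    result = resolve(metadata)
    if "tables" in result:
        result["tables"] = [resolve(table) for table in result["tables"]]
    return result
```

The resulting dict could then be written to a JSON file and handed to the library as usual. Passing a different `fetch` callable (for example one reading from a local directory) would allow offline validation.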

xrotwang commented 4 years ago

It looks like option 1 could be implemented rather easily, by adding Schema.fromvalue, along the lines of

import requests

def fromvalue(self, v):
    if isinstance(v, str):
        # A string value is treated as a URL pointing to the schema JSON.
        v = requests.get(v).json()
    return super().fromvalue(v)

xrotwang commented 4 years ago

Inlining the remote content could then be done post-hoc, i.e. by

Note that with this simple solution, round-tripping wouldn't be possible.
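The round-tripping caveat can be shown with a small stand-alone sketch. It does not use csvw itself. `load_description` and `fake_fetch` are hypothetical stand-ins that mimic replacing a URL-valued tableSchema with the fetched object, and the URL is a placeholder.

```python
import json


def load_description(raw, fetch):
    """Mimic the simple approach: a string tableSchema is replaced by the fetched object."""
    loaded = dict(raw)
    if isinstance(loaded.get("tableSchema"), str):
        loaded["tableSchema"] = fetch(loaded["tableSchema"])
    return loaded


def fake_fetch(url):
    # Stand-in for an HTTP request so the example runs offline.
    return {"columns": [{"name": "id", "datatype": "string"}]}


original = {"url": "data.csv", "tableSchema": "https://example.org/schema.json"}
loaded = load_description(original, fake_fetch)

# Writing the loaded description back out yields the inlined schema, not the URL.
round_tripped = json.loads(json.dumps(loaded))
```

Once the URL has been swapped for the schema object, nothing in the loaded data records where the schema came from, so the serialized metadata differs from the original file.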

xrotwang commented 4 years ago

@stschiff please have a look at https://github.com/cldf/csvw/pull/33 and let me know whether that meets your needs.

xrotwang commented 4 years ago

@stschiff just merged the PR implementing this. Do you need a release on PyPI with this functionality or can you work from HEAD for the time being?

stschiff commented 4 years ago

Thanks, that's great! I can work from HEAD for now. And thanks for inviting me to contribute. I'll work through PRs and Issues of course, should I have something.