International-Data-Spaces-Association / InformationModel

The Information Model of the International Data Spaces implements the IDS reference architecture as an extensible, machine-readable and technology-independent data model.
Apache License 2.0

w3id URIs do not implement content negotiation properly (i.e., no LOD compliance) #93

Closed clange closed 3 years ago

clange commented 5 years ago

All terms of the information model should be available in all content types that we generate (RDF/XML, JSON-LD, human-readable HTML, etc.) by dereferencing the respective HTTP URIs. That's apparently possible with w3id; see https://github.com/perma-id/w3id.org/blob/master/abdn/policy/.htaccess for an example.
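For illustration, such `.htaccess`-based content negotiation could look roughly like the following sketch, modeled on the linked abdn example. The redirect targets and paths here are placeholders, not the actual deployed configuration:

```apache
# Hypothetical sketch: redirect based on the client's Accept header.
# Target URLs below are placeholders for the real serialization files.
RewriteEngine On

# RDF/XML requested -> serve the RDF/XML serialization
RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
RewriteRule ^idsa/core$ https://example.org/serializations/ontology.rdf [R=303,L]

# Turtle requested
RewriteCond %{HTTP_ACCEPT} text/turtle
RewriteRule ^idsa/core$ https://example.org/serializations/ontology.ttl [R=303,L]

# Default: human-readable HTML documentation
RewriteRule ^idsa/core$ https://example.org/docs/index.html [R=303,L]
```

The 303 ("See Other") status is the redirect conventionally used for slash URIs of non-information resources in Linked Data publishing.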

clange commented 4 years ago

Here's how to test whether RDF/XML is served properly (that's the basic requirement for LOD conformance):

`wget -O - --header 'Accept: application/rdf+xml' http://w3id.org/idsa/core`
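To make that check machine-verifiable, one can compare the `Content-Type` of the response against the requested media type, ignoring parameters such as `charset`. A minimal sketch in Python (the helper name is mine, not part of the repository):

```python
def media_type_matches(requested: str, content_type: str) -> bool:
    """Check whether a response Content-Type satisfies a requested media type.

    Parameters such as '; charset=utf-8' are ignored, and the comparison is
    case-insensitive, since media type names are case-insensitive.
    """
    def strip(value: str) -> str:
        return value.split(";", 1)[0].strip().lower()

    return strip(requested) == strip(content_type)

# Example: a conforming server answering the wget request above
print(media_type_matches("application/rdf+xml",
                         "application/rdf+xml; charset=UTF-8"))  # True
print(media_type_matches("application/rdf+xml", "text/html"))    # False
```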

sebbader commented 4 years ago

Hi Christoph,

I usually use curl. But yes, I already have the command-line request prepared. I also have an .htaccess proposal prepared locally... I am just not yet confident that it will cope with all cases.

clange commented 4 years ago

Hmm, thinking about it, I think we actually need to change more, and it may be necessary to spin off further issues from this one. I see that widoco generates the ontology as an all-in-one file in various serializations (JSON-LD, RDF/XML, N-Triples and Turtle) – perfect. While it's not nice, when using slash URIs, to dump almost 1 MB of, e.g., Turtle on a client (no different from using hash URIs), it's legal and OK for now. So we have, e.g., https://industrialdataspace.github.io/InformationModel/docs/serializations/ontology.ttl. This is where URLs of the form https://w3id.org/idsa/* need to be redirected when text/turtle is requested. The same goes for the other serializations, and recall that application/rdf+xml is the common denominator for basic compliance.
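The redirect logic described here – pick the serialization file based on the requested media type – can be sketched as follows. The extension mapping is an assumption for this example, and q-values in the Accept header are ignored (first supported type wins), which a real implementation would need to handle:

```python
from typing import Optional

# Illustrative sketch of the Accept-header -> serialization-file mapping.
BASE = ("https://industrialdataspace.github.io/InformationModel"
        "/docs/serializations/ontology")

SERIALIZATIONS = {
    "application/rdf+xml": ".rdf",   # common denominator for basic compliance
    "text/turtle": ".ttl",
    "application/ld+json": ".jsonld",
    "application/n-triples": ".nt",
}

def redirect_target(accept_header: str) -> Optional[str]:
    """Return the redirect URL for the first supported media type
    listed in the Accept header, or None if none is supported."""
    for part in accept_header.split(","):
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in SERIALIZATIONS:
            return BASE + SERIALIZATIONS[media_type]
    return None

print(redirect_target("text/turtle"))
# -> https://industrialdataspace.github.io/InformationModel/docs/serializations/ontology.ttl
```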

A nicer solution would be the generation of one smaller downloadable file per serialization and per namespace. Recall that if a machine client requests, say, RDF/XML from <https://w3id.org/idsa/code/INTERFACE_DEFINITION>, it should be served a file that contains at least one triple (in practice: all triples that we have) with the subject <https://w3id.org/idsa/code/INTERFACE_DEFINITION>. A reasonable approach would be to redirect to <http://...github.io/path/to/idsa/code.rdf>, which would be a file generated from codes/*.ttl. (Oops – we are using "code" in the namespace URI and "codes" in the file system path!) The even nicer approach of redirecting to <http://...github.io/path/to/idsa/code/ContentType.rdf>, generated from codes/ContentType.ttl, would be hard to implement, as we'd have to work hard on mapping URIs to the files in which they are defined.
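The per-namespace redirect described above – including the "code" vs. "codes" naming mismatch – can be made concrete with a small sketch. The base URL is a placeholder and the function name is mine:

```python
from typing import Optional

# Placeholder for the actual github.io location of the generated files.
NAMESPACE_BASE = "https://example.github.io/path/to/idsa/"

def namespace_file(term_uri: str) -> Optional[str]:
    """Map a term URI such as
    https://w3id.org/idsa/code/INTERFACE_DEFINITION
    to the RDF/XML file covering its whole namespace (here: code.rdf).

    Note: the target file keeps the namespace name from the URI ("code"),
    even though the source directory in the repository is called "codes".
    """
    prefix = "https://w3id.org/idsa/"
    if not term_uri.startswith(prefix):
        return None
    namespace = term_uri[len(prefix):].split("/", 1)[0]
    return NAMESPACE_BASE + namespace + ".rdf"

print(namespace_file("https://w3id.org/idsa/code/INTERFACE_DEFINITION"))
# -> https://example.github.io/path/to/idsa/code.rdf
```

The even finer-grained variant (one file per defining source file, e.g. `code/ContentType.rdf`) would additionally need an index from each term URI to the file that defines it, which is the mapping effort the comment above alludes to.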

clange commented 4 years ago

BTW @sebbader, I'm not sure what you mean by "coping with all cases". "All cases", as I see them, are the following: requesting several content types from several URIs (of terms in different modules of the ontology). With http://linkeddata.uriburner.com:8000/vapour this can be automated beyond what curl and wget can do, plus you get useful error messages.

sebbader commented 4 years ago

Hello Christoph,

I totally agree that a fine-grained response is definitely the way to go. The only argument against it is the required effort: it would mean a complete rework of the whole repository, which is nothing we can do in the next few weeks. It might be a relevant task for the next release... For now, I'll focus on getting a proper result for any LD-compliant client. The received file is way too big for 95% of cases, sure.

clange commented 3 years ago

I just confirmed that we apparently fixed this a while ago (tested for application/rdf+xml, text/turtle, text/html) but forgot to close the issue.