uncefact / spec-jsonld

Exposing the UN/CEFACT vocabulary as web semantics
https://service.unece.org/trade/uncefact/vocabulary/uncefact/

semantic resolution and content negotiation; URL policy #31

Closed VladimirAlexiev closed 2 years ago

VladimirAlexiev commented 2 years ago

@nissimsan https://github.com/uncefact/vocab/issues/24#issuecomment-1023607802 shows a link: https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld which is broken. More importantly, the semantic link https://service.unece.org/trade/uncefact/vocabulary/uncefact/ should return different payloads using content negotiation.

As a minimum: HTML, JSONLD and Turtle.

It's not bad to also utilize extensions, eg https://service.unece.org/trade/uncefact/vocabulary/uncefact.html https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld https://service.unece.org/trade/uncefact/vocabulary/uncefact.ttl

but it's mandatory that the single semantic URL must return the same 3 content types using content negotiation.

Eg see how we did it for the Getty 5 years ago: http://vocab.getty.edu/doc/#Semantic_Resolution


It's crucial to design the semantic URLs in a reasonable way, in order to guarantee their longevity and permanence. Change ("break") them now so you won't have to break them in the future!

nissimsan commented 2 years ago

@VladimirAlexiev, https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld isn't broken. But the UN server had a scheduled down period over the weekend, that's probably what you were facing? Pls try again now. The topic of content negotiation has emerged before: https://github.com/edi3/edi3-json-ld-ndr/issues/50 The sad conclusion was that we couldn't do anything about it, as the UN server only offered us a dumb dropfolder and no means to configure content types. @kshychko, @kevinbish, @fak3 might http://vocab.getty.edu/doc/#Semantic_Resolution be an inspiration for a workaround?

+1 on breaking now, not later. @kevinbish, do we have any wriggle room regarding the "service" subdomain?

kevinbish commented 2 years ago

I will start the discussion with ISU/UNOG ICTS to see what server resources they can provide at the appropriate level.

VladimirAlexiev commented 2 years ago

@nissimsan @kevinbish Conneg is pretty much mandatory. You can start with individual files with extensions (no ext = HTML, with links to .jsonld and .ttl).

But if there's no chance Uncefact can add conneg for you, I think you're better off going back to edi3.org or eg w3id.org/uncefact. @mgh128 what do you think?

nissimsan commented 2 years ago

Good requirement.

Moving off UN servers is not politically possible. The UN logo comes with constraints. @kevinbish, cheering you on so that this can be worked out.

nissimsan commented 2 years ago

@kevinbish , assigning you since you already volunteered to run with this.

nissimsan commented 2 years ago

@kevinbish, we need your involvement on this, pls.

nissimsan commented 2 years ago

I raised the same request back in the day: https://github.com/edi3/edi3-json-ld-ndr/issues/48

nissimsan commented 2 years ago

@VladimirAlexiev, we're requesting support for this from the UN technical team. Any additional input you can provide us describing how this is done would be much appreciated.

mgh128 commented 2 years ago

https://httpd.apache.org/docs/current/content-negotiation.html provides some examples if you're using the Apache webserver.

https://www.iana.org/assignments/media-types/media-types.xhtml provides a complete list of Media Types registered with IANA

For example, you'll probably want to support:

text/turtle for .ttl files
application/ld+json for .jsonld files

The idea of content negotiation is that you don't need to specify the filename suffix when making the Web request but instead use the HTTP Accept: header to specify the media type(s) you prefer.

The webserver then uses that info (in combination with rules you have configured in its configuration files such as .htaccess) to match those media types with resources of particular filename suffix extensions and serve their content without necessarily changing the filename suffix appearing in the address bar of the browser.
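To make the matching step concrete, here is a minimal Python sketch of how a server might turn an Accept: header into a filename suffix. It is only an illustration, not how Apache or the UNECE server actually implements it: the names `SUFFIX_BY_MEDIA_TYPE`, `parse_accept` and `choose_suffix` are hypothetical, wildcard handling is reduced to `*/*`, and falling back to HTML for unknown types is an assumption (a real server might answer 406 Not Acceptable instead).

```python
# Hypothetical mapping from media type to filename suffix; the suffixes are
# the ones mentioned in this thread (.html, .jsonld, .ttl).
SUFFIX_BY_MEDIA_TYPE = {
    "text/html": ".html",
    "application/ld+json": ".jsonld",
    "text/turtle": ".ttl",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type without an explicit q parameter has q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_suffix(accept_header, default=".html"):
    """Pick the suffix of the variant to serve for a given Accept header."""
    for media_type, q in parse_accept(accept_header or ""):
        if q <= 0:
            continue
        if media_type in SUFFIX_BY_MEDIA_TYPE:
            return SUFFIX_BY_MEDIA_TYPE[media_type]
        if media_type == "*/*":
            return default
    # Assumption: fall back to HTML when nothing matches
    return default
```

With this sketch, `choose_suffix("application/ld+json")` selects the `.jsonld` variant, while a typical browser header such as `text/html,application/xhtml+xml,*/*;q=0.8` selects `.html`.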

Sorry if you were aware of some or all of the above - only trying to help

kevinbish commented 2 years ago

Would it be possible to configure this using only an .htaccess file with the correct rules in the directory https://service.unece.org/trade/uncefact/vocabulary/uncefact/ ?

(I will write to UNECE ISU, with you, @VladimirAlexiev and the UN/CEFACT Chair in copy, to request that the correct HTTP Content-Type header be returned for the directory; that may be the quicker way.)

IanWattAU01 commented 2 years ago

As a Bureau Vice Chair I will ensure this matter #31 is known at Bureau.

kevinbish commented 2 years ago

Thank you @IanWattAU01. Partly, this is one of those things in the UN system whereby the more high-level people ask for something, the quicker it happens.

The reason in part that this has taken so long to resolve is that I have no control of the technical resources and have to constantly battle ISU or UNOG daily for simple configuration/changes like this.

I will follow up with Maria on this.

kevinbish commented 2 years ago

UNECE server should now be returning the proper response.

[Image: header response]

mgh128 commented 2 years ago

Thanks - it's good that it's setting the Content-Type: and CORS headers correctly when requesting https://service.unece.org/trade/uncefact/vocabulary/rec20.jsonld

However, when I tested it just now (using Chrome extension ModHeader to specify an Accept: header value of application/ld+json when requesting the URL https://service.unece.org/trade/uncefact/vocabulary/rec20 ) I still received the HTML version rather than the rec20.jsonld version I'd hoped for.
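The same check can be scripted without a browser extension. The following is a rough Python sketch using only the standard library; the helper names `build_request` and `fetch_content_type` are hypothetical, and what the server returns will of course depend on how it is configured at the time.

```python
from urllib.request import Request, urlopen


def build_request(url, media_type):
    """Build a GET request that asks for one media type via the Accept header."""
    return Request(url, headers={"Accept": media_type})


def fetch_content_type(url, media_type, timeout=10):
    """Send the request and return the media type the server answered with."""
    with urlopen(build_request(url, media_type), timeout=timeout) as response:
        # get_content_type() returns the Content-Type without its parameters
        return response.headers.get_content_type()
```

Calling `fetch_content_type("https://service.unece.org/trade/uncefact/vocabulary/rec20", "application/ld+json")` should return `application/ld+json` once content negotiation works; if it returns `text/html`, the server is still ignoring the Accept: header.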

If you're using Apache webserver, there's some further info here: https://httpd.apache.org/docs/current/content-negotiation.html

Suppose your /vocabulary directory contains two resources:

rec20.jsonld
rec20.html

You could add a .htaccess file that looks like this:

AddHandler type-map .var
RewriteEngine on
RewriteRule ^rec20$ rec20.var

then also add a file rec20.var that contains the following:

URI: rec20; vary="type" 

URI: rec20.html
Content-type: text/html

URI: rec20.jsonld
Content-type: application/ld+json

The effect of this should be that a request for https://service.unece.org/trade/uncefact/vocabulary/rec20 will serve the HTML version usually (when the browser usually specifies Accept: text/html ) but instead serves the rec20.jsonld resource when the browser or requestor specifies Accept: application/ld+json .
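If the same pattern has to be repeated for several vocabulary files, the type map could be generated rather than written by hand. This is only a sketch: `build_type_map` is a hypothetical helper, and it simply mirrors the layout of the rec20.var example above rather than covering everything the Apache type-map format allows.

```python
def build_type_map(basename, variants):
    """Return the text of a type map (.var file) for one resource.

    variants maps a filename suffix to its media type,
    for example {".html": "text/html"}.
    """
    # First entry names the negotiated resource itself, as in the example above
    blocks = [f'URI: {basename}; vary="type"']
    for suffix, media_type in variants.items():
        blocks.append(f"URI: {basename}{suffix}\nContent-type: {media_type}")
    # Entries in a type map are separated by blank lines
    return "\n\n".join(blocks) + "\n"


# Reproduces the rec20.var example from this comment
REC20_TYPE_MAP = build_type_map(
    "rec20",
    {".html": "text/html", ".jsonld": "application/ld+json"},
)
```

Adding a Turtle variant would, under the same assumptions, just mean passing an extra `".ttl": "text/turtle"` entry, provided a matching rec20.ttl file exists in the directory.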

There are probably more elegant ways to do this - and ideally you'd avoid using .htaccess if some of this can be configured in the main webserver configuration file, e.g. httpd.conf .

kevinbish commented 2 years ago

OK, I will speak with UNOG ISU.

nissimsan commented 2 years ago

@kevinbish, kind reminder

kshychko commented 2 years ago

The CloudFront setup is in progress, but below is an example showing that content negotiation works with AWS CloudFront + a Lambda@Edge function + S3 storage:

curl -i -o - --silent -X GET "https://dmvc7xzscpizo.cloudfront.net/Consignment" --header "Accept:text/html" && echo
curl -i -o - --silent -X GET "https://dmvc7xzscpizo.cloudfront.net/Consignment" --header "Accept:application/ld+json" && echo

nissimsan commented 2 years ago

We did this.