uncefact / spec-jsonld

Exposing the UN/CEFACT vocabulary as web semantics
https://service.unece.org/trade/uncefact/vocabulary/uncefact/

use `#` or `/` rather than `/#` #25

Closed VladimirAlexiev closed 1 year ago

VladimirAlexiev commented 2 years ago

@nissimsan

All of the vocabs use a double-char namespace delimiter: /# .

W3C and the Linked Data Patterns book (e.g., https://patterns.dataincubator.org/book/hierarchical-uris.html) have guidance on "slash vs hash" URLs. But nobody recommends using both together.

The recommendation is to use slash for large collections of terms and hash for smaller collections (since a client doesn't send the anchor after hash, so it'd get the whole collection at once).
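The point about the anchor not being sent can be sketched in a few lines of Python. This is only an illustration: the `example.org` URIs and the `request_target` helper are hypothetical and not part of the UN/CEFACT setup. It shows which part of a term URI a client would put in its HTTP request and which part stays on the client.

```python
from urllib.parse import urldefrag


def request_target(term_uri: str) -> tuple[str, str]:
    """Split a term URI into (URL sent to the server, fragment kept by the client)."""
    url, fragment = urldefrag(term_uri)
    return url, fragment


# Hypothetical term URIs, used only for illustration.
hash_term = "https://example.org/vocab#carrierParty"
slash_term = "https://example.org/vocab/carrierParty"

# Hash URI: the server only ever sees the vocabulary document URL.
print(request_target(hash_term))
# Slash URI: the server sees the individual term, so it can answer with a small document.
print(request_target(slash_term))
```

With a hash URI, every term in the collection resolves to the same request, which is why the whole collection comes back at once.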

But please consider #26

nissimsan commented 2 years ago

+1

VladimirAlexiev commented 2 years ago

@nissimsan I don't see /# in uncefact.jsonld, what's up? Guess the web page URLs are made independently of the JSONLD URLs?

nissimsan commented 2 years ago

Good catch. So this might actually just be a surface level issue.

@kshychko, @fak3 - thoughts?

nissimsan commented 2 years ago

Agreed, /# is wrong. Any instance of this should be changed to #.

@kshychko will reach out to her colleague who did the static web publishing. Hopefully this is not too hard to fix - it does need fixing.

nissimsan commented 2 years ago

dr-shorthair, mgh128, this is the ticket for the / vs # discussion.

Everyone, it is suggested that we go with / in order to serve smaller files (instead of fragments of the full list): https://github.com/uncefact/vocab/issues/24#issuecomment-1073389923

I would agree that we generally have a problem with the size of the files we serve. However, this seems like a pretty fundamental change which I would be hesitant to support given our resources (good vs perfect, and all that). Down the road, when the code is available, I would approve a PR which does this, but for now I think we should focus and get this closed with minimum effort and implications.

nissimsan commented 2 years ago

https://www.w3.org/TR/cooluris/

^ Good resource for direction on these sorts of matters.

VladimirAlexiev commented 2 years ago

#26 is more relevant as to why / should be used.

@nissimsan Is it possible to fix URLs to use / now even if they'll be published as big pages with many resources?

nissimsan commented 2 years ago

@kshychko thinks this will be solved as part of https://github.com/uncefact/vocab/issues/26

TallTed commented 2 years ago

@nissimsan @VladimirAlexiev

The double-delimiter /# only shows up following dereferencing redirects from the vocabulary terms (e.g., https://service.unece.org/trade/uncefact/vocabulary/uncefact#carrierParty) to their HTML-based, human-facing description page (e.g., https://service.unece.org/trade/uncefact/vocabulary/uncefact/#carrierParty).

This is odd, but legal. (The /# is a semi-conventional shorthand for /index.html# that relies on the setup of the HTTP server hosting that page. If the HTML filename changes, or if the setup of that HTTP server changes, the /# links may cease to function ... but hopefully all admins and publishers working there will recognize that.)
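How the /# form can arise from a plain trailing-slash redirect can be sketched as follows. HTTP user agents carry the original fragment over when the redirect's Location value has none. The `follow_redirect` helper and the Location value used below are assumptions for illustration; the server's actual redirect response is not quoted in this thread.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def follow_redirect(requested: str, location: str) -> str:
    """Resolve a redirect roughly as a user agent would, inheriting the fragment."""
    original = urlsplit(requested)
    # Resolve a possibly relative Location against the requested URI.
    target = urlsplit(urljoin(requested, location))
    # If Location carries no fragment, the original fragment is inherited.
    if not target.fragment and original.fragment:
        target = target._replace(fragment=original.fragment)
    return urlunsplit(target)


term = "https://service.unece.org/trade/uncefact/vocabulary/uncefact#carrierParty"
# Assumed Location header: the same path with a trailing slash added.
print(follow_redirect(term, "/trade/uncefact/vocabulary/uncefact/"))
```

Under that assumption the result is the /# page URL, while the term identifier itself keeps the plain # form.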

If you look at the uncefact.jsonld, where the terms are clearly identified, you will see only # delimiters.

You may also notice this @context heading that JSON-LD file, whose contents are necessary to fully understand or interpret the @id and @type CURIE values, as well as any other CURIEs on the human-facing HTML page --

  "@context": {
    "schema": "http://schema.org/",
    "uncefact": "https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "cefact": "https://edi3.org/cefact#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dc": "http://purl.org/dc/elements/1.1/"
  },

This would all be made far more clear if the human-facing HTML-based term description page included the FULL URIs for those terms in their @id values (e.g., https://service.unece.org/trade/uncefact/vocabulary/uncefact#carrierParty), rather than the current CURIEs (e.g., uncefact:carrierParty) which I believe should be considered to be incorrect, as they use a PREFIX element (uncefact:) which is not defined anywhere visible.
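As a rough illustration of why the prefix definition matters, here is a minimal sketch of CURIE expansion against a prefix map. It is a simplification, not the full JSON-LD expansion algorithm, and `expand_curie` is a hypothetical helper. The `uncefact` prefix value below is an assumption chosen to match the term URIs quoted in this thread.

```python
def expand_curie(curie: str, context: dict[str, str]) -> str:
    """Expand 'prefix:local' to a full URI; return the input unchanged if the prefix is unknown."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return curie


# Assumed prefix map, modelled on the @context shown above.
context = {
    "uncefact": "https://service.unece.org/trade/uncefact/vocabulary/uncefact#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

print(expand_curie("uncefact:carrierParty", context))
# Without a visible prefix definition, the CURIE cannot be resolved.
print(expand_curie("uncefact:carrierParty", {}))
```

A reader of the HTML page is in the position of the second call: the CURIE is all they have.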

(If the full-length URIs are considered problematic, and it is decided that CURIEs must be used in the description blocks, then I strongly advise that at minimum all CURIEs be used as the human-facing text of <a ... /> elements with href values of the full URI.)
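One way a page generator might follow that advice is sketched below: the CURIE stays as the visible link text and the full URI goes into href. The `curie_link` helper and the prefix map are hypothetical; nothing here describes how the actual site is built.

```python
from html import escape


def curie_link(curie: str, context: dict[str, str]) -> str:
    """Render a CURIE as an HTML link whose href is the expanded, full URI."""
    prefix, _, local = curie.partition(":")
    # Raises KeyError for an undefined prefix, so the gap is noticed at build time.
    full_uri = context[prefix] + local
    return f'<a href="{escape(full_uri, quote=True)}">{escape(curie)}</a>'


# Assumed prefix value, matching the term URIs quoted in this thread.
context = {"uncefact": "https://service.unece.org/trade/uncefact/vocabulary/uncefact#"}

print(curie_link("uncefact:carrierParty", context))
```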

(I think the CoolURIs article is not relevant here, until and unless there is affirmative action on @VladimirAlexiev's suggestion and @nissimsan's later concurrence to change from the current "hash-URIs" like https://service.unece.org/trade/uncefact/vocabulary/uncefact#carrierParty to "slash-URIs" like https://service.unece.org/trade/uncefact/vocabulary/uncefact/carrierParty#term. Such a change would violate the tenet that "Cool URIs Don't Change", and it isn't mandatory, even given the huge volume of data supplied when dereferencing https://service.unece.org/trade/uncefact/vocabulary/uncefact#carrierParty. The existing redirect to https://service.unece.org/trade/uncefact/vocabulary/uncefact/#carrierParty could just become a redirect to https://service.unece.org/trade/uncefact/vocabulary/uncefact/carrierParty or https://service.unece.org/trade/uncefact/vocabulary/uncefact/carrierParty.html, which holds the description of only the single term, https://service.unece.org/trade/uncefact/vocabulary/uncefact/#carrierParty.)

nissimsan commented 2 years ago

@TallTed, thank you so much! You make a very convincing argument.

First, there is affirmative action on switching to "slash-URIs". The v1 label on this ticket means we treat this as mandatory before formally switching from "Draft" to "v1" of the UN/CEFACT vocab. There are minor uncertainties about our level of hosting flexibility, but it is absolutely our ambition to have this solved before the next UN/CEFACT Forum in October. "Cool URIs Don't Change" applies after that point.

Now, where your argument changes my mind is regarding usage of these terms. Without playing back all your arguments, I agree that in fact already today, https://service.unece.org/trade/uncefact/vocabulary/uncefact#carrierParty is the correct URI; https://service.unece.org/trade/uncefact/vocabulary/uncefact/#carrierParty is not.

Thank you for taking the time to explain this thoroughly, Ted! Much appreciated!

nissimsan commented 2 years ago

This is getting fixed with the v1 approach: https://dmvc7xzscpizo.cloudfront.net/classes

mgh128 commented 2 years ago

Can I ask what progress has been made on what was discussed here and in #24 about having a consistent Web URI stem (ending in /) to which any Rec20 unit of measure code such as 'A60' can be appended, to resolve to information about that unit (in this case Watt), including conversion factors? I'm asking because I'm still seeing

https://service.unece.org/trade/uncefact/vocabulary/rec20/#watt

but not yet seeing anything like

https://service.unece.org/trade/uncefact/vocabulary/rec20/A60

providing a useful (Linked Data + human-readable) response, to help anyone to easily decipher a rather cryptic 2-3 character Rec20 code without needing to resort to a lookup in an Excel spreadsheet.

Is something like this still in the pipeline? If it has already been implemented, can you please point us to one or two URI examples wherever they are?

nissimsan commented 2 years ago

Hi @mgh128,

Yes, this is done already, please have a look: https://dmvc7xzscpizo.cloudfront.net/rec20#A60

The code for Watt is WTT: https://dmvc7xzscpizo.cloudfront.net/rec20#WTT

Note that we decided to use # for codelists.

As with most else around here we're eagerly waiting for the UN DNS to be updated. Once that is done, the formal URI will be for example: https://vocabulary.uncefact.org/rec20#WTT
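For consumers who hold only a bare code, the hash pattern for codelists could be applied as in this sketch. The `codelist_term_uri` helper is hypothetical, and the default base is the formal host mentioned above, which was still waiting on the DNS update at the time of this comment.

```python
from urllib.parse import quote


def codelist_term_uri(codelist: str, code: str,
                      base: str = "https://vocabulary.uncefact.org/") -> str:
    """Build '<base>/<codelist>#<code>' following the hash pattern chosen for codelists."""
    # Percent-encode the code so unexpected characters cannot break the fragment.
    return f"{base.rstrip('/')}/{codelist}#{quote(code, safe='')}"


print(codelist_term_uri("rec20", "WTT"))
# The interim CloudFront host from this thread can be passed as the base instead.
print(codelist_term_uri("rec20", "A60", base="https://dmvc7xzscpizo.cloudfront.net"))
```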

nissimsan commented 1 year ago

We did this