uncefact / spec-jsonld

Exposing the UN/CEFACT vocabulary as web semantics
https://service.unece.org/trade/uncefact/vocabulary/uncefact/

Forking the vocab graph #67

Closed: nissimsan closed this issue 2 years ago

nissimsan commented 2 years ago

Recent issues (such as https://github.com/uncefact/vocab/issues/42, https://github.com/uncefact/vocab/issues/48 and https://github.com/uncefact/vocab/issues/41) raise bugs which would be trivial to fix directly in the vocab graph, but demand complex special-case treatment in the current NDR/programmatic approach. This raises the natural question: what is the value of sticking with the constraint of manipulating the vocab only programmatically? It was a natural place to start, but now that we have a more mature vocab to work from, allowing ourselves to make updates directly in the graph would significantly simplify the path forward.

One aspect to consider is future updates, which would have to switch from "transform and replace all" to "transform increments only, merge into the graph". In theory at least this could be seen as more difficult; however, with the recent influx of issues, a fully automated "do everything" script no longer seems practically possible.
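For illustration, a minimal sketch of what "transform increments only, merge into the graph" could look like using Python's rdflib (file names are hypothetical; this is not part of the project's tooling). Merging RDF graphs is a set union of triples, so the real difficulty is deciding which side wins when a regenerated triple contradicts a manual fix:

```python
from rdflib import Graph

# Hypothetical file names, for illustration only.
curated = Graph().parse("vocab-curated.ttl", format="turtle")
increment = Graph().parse("generated-increment.ttl", format="turtle")

# An RDF merge is a set union of triples: += adds every triple from
# the increment, and duplicates are ignored automatically.
curated += increment

# Note: a plain union cannot resolve conflicts. If the generated
# increment re-introduces a triple that was fixed by hand, it has to
# be subtracted explicitly (curated -= unwanted) or caught in review.

curated.serialize(destination="vocab-merged.ttl", format="turtle")
```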

I suggest we pivot away from the constraint we have unnecessarily forced upon ourselves, and allow ourselves to manipulate the vocab graph directly as well.

VladimirAlexiev commented 2 years ago

@nissimsan This is a very deep question, because it is core to the governance and even the "value" of this project. If it doesn't stick to the "traditional" UN/CEFACT schemas and their NDRs (with all their legacy baggage), then it becomes "interpretation of the Gospel", not the Gospel itself.

I vote for modernization (thus "breaking with tradition" where warranted), but what do the others think?

nissimsan commented 2 years ago

@VladimirAlexiev, noting that we are in agreement about the direction. I love your analogy!

After having thought about this exact problem for several years, I find it increasingly less controversial:

  1. Interpretation is forced upon us by the technological change, and we are bringing everything over (the CCTS layers don't fit particularly well onto the RDF graph, but we did fit them in).
  2. This is entirely a matter of governance (to your point). But it is governance of a different problem domain than what UN/CEFACT is otherwise dealing with: supporting decentralized use cases and implementations rather than defining them.
  3. Perhaps "forking" isn't the right word to use. I mean it as a technical term; I'm not suggesting that the UN vocab is never updated, only changing how it is updated (the degree of automation, basically).
  4. Stability is key to point 2, and the maturity of the CEFACT model is among its strongest assets; both make a rapid update frequency difficult (and weird!) to justify.

I suggest we start today's meeting by discussing this topic, fleshing out the pros and cons, and making a decision so we can push forward. It seems pretty clear that we are in a holding pattern until this gets sorted.

nissimsan commented 2 years ago

Discussed on today's call: https://github.com/uncefact/vocab/pull/69

svanteschubert commented 2 years ago

I love UN/CEFACT's twice-yearly release-train model (as an interim solution until our sci-fi dream comes true: all dependent software generated instantly, overnight, by some CI/CD process from the blueprint/specification). Every standard (being the blueprint for software) should learn from that approach. On the other hand, yes, the semantics certainly should not change in general; but as usual, a fair balance has to be struck between backwards compatibility and fixing naturally occurring human mistakes.

Only this centralized semantic model, agreed upon by everyone worldwide, gives us the ability to throw data over the digital fence at any given time, so that the receiver can process it automatically (without human interaction or bilateral agreements). This is the core of digitalization.

I am new to this project and surely have not understood all the implications of this issue, yet I still tend towards generating the "syntax" of JSON-LD from a shared, centralized model. If this is a utopian dream, I would like to understand the problems in more detail. Thanks for the interesting web conference today. I (will) be back! :-)
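To make that idea concrete: a minimal sketch (not the project's actual pipeline) of emitting JSON-LD syntax from a centrally maintained RDF model, using Python's rdflib. The namespace form and the triples are assumptions for illustration only:

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

# Namespace URI assumed for illustration; the published vocabulary
# may use a different form.
UNECE = Namespace("https://service.unece.org/trade/uncefact/vocabulary/uncefact#")

g = Graph()
g.bind("unece", UNECE)

# Illustrative term definition; in this project the vocabulary is
# generated from the UN/CEFACT model, not hand-written like this.
g.add((UNECE.referencedDocument, RDF.type, RDF.Property))
g.add((UNECE.referencedDocument, RDFS.label,
       Literal("Referenced Document", lang="en")))
g.add((UNECE.referencedDocument, RDFS.comment,
       Literal("A document referenced by this document.", lang="en")))

# rdflib >= 6 ships a JSON-LD serializer; auto_compact applies the
# bound prefixes to produce a compacted JSON-LD document.
print(g.serialize(format="json-ld", auto_compact=True))
```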

nissimsan commented 2 years ago

Good pushback at today's meeting - thank you (especially @cmsdroff and @svanteschubert). We are debating the nature of this project, and I fully accept your viewpoints.

The realistic alternative to what I suggest in this issue is to distinguish the feedback we get on tickets between "Content" and "NDR". Everything content-related (for example https://github.com/uncefact/vocab/issues/42) will be fed back to the relevant CEFACT modelling teams; we wouldn't even try to fix, for example, referenceDocument vs referencedDocument.

Only issues dealing with transformation, syntax, and other NDR aspects (for example https://github.com/uncefact/vocab/issues/25) will actually be dealt with within the scope of the vocab project.

nissimsan commented 2 years ago

Based on the above, I just took a quick pass through the current issues, tagging the relevant ones with either "NDR" or "semantics".

nissimsan commented 2 years ago

After further thought, digesting last meeting's input, I'm fine to abandon this idea. I suggest we aim to make a quick decision on Friday and get this closed.

cmsdroff commented 2 years ago

@nissimsan For info, some of the same questions have been asked by the API JSON Schema export tech spec team.

Not sure who is on that team or how we should align with this, but it feels like passing the semantic issues back is the right approach.

I'm in the Library Maintenance team call, but it will likely be rolled over to next week due to the volume of submissions. I'll follow this up with the library submission team.

nissimsan commented 2 years ago

Decision made - we will not fork, but rather pass back semantic issues to modelling teams. Closing ticket.