Closed svanteschubert closed 1 year ago
Please allow me to drop some basic questions and draft ideas that I have not yet had the time to discuss and distil into a far shorter and more didactic comment.
A language (or semantic) becomes more powerful the more it is used by others! Our overall goal should be to unite and cooperate with other international groups on sharing semantics (transforming syntaxes) and on specifying semantics in an easy and interoperable way.
When I look at https://service.unece.org/trade/uncefact/vocabulary/uncefact/, the types seem to be autogenerated. The only information software developers can extract/reuse for search (and transformation) are
Before starting a new URL that no one has used before (and that UBL will never use for political reasons), like https://service.unece.org/trade/uncefact/vocabulary/uncefact/#CargoInsurance, we might want to consider using existing references for our tags. For instance, we could reuse the following two publicly available URLs as tags (or types) instead: https://en.wikipedia.org/wiki/Insurance https://en.wikipedia.org/wiki/Transport
UBL people would likely be far more willing to join the "shared semantic water" with us! Some might say this is too vague and not under our control, but let's be honest: this would be far better than what we have now and will ever have! :-) And these are exactly the URLs I would personally look up and reference if I did not understand a term!
Finally, how would we ask for "CargoInsurance" at the UN/CEFACT Birthday party, playing the yes/no question & answer game?
You might realize that the semantic identification problem has now been simplified by dividing it into these yes/no questions, which allows software developers to sort the data easily with data structures like binary trees. There are follow-up problems and iterations on this, but the results would be far more reusable for software engineers.
@svanteschubert , a very capable search function. Pls give that a go. If you still miss features, don't be shy to re-open or raise another issue. 👍
First of all, this is just a quick draft of an idea that I mentioned in today's meeting. It certainly needs some iterations and clean-up. Please show me some mercy if it is hard to read and not easy to understand. If that is too much to ask, just ping me directly and let's discuss it during a tea break. Things can evolve much faster in quick dialogue. This idea comes from endless hours in CEN TC 434 WG1 discussing over and over again whether a new semantic should really be part of the EU core invoice or should rather become an extension.
Problem
The UN/CEFACT editor Gerhard Heemskerk explained to me that one of the editor's main tasks is to prevent existing data from being added twice (in different places). The problem, as it seems to me, is locating the data easily. A similar problem is familiar to everyone who sorts files into directories, creating a tree of directories in which to find (or place) the correct file. After some time there are files that might belong in multiple directories. One way of solving this is to tag the files with all the fields they belong to and to do a dynamic sorting afterwards.
Another use case comes from customers of the UN/CEFACT data who would like to find existing data by running a query, for example: which transport containers does UN/CEFACT already identify, and what are their volumes?
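The tagging approach described above could be sketched roughly as follows. This is only a minimal illustration, not an existing UN/CEFACT tool: the entry names other than `CargoInsurance`, the tag names, and the helper functions `build_tag_index` and `find_by_tags` are all made up for the example.

```python
from collections import defaultdict

def build_tag_index(items):
    """Map each tag to the set of entry names carrying that tag."""
    index = defaultdict(set)
    for name, tags in items.items():
        for tag in tags:
            index[tag].add(name)
    return index

def find_by_tags(index, *tags):
    """Return the entries that carry ALL of the given tags."""
    if not tags:
        return set()
    # .get() avoids creating empty index entries for unknown tags.
    return set.intersection(*(index.get(tag, set()) for tag in tags))

# Hypothetical entries and tags, for illustration only.
items = {
    "CargoInsurance": {"insurance", "transport"},
    "TransportEquipment": {"transport", "container"},
    "InsurancePolicy": {"insurance"},
}

index = build_tag_index(items)
matches = find_by_tags(index, "insurance", "transport")
print(matches)
```

An entry can carry any number of tags, so it never has to be forced into a single "directory"; the sorting happens dynamically at query time.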
Suggestion of Solution
Remember the children's game in which one child thinks of something and the other tries to guess it by asking questions that can only be answered with yes or no?
Similarly, UN/CEFACT data can be classified by such metadata (a binary bit representing yes/no). From a software standpoint, the question/answer pairs can be viewed as traversing a binary tree. Depending on the answer to each yes/no question, the left or the right branch is taken.
How about adding boolean classifications/types to the UN/CEFACT data, which would allow domain experts to traverse the data to find what they are looking for, and to check whether data already exists before asking for it to be added?
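To make the yes/no idea a bit more concrete, here is a minimal sketch of such a binary tree. Everything in it is an assumption for illustration: the question names (`is_insurance`, `is_transport`), the entries other than `CargoInsurance`, and the helpers `build_tree` and `lookup` do not exist in the UN/CEFACT vocabulary or tooling.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    question: Optional[str] = None   # None marks a leaf
    yes: Optional["Node"] = None     # branch taken on a "yes" answer
    no: Optional["Node"] = None      # branch taken on a "no" answer
    entries: tuple = ()              # entry names stored at a leaf

def build_tree(entries, questions):
    """Build a binary tree from entries (name -> {question: bool})."""
    if not questions or len(entries) <= 1:
        return Node(entries=tuple(sorted(entries)))
    question, rest = questions[0], questions[1:]
    # A missing answer is treated as "no" in this sketch.
    yes = {n: a for n, a in entries.items() if a.get(question, False)}
    no = {n: a for n, a in entries.items() if not a.get(question, False)}
    return Node(question=question,
                yes=build_tree(yes, rest),
                no=build_tree(no, rest))

def lookup(tree, answers):
    """Follow the yes/no answers down the tree and return the leaf entries."""
    node = tree
    while node.question is not None:
        node = node.yes if answers.get(node.question, False) else node.no
    return node.entries

# Hypothetical boolean classifications, for illustration only.
entries = {
    "CargoInsurance": {"is_insurance": True, "is_transport": True},
    "HealthInsurance": {"is_insurance": True, "is_transport": False},
    "TransportMeans": {"is_insurance": False, "is_transport": True},
}

tree = build_tree(entries, ["is_insurance", "is_transport"])
found = lookup(tree, {"is_insurance": True, "is_transport": True})
print(found)
```

A domain expert who wants to add a new entry would answer the same questions first; a non-empty result means that something with the same classification already exists and should be reviewed before anything new is added.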