question on place name, gs1:AI, SGLN extension

gs1 / WebVoc

GS1 Web vocabulary development site

Apache License 2.0

question on place name, gs1:AI, SGLN extension #35

Open VladimirAlexiev opened 2 years ago

VladimirAlexiev commented 2 years ago

Hi @philarcher and @mgh128 , we're working on LD examples for GS1 Forum 2022-02 and have some questions:

<https://id.gs1.org/414/6991230000124> a gs1:Place, schema:BoatTerminal;
  schema:name "Guangzhou port";  ## 1
  gs1:globalLocationNumber "6991230000124"^^gs1:AI; ## 2
  gs1:address <https://id.gs1.org/414/6991230000124/address>;
  gs1:geo <geo:22.997018,113.530455>.

<https://id.gs1.org/414/6991230000124/address> ## 5
  a gs1:PostalAddress;
  gs1:addressCountry <https://example.org/iso3166/CN-GD>;
  gs1:addressLocality "Guangzhou city";
  gs1:addressRegion "Guangdong province".

<https://id.gs1.org/414/6991230000124/254/123> a gs1:Place, ex:BoatBerth;
  schema:name "Berth 123 at Guangzhou port"; ## 1
  gs1:globalLocationNumber "6991230000124(254)123"^^gs1:AI; ## 3
  schema:containedInPlace <https://id.gs1.org/414/6991230000124>. ## 4

for spots 1 and 4, there are no appropriate gs1: props and I should use schema:, right?
for spot 2, is this the correct way to use ^^gs1:AI ?
for spot 3, is this the correct way? (254) is the AI for "SGLN extension" but using it together with the two IDs in the same string doesn't seem right?
for spot 5, I hope that for subsidiary objects it's OK to use a DL URL as a prefix, then append some word? Should I use # instead of / for this case?

Cheers!

mgh128 commented 2 years ago

Hi @VladimirAlexiev , @philarcher

@philarcher - please see especially my response to ##5 below - and correct me if you disagree with my suggestion to use the # fragment of a URI as a safer way to construct additional Web URIs derived from GS1 Digital Link URI stems.

Re ##1 and ##4, you can currently use schema:name . When we finish the preparations to submit the work request to update the GS1 Web vocabulary with the extensions to the GLN data model, we expect that the corresponding properties gs1:physicalLocationName and gs1:digitalLocationName will be added, complementing the existing gs1:organizationName (which only applies to instances of gs1:Organization).

Re ##2, I don't think this is correct use of ^^gs1:AI. Like its counterpart, schema:globalLocationNumber , gs1:globalLocationNumber only expects a plain xsd:string - not an instance of the data type gs1:AI. The only property within the GS1 Web vocabulary having a rdfs:range of gs1:AI is gs1:applicableTo - though currently no subproperties of gs1:linkType appear to make use of this. The data type gs1:AI is used as values of the skos:notation property for annotating various properties (such as gs1:bestBeforeDate etc.) to indicate an equivalence to a specific GS1 Application Identifier.

Re ##3, I'd have written it as gs1:globalLocationNumber "6991230000124"; As far as I recall, the pending updates for the GS1 Web vocabulary arising from the extended GLN data model do not add any property for expressing the (254) extension to physical location GLN. I think we probably intend(ed) to add such a property as part of the GS1 Digital Link Semantics work where we were trying to ensure that as far as possible, most GS1 Application Identifiers could map to a corresponding property (and possibly also inferred class) within the GS1 Web vocabulary. Currently we appear not to have supported (254).

Re ##5, I'm not sure that I'd encourage that approach. I understand that you want to avoid using a blank node and may want to refer to that node again elsewhere. Your proposal of <https://id.gs1.org/414/6991230000124/address>, in which you extend the URI path with an additional component to express a property, is likely to confuse users of GS1 Digital Link and is also likely to be rejected by GS1 Digital Link resolvers. The reason is that the GS1 Digital Link URI syntax uses this pattern for a partially compressed GS1 Digital Link: the final component of the URI path is the compressed string (consisting of characters from the file-safe base64 alphabet as per RFC 4648 section 5 https://datatracker.ietf.org/doc/html/rfc4648#section-5 ), and the two components that precede it indicate a primary GS1 identification key and its value, which remain uncompressed. So in your example, an implementation of GS1 Digital Link compression might attempt to decompress 'address', though it is unlikely to obtain anything meaningful, since most compressed strings look fairly random when the binary data is encoded using file-safe base64. As far as I remember (@philarcher please correct me), the GS1 Digital Link syntax standard remains silent on the use of a # fragment within a URI, so potentially <https://id.gs1.org/414/6991230000124#address> might be acceptable and at least should not conflict with the URI pattern used for partially compressed GS1 Digital Link URIs.
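The structural difference between the two URI shapes can be sketched with Python's standard urllib.parse. This is only an illustration, not part of any GS1 toolkit, and the helper name path_segments is hypothetical. It shows that appending /address adds a third path segment, which is the position where a partially compressed GS1 Digital Link carries its compressed string, whereas #address leaves the path untouched and sits in the fragment, which HTTP clients do not send to the server.

```python
from urllib.parse import urlsplit, urldefrag


def path_segments(uri):
    """Return the non-empty path segments of a URI (hypothetical helper)."""
    return [seg for seg in urlsplit(uri).path.split("/") if seg]


slash_uri = "https://id.gs1.org/414/6991230000124/address"
hash_uri = "https://id.gs1.org/414/6991230000124#address"

# The slash variant has three path segments: key, value, and an extra one.
slash_segments = path_segments(slash_uri)

# The hash variant keeps the plain key/value path.
hash_segments = path_segments(hash_uri)

# urldefrag splits off the part that stays on the client side.
base, fragment = urldefrag(hash_uri)

print(slash_segments)
print(hash_segments)
print(base, fragment)
```

With the hash form, a resolver only ever receives the plain GLN-based URI, so the extra name cannot be mistaken for a compressed string.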

Thanks for asking - and for suggesting potential improvements. We're hoping to submit the work request to add the GLN-related extensions in January 2022, so potentially those could be available just in time for the GS1 Global Forum - and we can put the preview draft (in JSON-LD and Turtle) in this repository when we submit the work request, although there might be some (hopefully only minor) tweaks by the GMD SMG.

justin2004 commented 2 years ago

@mgh128

"we expect that the corresponding properties gs1:physicalLocationName and gs1:digitalLocationName will be added, complementing the existing gs1:organizationName (which only applies to instances of gs1:Organization)."

What is the appeal of making such specific properties when you can write queries with triple patterns like the following?

?s schema:name ?name ;
    a gs1:Place .

?s schema:name ?name ;
     a gs1:Organization .

When I see something so specific as gs1:physicalLocationName it makes me think someone will think they need ex:secondCousinTwiceRemovedName.

Having a name is having a name no matter what kind of thing you are, right? When you have the predicates do so much work (fixing the relationship and the domain) you lose some ability to generalize in your queries.
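As a rough sketch of that argument in plain Python (no RDF library; the subject identifiers ex:port and ex:acme and the name "Example Org" are made up for illustration), a single generic name property combined with a type check covers both the class-specific and the fully general query:

```python
# Toy triple store; the subjects and the organisation name are hypothetical.
triples = [
    ("ex:port", "rdf:type", "gs1:Place"),
    ("ex:port", "schema:name", "Guangzhou port"),
    ("ex:acme", "rdf:type", "gs1:Organization"),
    ("ex:acme", "schema:name", "Example Org"),
]


def names_of(data, rdf_class=None):
    """Names of all subjects, optionally restricted to instances of one class."""
    typed = {s for (s, p, o) in data if p == "rdf:type" and o == rdf_class}
    return sorted(
        o
        for (s, p, o) in data
        if p == "schema:name" and (rdf_class is None or s in typed)
    )


place_names = names_of(triples, "gs1:Place")  # class-specific query
all_names = names_of(triples)                 # generic query, any class

print(place_names)
print(all_names)
```

With class-specific name properties, the generic query would instead need one branch per property.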

mgh128 commented 2 years ago

Hi @justin2004

Specialised properties can be useful for disambiguation between related properties. I've provided a few examples below.

For example, a product may have a familiar name (e.g. 'Disprin') created by its brand owner ( expressed via https://www.gs1.org/voc/productName ) but it might also have a functional name about its purpose (e.g. 'pain relief / anti-inflammatory') ( expressed via https://www.gs1.org/voc/functionalName ) and in some cases may also have a regulated product name ( https://www.gs1.org/voc/regulatedProductName ), which in this example might be something like 'dispersible/soluble aspirin tablets'. All three of these have their uses. For some pharmaceuticals it may be a regulatory requirement to state the regulated product name. There is also value in expressing the functional name, to make the product discoverable by anyone searching for products that match a particular purpose or generic name or category.

Organisations may have a registered legal name as well as a trading name. For example, in the UK, the train operating company with the registered legal name London Eastern Railway Limited ( http://data.companieshouse.gov.uk/doc/company/04955356 ) was originally trading as 'One' then later trading as 'National Express East Anglia'.

Whereas schema.org has a single weight-specifying property for a product, https://schema.org/weight, the GS1 Web vocabulary provides three or four:

https://www.gs1.org/voc/grossWeight
https://www.gs1.org/voc/netWeight
https://www.gs1.org/voc/drainedWeight
https://www.gs1.org/voc/netContent

So if you are trying to provide the weight details for a jar of pickled onions, as a consumer you might be most interested in the gs1:drainedWeight (the onions), less interested in the gs1:netWeight (the onions and the vinegar) and probably only interested in the gs1:grossWeight (the onions, the vinegar and the glass jar) if you needed that value to calculate the postal charges or were concerned about the weight of your shopping bags if planning to buy multiple jars.

The jar of pickled onions is a good example where there can be a significant difference between these values. For example, a drained weight of 220g but a net weight of 440g and a gross weight of 670g.

Likewise, for dimensions, the GS1 Web vocabulary provides separate properties for 'in package' depth, height, width, diameter as well as for 'out of package' depth, height, width and diameter. In contrast, schema.org does not make this distinction - so you don't really know whether a schema.org/width includes the width of the packaging material or not.

Where the GS1 Web vocabulary provides more specialised properties, we have usually declared these to be rdfs:subPropertyOf the corresponding generic property defined by schema.org - and through this, it's easy to SPARQL CONSTRUCT / INSERT an extra set of triples - although if several of our related specialised properties were specified simultaneously, you might be confused if you find that the inferred schema:weight is multi-valued for that jar of pickled onions.
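A minimal sketch of that inference, written in plain Python rather than SPARQL so that it runs without an RDF library. The subproperty mapping below is an assumption made for illustration (the published vocabulary is the authority on the actual rdfs:subPropertyOf declarations), ex:pickledOnions is a hypothetical identifier, and the weights are the pickled onion figures quoted above. It shows how materialising the generic property yields a multi-valued schema:weight.

```python
# Assumed mapping, for illustration only; not copied from the GS1 Web vocabulary.
SUB_PROPERTY_OF = {
    "gs1:grossWeight": "schema:weight",
    "gs1:netWeight": "schema:weight",
    "gs1:drainedWeight": "schema:weight",
}

# ex:pickledOnions is a hypothetical product identifier.
triples = {
    ("ex:pickledOnions", "gs1:drainedWeight", "220 g"),
    ("ex:pickledOnions", "gs1:netWeight", "440 g"),
    ("ex:pickledOnions", "gs1:grossWeight", "670 g"),
}


def infer_super_properties(data, mapping):
    """Return the extra triples implied by rdfs:subPropertyOf (one level only)."""
    return {(s, mapping[p], o) for (s, p, o) in data if p in mapping}


inferred = infer_super_properties(triples, SUB_PROPERTY_OF)

# All three specialised weights collapse onto the same generic property.
weights = sorted(o for (s, p, o) in inferred if p == "schema:weight")
print(weights)
```

The inferred triples no longer say which of the three values is the drained, net or gross weight, which is the ambiguity described above.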

If you prefer not to make use of the specialised properties, that's your choice - but it's probably not a choice made by over a million brand owners / manufacturers globally who have a need for such specialised properties and more specific definitions agreed over the course of many years through a consensus-based standardisation process, GS1's GSMP.

VladimirAlexiev commented 2 years ago

@mgh128 "gs1:physicalLocationName and gs1:digitalLocationName"

(written before I saw @justin2004) Please NO! Just go with gs1:name, why have 100 different name props?

"productName vs functionalName vs regulatedProductName"

That's a different need.

(written before I saw @mgh128's example from Companies House) I've seen it for companies as well, e.g. legalName, formerName, tradeName ("doing business as").

Mark, all examples you give are of needs for semantically different props, not props applied to different classes

Do make different props for different purposes
Don't make different props because of different domain classes

gs1:organizationName DOES NOT distinguish whether it's legalName, or tradeName, etc. It just unnecessarily tied it up to Organization.

"you don't really know whether a schema:width includes the width of the packaging material or not"

That's not the point. The point is that if I need to express the height of a container, ship, or building, I cannot do it with gs1. Because its props are over-specialized to specific classes.

Do you dispute that for pretty much any kind of object, you need a common "name" for display purposes? Witness rdfs:label and skos:prefLabel (which is intentionally left without domain).

Schema has a lot of "universal" props, see https://schema.org/Thing

@philarcher perhaps GS1 needs a custom root gs1:Thing (rather than owl:Thing) to put such props in?
I personally dislike such abstract root classes, but if you need it for display purposes, that's ok.

"rdfs:subPropertyOf the corresponding generic property defined by schema.org - and through this, it's easy to SPARQL CONSTRUCT / INSERT an extra set of triples"

Hmm, the ontological mappings from GS1 to Schema are a whole different topic and not as simple as you intimate.

"consensus-based standardisation process, GS1's GSMP."

And that process didn't think that Places want to have a name, no less than Organizations do? We both know that committees are sometimes blind ...

"gs1:AI is used as values of the skos:notation"

Like gs1:globalLocationNumber skos:notation "414"^^gs1:AI?

I agree with you, but I think @philarcher's page shows that all identifier-valued props have that datatype (range)?

"no property for expressing the (254) extension to physical location GLN"

Agree with you. Then I'll use (254) only for ReadPoints, but each "bigger location" (eg Berth) will use GLN only.

"https://id.gs1.org/414/6991230000124/address is likely to be rejected by resolvers of GS1 Digital Link."

Ok, will use #

"GS1 Digital Link syntax standard remains silent on the use of a # fragment"

The server (GS1 or other resolver) won't see the fragment. So @philarcher: DL should document that if someone wants to add suffixes to DL URLs, they should use #.

Why do that:

justin2004 commented 2 years ago

Hi @mgh128 ,

Thanks for the examples.

The case for gs1:productName, gs1:functionalName, and gs1:regulatedProductName is a bit different than the case for gs1:organizationName, gs1:physicalLocationName, and gs1:digitalLocationName.

The former don't fix their domain as tightly. Also they each do mean something a little more specific than schema:name. I do feel like they make sense as separate properties.

But the latter only seem to be proposed to fix their domain tightly and they mean the exact same thing: schema:name.

The weight and dimensions properties are interesting because I think they do match how people talk about things but it isn't quite the same jar of pickles that has a gross, net, and drained weight. In each state, some matter has been added/removed through a process.

Maybe pickles could exist in various states:
- a jarred state
- a jarred and submerged state
- a jarred and submerged and packaged state

Each of those states would still have a schema:weight.

But I don't feel strongly enough about this distinction at the moment. :)

mgh128 commented 2 years ago

Hi @VladimirAlexiev

I don't dispute that some of our existing properties in the GS1 Web vocabulary may have been too restrictive in their rdfs:domain. We have recently made some updates to the linkType properties for GS1 Digital Link to make those more generally applicable to various kinds of thing - not restricted to products. We're also making a similar update to make gs1:CertificationDetails more generally applicable. I'm sure that there are other terms where we could either broaden the rdfs:domain or question whether we need a related term defined per class. Part of the historical background is that the GS1 Web vocabulary straddles the gap between schema.org and the GDSN data model (which does have a tendency to define such terms per class). We do some things in a less clunky way than in the GDSN data model but there is probably room for further improvement.

However, we don't currently have a work group tasked with reviewing and updating the GS1 Web vocabulary, so currently any updates need to be triggered as formal work requests (submitted via https://wr.gs1.org/ then reviewed by the GS1 Global Master Data Standards Maintenance Group) - and are done on a piecemeal basis. Not ideal but that's where we are currently. I don't have any authority to unilaterally change things in the GS1 Web vocabulary - nor do I have direct access to the GS1 webserver.

If you and others feel strongly that there are significant improvements to be made, then please discuss with @philarcher about whether a mission-specific work group should be reconvened within GS1 to discuss those potential updates. Alternatively, you're very welcome to propose a modified version of the Turtle / JSON-LD file, which Phil and I can then review and use as the basis for submitting a work request for those updates - though we will probably need to find some end-user companies who are willing to support the change (which means that they need to understand why the proposed changes are a good thing to do). At present, my main priorities for GS1 are EPCIS/CBV 2.0, TDS/TDT 2.0 and updates to the GS1 Digital Link toolkit. Phil and I handle the technical updates for the GS1 Web vocabulary but only when formally requested by a work group and approved by the GMD SMG.

justin2004 commented 2 years ago

As I think more about it...

Having properties like gs1:functionalName and gs1:regulatedProductName makes it hard to talk about the set of anti-inflammatory drugs/products. That is, I think gs1:functionalName and gs1:regulatedProductName are bottoming out in strings too quickly.

An alternative representation:

ex:brandName=Disprin a gs1:Product ;
    gs1:productName "Disprin" ;
    gist:isCategorizedBy ex:somethingSomethingFunction=anti-inflammatory ;
    gist:isCategorizedBy ex:somethingRegulatedSomething=soluble-aspirin-tablets .

Where ex:somethingSomethingFunction=anti-inflammatory and ex:somethingRegulatedSomething=soluble-aspirin-tablets live in a taxonomy.

So I think I am saying that gs1:functionalName and gs1:regulatedProductName aren't schema:names of the product, but rather they are schema:names of the taxonomical thing that categorizes the product.

gist:isCategorizedBy

VladimirAlexiev commented 2 years ago

Another example of the lack of universal props: I can express GDTI in schema (though a bit inelegantly), but not in GS1. See the first 3 lines below:

<https://id.gs1.org/253/09535600000290000234> a gs1:CertificationDetails;
  schema:identifier 
    [a schema:PropertyValue; schema:propertyID "GDTI"; schema:value "09535600000290000234"];
  gs1:certificationIdentification "2-234";
  gs1:certificationAgencyURL <$PGLN_COMPANY3>;
  gs1:certificationStandard "EU Energy Efficiency, Directive 2017/1369";
  gs1:certificationType <https://example.org/resource/certs/EU-energy-efficiency>;
  gs1:certificationValue "A"; # also see https://schema.org/EnergyEfficiencyEnumeration
  gs1:certificationSubject <$GTIN_PRODUCT1>;
  gs1:certificationAuditDate "2020-11-20"^^xsd:date;
  gs1:certificationStartDate "2020-11-23"^^xsd:date;
  rdfs:comment "Use Content Negotiation to get RDF, PDF, HTML";
  gs1:certificationURI <https://id.gs1.org/253/09535600000290000234>. # self-link

(Explanations):

IMHO CertificationDetails should be a subclass of Document since a certificate is a document
ergo I want to use GDTI as global URL of a certification
095356 is the issuing agency prefix, 2 is cert type, and 234 is serial number.
Therefore 2-234 is the internal document id
self-link: can https://id.gs1.org/ serve PDFs?? Yes it can, if it forwards the complete request including Accept header, to the registrant's server

mgh128 commented 2 years ago

Hi @justin2004

The GS1 Web vocabulary also supports categories for products / trade items via the following properties:

https://www.gs1.org/voc/gpcCategoryCode https://www.gs1.org/voc/gpcCategoryDescription

https://www.gs1.org/voc/additionalProductClassification which points to a class https://www.gs1.org/voc/AdditionalProductClassificationDetails containing the following properties: https://www.gs1.org/voc/additionalProductClassificationCode https://www.gs1.org/voc/additionalProductClassificationCodeDescription https://www.gs1.org/voc/additionalProductClassificationValue

So while I agree with you that structured category / classification codes (such as UNSPSC and GS1's Global Product Classification (GPC) [ https://www.gs1.org/standards/gpc and https://gpc-browser.gs1.org/ ] ) are a better approach, the GS1 Web vocabulary also supports gs1:functionalName, expecting a language-tagged string value.

In a similar way, the GS1 Web vocabulary supports a https://www.gs1.org/voc/ingredientStatement but also something more structured, via https://www.gs1.org/voc/FoodBeverageTobaccoIngredientDetails

mgh128 commented 2 years ago

Hi @VladimirAlexiev

GS1 Digital Link URIs can be configured to redirect to any kind of Web resource of any Media Type, so redirecting to a PDF certificate is perfectly OK - and you can use the https://www.gs1.org/voc/certificationInfo link type to specify that link. If the GS1 Digital Link URI is for a document identified by a GDTI that identifies the certificate, then the https://www.gs1.org/voc/defaultLink could also be specified with the same value as for the https://www.gs1.org/voc/certificationInfo link

philarcher commented 2 years ago

There's been a lot of discussion here and, as ever, Mark has provided his insights. For background, Mark worked on the original WebVoc, which was several years before I showed up here. I have found that whenever I try to apply purist Linked Data approaches (like always using existing terms and URIs where possible) Mark has a solid reason for why it's not right in this circumstance. Typically this comes from the fact of the voc's genesis as an RDF expression of the underlying GDSN/GDD data model - which I accept can be frustrating. The other issue is less controversial: the semantics of schema.org are, shall we say, imprecise? And sometimes we need/want to encourage precision.

On the specific point raised about extending the DL URI with more path elements or using a fragment: as I think everyone here recognises, adding something like /address to the end of a DL URI is not wrong, since you're essentially minting a new URI that is never intended to be used as an item identifier. However, for the reasons Mark states, using fragment IDs would be less likely to trigger confusion and errors that we can avoid.

Hmmm... this is an old problem. A wiki page at https://www.w3.org/wiki/HashVsSlash might be instructive here. If the address data is typically in the same resource then using the hash is preferred. If the address really is a separate resource, then / is preferred.

@mgh128 is correct - the DL spec remains silent on the issue.

On the domain of gs1:linkType - we relaxed that completely when we added the new link types that came from the work on GLN modernization. The ratified version of the Web Voc now, deliberately, has no domains for link types. The gs1:AI data type is there for our AIs so it's for triples like gs1:gtin skos:notation "01"^^gs1:AI

We could have a generic top level class of 'identified item' but I'm not sure that would really be an advance on just using owl:Thing??

Separately, the Web Voc is now getting more attention, and is being seen as more of an integral part of the GS1 system, than ever. It's managed, formally, by the GMD SMG, which takes into account the full range of GS1 vocabulary terms, which I'm comfortable with. A new WG just to update the WebVoc might be warranted - but I fear it might also be frustrating because of the need to align with other bits of GS1 where Sem Web and LD are not part of the thinking process. I don't have a definite answer on this one. As ever, it's about community interest and available resources.