Tracing Chemical Properties

OR13 commented 4 years ago

Continuing discussion from: https://github.com/transmute-industries/traceability-vocab/issues/1

TallTed commented 4 years ago

@msporny re: https://github.com/transmute-industries/traceability-vocab/issues/1#issuecomment-703129256

how stable is dbpedia.org? What's the governance model around it? If we started using terms in it for international standards, what do you expect the arguments against it would be?

The DBpedia ontology is somewhat more stable than DBpedia instance data. While both are based on the crowd-sourced Wikipedia, the ontology is not auto-generated; rather it is independently crowd-sourced via the Mapping wiki. I don't believe there is now an official "governance model around it", though such might be coming as the DBpedia Association continues to evolve.

@OR13 re: https://github.com/transmute-industries/traceability-vocab/issues/1#issuecomment-703119447 and https://github.com/transmute-industries/traceability-vocab/issues/1#issuecomment-703130650

I started with @OR13's link to CHEBI's page on "elemental chromium", where I quickly found CHEBI's page on "chromium atom", which is more relevant to this question. That page has a "Database link" to the WebElements page about Cr, which has its own Atomic Number definition...

(It might be worth reaching out to the WebElements folks about RDFizing their stuff, and so providing a stable URI for all elemental attributes/properties/predicates...)

I also found the PeriodicTable ontology, which includes somewhat outdated elemental instance data and lacks definitions for most attributes, but whose URIs appear to be relatively stable.

OR13 commented 4 years ago

yes, the DAML Periodic Table is way outdated.... so far I have found CHEBI to be the best.

OR13 commented 4 years ago

I have concluded that inchi / inchikey are much better to use than atomic number.

OR13 commented 3 years ago

@TallTed do you have any suggestions for improving this part of the vocab?

So far it seems like CHEBI is the best, but sadly its IRIs are not resolvable... I would have preferred both.

TallTed commented 3 years ago

@OR13 said --

I have concluded that inchi / inchikey are much better to use than atomic number.

From which vocab did you get those attribute names? Presumably, you can provide the links ...

and --

So far it seems like CHEBI is the best, but sadly its IRIs are not resolvable... I would have preferred both.

I've quite lost track of what needs to be identified (and don't even have here an example of the CHEBI IRIs that "are not resolvable", which would at least offer an opportunity to find a resolver for them).

Discussion here and in https://github.com/transmute-industries/traceability-vocab/issues/1 has ranged far afield, and has been stalled for 6 months since I provided links to several possible ontologies, which appear not to have passed muster. There hasn't been much, if any, comment on why/how they fell short, which would be important input to finding other candidates.

Elements could be referenced simply by their DBpedia URIs (e.g., http://dbpedia.org/resource/Chromium), or by any of the URIs which redirect there (http://dbpedia.org/ontology/wikiPageRedirects), or by some of the http://dbpedia.org/ontology/wikiPageExternalLink (or URIs found there), or....

OR13 commented 3 years ago

This was the best I could do previously:

https://identifiers.org/CHEBI:18248

Which maps to "inchi": "InChI=1S/Fe"
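For readers following along, here is a minimal Python sketch of the pairing described above: a CHEBI compact identifier turned into an identifiers.org IRI and stored next to its InChI string. The function name `chebi_iri` and the `id` / `inchi` dictionary keys are hypothetical, chosen for illustration only; they are not claimed to be the vocabulary's actual terms.

```python
import re

# A CHEBI compact identifier looks like "CHEBI:18248" (prefix, colon, digits).
_CHEBI_CURIE = re.compile(r"^CHEBI:\d+$")


def chebi_iri(curie: str) -> str:
    """Build an identifiers.org IRI from a CHEBI compact identifier.

    Hypothetical helper: it only checks the shape of the identifier,
    it does not check that the entry exists.
    """
    if not _CHEBI_CURIE.match(curie):
        raise ValueError(f"not a CHEBI compact identifier: {curie!r}")
    return f"https://identifiers.org/{curie}"


# Pair the IRI with the InChI string quoted in this thread.
record = {
    "id": chebi_iri("CHEBI:18248"),
    "inchi": "InChI=1S/Fe",
}

print(record)
```

Keeping both values side by side means a consumer can follow the IRI when it resolves and still fall back to the InChI string when it does not.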

TallTed commented 3 years ago

It also maps to InChIKey : XEEYBQQBJWHFJM-UHFFFAOYSA-N

And that leads to a number of possible identifier sources. There's also a shorter list of candidates.

One of which led me to https://webbook.nist.gov/cgi/inchi/InChI%3D1S/Fe

There may well be better options for identifying whatever actually needs to be identified ("chemical properties" seems likely to cover things like "melting temperature", "combustion temperature", "hazardous material class", much more than simply "elemental composition" or "chemical composition").
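The NIST WebBook link above looks like the InChI string percent-encoded into a URL path. A minimal Python sketch of that construction follows. The URL pattern is inferred from that single link and is an assumption, not a documented NIST API; the function name `nist_webbook_url` is hypothetical, and whether more complex InChI strings resolve this way has not been checked.

```python
from urllib.parse import quote

# Base path inferred from the one WebBook link in this thread (assumption).
NIST_INCHI_BASE = "https://webbook.nist.gov/cgi/inchi/"


def nist_webbook_url(inchi: str) -> str:
    """Percent-encode an InChI string and append it to the assumed base path."""
    if not inchi.startswith("InChI="):
        raise ValueError(f"not an InChI string: {inchi!r}")
    # Keep "/" literal so the InChI layers stay visible; "=" becomes "%3D".
    return NIST_INCHI_BASE + quote(inchi, safe="/")


print(nist_webbook_url("InChI=1S/Fe"))
```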

OR13 commented 3 years ago

Yes, today we allow chemical properties to be expressed as either:

https://w3c-ccg.github.io/traceability-vocab/#inchi

In steel supply chain we mostly look at percentage of elements, vs complex molecules, but I believe inchi / inchikey effectively support both:

http://identifiers.org/CHEBI:69478

We looked at CAS and ChemSpider as well, but inchi seemed best.
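As a rough illustration of "percentage of elements" keyed by InChI, here is a minimal Python sketch. The `Component` class, its field names, and the `check_composition` helper are all hypothetical, and the percentages are made-up placeholder values, not data for any real steel grade or the vocabulary's actual schema.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Component:
    """One constituent of a material (hypothetical structure)."""
    inchi: str           # e.g. "InChI=1S/Fe"
    mass_percent: float  # share of total mass, 0 to 100


def check_composition(components: Iterable[Component], tolerance: float = 1e-6) -> float:
    """Validate each entry and return the total mass percentage."""
    items = list(components)
    for c in items:
        if not c.inchi.startswith("InChI="):
            raise ValueError(f"not an InChI string: {c.inchi!r}")
        if not 0.0 <= c.mass_percent <= 100.0:
            raise ValueError(f"percentage out of range: {c.mass_percent}")
    total = sum(c.mass_percent for c in items)
    if total > 100.0 + tolerance:
        raise ValueError(f"composition exceeds 100%: {total}")
    return total


# Placeholder values for illustration only.
sample = [
    Component("InChI=1S/Fe", 98.0),
    Component("InChI=1S/Cr", 1.5),
    Component("InChI=1S/C", 0.5),
]

print(check_composition(sample))
```

The same shape would work for a complex molecule, since only the InChI string changes.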

Generally speaking the chem / bio community is miles ahead of the mechanical engineering side for this stuff...

For mechanical properties the best we can currently do is try to map to ISO standards... See also this thread:

https://github.com/qudt/qudt-public-repo/issues/247

I think generally speaking we are interested in all properties of matter that might impact a manufacturing process, including elemental composition, hardness, melting temperature, etc...

When you import a large volume of unfinished product, you want to make sure it won't break your machines, or break in the bridge or vehicle after your machines have finished with it.

TallTed commented 3 years ago

I think generally speaking we are interested in all properties of matter that might impact a manufacturing process, including elemental composition, hardness, melting temperature, etc...

When you import a large volume of unfinished product, you want to make sure it won't break your machines, or break in the bridge or vehicle after your machines have finished with it.

This is a huge space. Trying to define or even map all properties in advance is a losing proposition. A pharma manufacturer is not going to have the same concerns as a packaging manufacturer or a vehicle manufacturer or a furniture manufacturer or a foodstuffs manufacturer, etc., etc. Procurers of these things, actual users of these things, warehousers of these things, etc., are also going to have different concerns -- and with each different set of concerns, different properties will matter.

Random (incomplete, off-the-top-of-my-head) example.... A producer of refined medicinal-grade oxygen will need ways to express permissible contaminants (in both quantity and kind, to varying degrees of specificity). They will need to express how the oxygen is (or may be) stored, and transported, and stored again, and deployed. Materials used in storage vessels will matter. Materials used in transportation vessels (which may be different than storage, on either end) will matter. Materials used in moving the gas from storage vessel to transportation vessel to storage vessel to consumption mechanism, will all matter. Some materials are volatile or combustible in presence of refined oxygen -- but are neither in "air". Only a niche expert is really going to be able to express all of the relevant concerns -- and it takes years if not decades or longer to get from zero to the various handbooks and guidelines such a niche expert would refer to, rather than relying on their own (guaranteed to be flawed) memory -- and new things will come up over time.

Some liquids can be stored in virtually any volume, in a simple kind of vessel. Some liquids require different kinds of vessels as the quantity increases, or as the temperature changes, or as other aspects of the container's environment change.

There will be endless special cases, all of which must be catered to for this traceability vocab to do its apparent currently intended job, as apparently currently designed. I am therefore suggesting some reconsideration of that design.

OR13 commented 3 years ago

@TallTed I agree it's a huge space... we are not trying to cover the entire space of semantically unambiguous supply chain VCs from trusted ontologies... we are showing how to do it for a limited set of use cases in a way that is interoperable.

https://w3c-ccg.github.io/traceability-vocab/#use-cases-and-requirements

We know there are endless edge cases in an open world model, but that doesn't mean we need to give up on verifiable data in an open world model.

I am interested in a concrete counter proposal though... How would you model JSON-LD VCs for Proof of Origin Credentials for:

ecommerce
oil and gas
agriculture
steel
software
...

We started with the following requirements:

JSON-LD VCs
Rely on existing vocab as much as possible (GS1 / schema.org).
Use the best available ontologies to describe the highest value credentials.

What could we have done better?

msporny commented 3 years ago

What could we have done better?

Perhaps traceability vocabularies focused on each industry? Or thinking about how you're going to separate the work when/if that happens?

... and to argue against separation... schema.org is 1,700+ terms and counting, and seems to be scaling well. Is traceability bigger than schema.org? My gut would say no, but then again, I'm no expert.

OR13 commented 3 years ago

We have always expected to eventually do some context splitting... but as I have argued many times... splitting too early is a bad idea.

https://martinfowler.com/bliki/MonolithFirst.html

Security Vocab started as a monolith as well... and we are just now getting ready to split it up years later.

If we want there to be any commonality between contexts, we ought to be frustrated by looking at the kitchen sink for as long as possible... particularly:

Postal Addresses
Organization Branding
GS1 / GLN / GTINs

Chemistry is an interesting question... should oil and gas and agriculture use the same chemical ontologies or different ones?

Does it matter that I could import trace metals in a pharmaceutical product?

IMO it's a good thing that things feel a bit crammed, and we have a pressure to reuse and generalize as much as possible.... the growth rate for a split context approach would be higher (less barrier to add) and would quickly lead to duplicate work.

TallTed commented 3 years ago

Is traceability bigger than schema.org?

Based on what's been said about [supply-chain] traceability?

Yes.

Backed by several 500lb gorillas in the industry, schema.org has been around for several years now --- and yet, it still has incoherent, inaccurate, and inconsistent domain/range statements, which would seem to me to be one of the basics in ontology design, and which linger years after some of them have been reported.

UMBEL and several other attempts at "one ontology to rule them all" have been abandoned, partly because the weight of the gorillas' backing made schema.org the one to use ... even though it's technically leaps and bounds behind many of those other efforts. And partly because the folks working on them realized that they (and the Linked Data / Semantic Web) would be better off helping improve niche ontologies, where the niche experts could provide the knowledge necessary to build accurate and consistent range/domain and related details.

How would you model JSON-LD VCs for Proof of Origin Credentials for:

ecommerce

oil and gas

agriculture

steel

software

I would start by dropping the JSON part of your initial requirement. The VC WG specified things in such a way that, while JSON-LD is likely to be the preferred format for some years, it is not required, as long as the Abstract Data Model can be losslessly and deterministically translated to and from whatever new-kid-on-the-block format arises.

LD can be transmitted through numerous formats, as has been discussed rather a lot in the DID WG, and while CBOR-LD isn't ready for prime time, it and/or other more compressed/compressible formats are likely to be preferable in many scenarios, where QRCodes or similar scannable devices have limited volume for the encoding of necessary info (even if those are just links to cryptographically verifiable documents).

Then, I'd try to break those things in your bullet list up into smaller components, until there was something like the lowest level of inputs to all areas. Oil&Gas requires Steel, both of which are required by Agriculture, all of which might be supported by Software, all of which might be transferred via transactions in the ecommerce space (presuming that your definition of ecommerce includes Amazon which deals substantially in brick-and-mortar merchandise)....

Then I might contact the DoD Business Mission Area, to see whether they could, and if so would, share some insights as to the ontology/ies and other tools employed in their Enterprise Information Web which delivers Business Intelligence information to the highest levels of the DoD with provenance, on demand, since their massive shift to Semantic Web Technology began, circa 2006-2012 -- on which they've been gathering real-world deployment and usage insights in the years since.

If I were commissioned to do so, of course.

oldskeptic commented 3 years ago

In the case of both agriculture and pharmaceutical, composition data is limited if not outright proprietary. Most of the hardcore detail is in the (sometimes mandated) lab reporting tests for basic elements (Pb) or complex compounds like N-nitrosodimethylamine (NDMA) in ranitidine.

But the thing being tested for is highly application specific. Both Pb and NDMA are in wikipedia but process-specific industrial compounds may not be, so I would go for flexibility.

-rhw

On Apr 30, 2021, at 4:03 PM, Orie Steele @.***> wrote:

.... Chemistry is an interesting question... should oil and gas and agriculture use the same chemical ontologies or different ones?

Does it matter that I could import trace metals in a pharmaceutical product?

IMO it's a good thing that things feel a bit crammed, and we have a pressure to reuse and generalize as much as possible.... the growth rate for split context approach would be higher (less barrier to add) and quickly lead to duplicate work.


OR13 commented 3 years ago

@mkhraisha can you add your examples, for how you represent chemical properties?

And any comments on deltas or not?

Do we have everything we need for "chemical properties" today, can this issue be closed?

OR13 commented 3 years ago

Discussed on the call, good conversations regarding the importance of consistent ways of representing properties.

mkhraisha commented 3 years ago

The primary action is: are we happy with how we represent chemical properties today, or do we really need something different?

OR13 commented 3 years ago

Discussed on the call: we will close unless there are objections. We think any subsequent issues should be addressed separately.

w3c-ccg / traceability-vocab

Tracing Chemical Properties #3