tdwg / material-sample

A Task Group of the Observations and Specimen Records (OSR) Interest Group

Adding a top-level term for material resources in Darwin Core - evaluating the alternatives #31

Closed cboelling closed 6 months ago

cboelling commented 1 year ago

From the discussions in the Material Sample Task Group and the outcomes of the Working Session on November 7 (see the documentation in the Task Group's Google Drive), it seems likely that the Task Group's recommendations for implementation in Darwin Core will include adding a top-level term for material resources, i.e. entities that, as a defining feature, consist in part or whole of matter, without any further restriction as to the nature of their physical structure or composition.

Indeed, a concrete proposal has been made prompted by, as I understand, the Task Group's work. The proposal has spawned a broader discussion exposing different views on how such a top-level term should be implemented.

This post evaluates three alternatives for adding such a term to Darwin Core based on elements from existing knowledge representation schemes. This evaluation is offered as a means to build consensus within the Task Group with regard to how the top-level term should be implemented and to formulate a corresponding proposal endorsed by the Task Group.

Purpose of the top-level term

The purpose of the top-level term is to provide a means for representation of material resources in scope for Darwin Core at the most general level, accommodating material resources for which the elements dwc:MaterialSample, dwc:PreservedSpecimen, dwc:LivingSpecimen, dwc:FossilSpecimen, based on their current definitions or future refinements, are too restrictive. It is expected that the new top-level term will, in the way Darwin Core elements are structured, be a class of its own with the mentioned elements being organized in it - which, informally, repositions them as subclasses of the new top-level term, i.e. any given instance of dwc:MaterialSample (or of any other of the mentioned elements) will be an instance of the new top-level element, when these are seen as classes or types of things.
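The subclass reading described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as plain tuples rather than with an RDF library, `ex:TopLevelMaterialTerm` is a hypothetical placeholder for whichever term is eventually chosen, and `ex:sample42` is a made-up instance.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical placeholder for the new top-level term (not a real term).
TOP = "ex:TopLevelMaterialTerm"

triples = {
    ("dwc:MaterialSample", SUBCLASS_OF, TOP),
    ("dwc:PreservedSpecimen", SUBCLASS_OF, TOP),
    ("dwc:LivingSpecimen", SUBCLASS_OF, TOP),
    ("dwc:FossilSpecimen", SUBCLASS_OF, TOP),
    # Made-up instance, for illustration only.
    ("ex:sample42", RDF_TYPE, "dwc:MaterialSample"),
}


def superclasses(cls, graph):
    """Return all direct and indirect superclasses of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(instance, graph):
    """Return the asserted types plus those implied by rdfs:subClassOf."""
    asserted = {o for s, p, o in graph if s == instance and p == RDF_TYPE}
    implied = set()
    for cls in asserted:
        implied |= superclasses(cls, graph)
    return asserted | implied


print(sorted(inferred_types("ex:sample42", triples)))
```

An instance asserted only as a dwc:MaterialSample is thereby also counted as an instance of the top-level class, which is the informal repositioning the paragraph describes.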

The alternatives

Three alternatives for implementing the top level term will be discussed here:

dcterms:PhysicalResource

dcterms:PhysicalResource is a term from a collection of terms DCMI Metadata Terms created and maintained by the Dublin Core Metadata Initiative (DCMI).

Apart from the basic data in the DCMI Metadata Terms specification, there is virtually no documentation on this element. The DCMI user guide mentions the term in one place, when clarifying the application of the property dcterms:medium: "The domain of dcterms:medium is the class dcterms:PhysicalResource." This description deviates from the actual specification of dcterms:medium, where dcterms:PhysicalResource is described as being included in the domain of this property (this is formalized in the RDFS version of the specification by asserting that dcterms:medium dcam:domainIncludes dcterms:PhysicalResource, with dcam:domainIncludes being defined as "A suggested class for subjects of this property.").
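The practical difference between a strict rdfs:domain and the weaker dcam:domainIncludes can be made concrete with a small Python sketch. It assumes triples modelled as plain tuples; `ex:strictMedium`, `ex:paintingA` and `ex:paintingB` are hypothetical names introduced only for contrast. Under RDFS, a domain declaration entails the type of the subject, whereas dcam:domainIncludes only suggests a class and licenses no inference.

```python
RDF_TYPE = "rdf:type"
PHYSICAL = "dcterms:PhysicalResource"

schema = {
    # Assertion from the RDFS version of the DCMI specification, as cited above.
    ("dcterms:medium", "dcam:domainIncludes", PHYSICAL),
    # Hypothetical property with a strict domain, for contrast only.
    ("ex:strictMedium", "rdfs:domain", PHYSICAL),
}

data = {
    ("ex:paintingA", "dcterms:medium", "oil on canvas"),
    ("ex:paintingB", "ex:strictMedium", "oil on canvas"),
}


def classes_for(schema, data, schema_predicate):
    """Pair each subject in data with the classes that schema_predicate
    links to the property the subject uses."""
    linked = {(prop, cls) for prop, pred, cls in schema if pred == schema_predicate}
    return {(s, cls) for s, p, _ in data for prop, cls in linked if p == prop}


# RDFS domain rule: if P rdfs:domain C and S P O, then S rdf:type C.
entailed = {(s, RDF_TYPE, c) for s, c in classes_for(schema, data, "rdfs:domain")}

# dcam:domainIncludes is only a hint; nothing is entailed from it.
suggested = classes_for(schema, data, "dcam:domainIncludes")

print(entailed)
print(suggested)
```

Only `ex:paintingB` is entailed to be a dcterms:PhysicalResource; for `ex:paintingA` the class is merely a suggestion that a data publisher may or may not assert.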

The example given in the user guide, and the definitions given for the terms involved, suggest that the introduction of dcterms:PhysicalResource was motivated by the wish to refer to instances of physical artifacts, e.g. a particular oil painting or a particular sculpture, as instances of dcterms:PhysicalResource and to represent their principal physical composition (e.g., oil on canvas, sandstone). The term itself was defined at a general level ("A material thing"). This author was unable to confirm existing applications of the term.

bfo:MaterialEntity

Note that "bfo:MaterialEntity" is used as a pseudo-CURIE for easy reference to the BFO term http://purl.obolibrary.org/obo/BFO_0000040 - see the comments of @baskaufs and @dr-shorthair below.

bfo:MaterialEntity is an element from the Basic Formal Ontology (BFO). BFO is being used as top-level ontology (i.e. featuring foundational classes and properties with cross-domain relevance) in a considerable number of more specialized domain ontologies being developed as knowledge representation schemes, especially in the biological and biomedical domain. Based on Bioportal, bfo:MaterialEntity is cross referenced in a number of other ontologies.

There is extensive documentation on BFO, see the information on the website and the GitHub repo. bfo:MaterialEntity as a core element in BFO is unlikely to change in the future.

dwc:MaterialEntity

A third alternative is to create a new top-level term in the Darwin Core namespace, rather than importing a term from outside. While using the same label as bfo:MaterialEntity, dwc:MaterialEntity would be formally independent from the former. This could be desirable when there are remaining doubts about the scope, applicability or governance of the external term, and it would give TDWG full control over the term - including its own, self-sufficient definition (without references to other BFO concepts - a suggestion has been made here) and the possibility to declare the relation to outside terms, e.g. making dwc:MaterialEntity a subclass of, or indeed an equivalent class to, the outside terms.

Proposal for a definition (first proposed here):

A dwc:MaterialEntity is an entity that persists through some period of time, maintaining its identity, and has some portion of matter as a part at any given moment of its existence.

This is basically a mashup of the definitions used in BFO, avoiding their technicalities.

Weighing the alternatives

BFO was intentionally developed as a top-level ontology to enable high-level connections among different, domain-specific knowledge representation schemes. The fact that BFO is re-used in a number of long-running, community-based ontology projects is evidence that BFO can indeed be helpful to structure knowledge representations on a high level. While BFO certainly has limitations, in this author's opinion it offers, in comparison to the DCMI vocabulary, a conceptually coherent structure to connect domain-specific knowledge representations (depending on the formalization of those schemes, this connection can be established on different levels).

The DCMI Metadata Terms, on the other hand, seem to address a much narrower set of use cases. The original set of Dublin Core terms has received widespread adoption to describe essential aspects of cultural language resources with respect to authorship, terms of use, digital encoding, among others (note that these usually aren't of a material kind - for example, Darwin's On the origin of species is an intellectual work (a sophisticated set of ideas expressed in the English language) that can be reproduced and distributed in a number of ways, materiality only entering on the level of an individual copy as a printed book or file on a given storage medium). The expanded set of DCMI Metadata Terms also addresses resources which are material, namely by introducing dcterms:PhysicalResource. The corresponding examples in the DCMI user guide are geared towards material cultural artifacts (e.g., the painting named The Mona Lisa that can be seen hanging in the Louvre), but based on the definition it seems reasonable to assume that any "material thing" falls into the scope of dcterms:PhysicalResource. However, due to the lack of documentation in DCMI it's hard to say what this actually does or doesn't comprise. For example, in paleontological research, endocasts are a set of methodologies to reconstruct aspects of morphology through analysis of cavities of a fossilized specimen. These cavities aren't material entities - they are demarcated by them. In BFO there is a strategy to represent them and relate them to material entities. In the DCMI resources, there is no indication that it is applicable in such cases.

Within the scope of Darwin Core, which is organized as a "bag of terms", the isolated import of either of the terms (dcterms:PhysicalResource or bfo:MaterialEntity) will probably be adequate to establish a top-level term for physical resources at a general level. However, due to its adoption in the field of biological and biomedical ontologies, and its better documentation and conceptual coherence, connecting to BFO by importing bfo:MaterialEntity might strategically be the better choice, as it opens up prospects for Darwin Core to connect to the descriptive resources offered in computational ontology.

The third option discussed here is to create a new term in the Darwin Core namespace that is inspired by the other terms (see above). While some would argue that this is an unwelcome duplication, doing so is a perfectly valid strategy, especially if doubts about the applicability of the foreign term or its governance remain - in this case full control is in the hands of TDWG and the relation to elements in other namespaces can be asserted as is appropriate - formally in RDF or OWL or informally through usage and scope notes.

Jegelewicz commented 1 year ago

@cboelling Thank you for providing this detailed analysis! I think it is just what we needed at this time. Before I add my ten cents, I'd like to let others weigh in. I plan to be available during our regular meeting times today if anyone wants to show up and discuss this.

Jegelewicz commented 1 year ago

From @stanblum [edited by Stan]

I appreciate @cboelling calling out "purpose", but I think "representation" needs elaboration.

My understanding is that BFO's overall purpose is to support (automated/machine) reasoning over different datasets or data schemas. [ad hoc integration].

Dublin Core's overall purpose is the basic information science functions of metadata: discovery/search, assessment, and access (including rights and technical protocols and formats).

TDWG's purpose is to support biodiversity science, and our dominant use cases concern publishing datasets (data resources), aggregating them into super resources that can be queried as a whole (e.g., GBIF, OBIS, ALA, etc.), and making it easy: 1) to extract subsets to analyze (e.g., species distribution modeling); and 2) to find the physical specimens needed for further research (GBIF, iDigBio, GGBN).

The GBIF data model is trying to support more kinds of data for biodiversity science, beyond specimens and simple observations, so our "solution" for clarifying/revising "MaterialSample" should (must) support those use-cases and provide clear guidance about how original data resources should be published for aggregators. It should be expressed in language that feels natural to our community.

The purposes above are not necessarily in conflict; e.g., if our solution/revision is compatible with the BFO class hierarchy we create a bridge between biodiversity and the larger world of biomedical data.

Jegelewicz commented 1 year ago

From Jutta

Would there be confusion between bfo:MaterialEntity and dwc:MaterialEntity?

@cboelling says that is a signal that these are potentially the same, and yes, it could be confusing to those not using the pre : code. But the expectation would be that those using DwC would use the DwC terms.

@stanblum says most people won't be using the top level term - they will use a leaf node. Also, what is this contrasted with? For us - it is information resource (not the negative space as in the bfo). But also note that it will be relatively common that what we have treated up to now as an "occurrence record" can represent two or three kinds of things; e.g., PreservedSpecimen, TissueSample, Photograph.

Jutta - but there is a structure, we just don't formally document it. (As Stan noted above).

Also, when is something a class and when is it a property?

@cboelling explains: RDF - a class can have instances (humans, tigers, planets, rivers, books, etc.), a property can be shared by instances of a class or classes themselves. [To clarify: @cboelling said properties create links between instances. -SBm]

For me this is unclear - color seems like a property, but there are definitely instances of color (orange, purple, etc.)

Me: But creating this term without its opposite makes it difficult. If it isn't dwc:MaterialEntity, what is it?

@stanblum says we need to make sure we delineate from organism and Jutta says let's first get a MaterialEntity DONE. And I kinda agree. What do we care about more? That there is something we can use to independently verify assertions or what?

cboelling commented 1 year ago

From @stanblum

Does BFO overall purpose agree with TDWG? BFO is about automated reasoning.

The possibility of automatic reasoning is a consequence of the formal languages (OWL, RDFS) that are used to encode these representational schemes. In use cases where this aspect is irrelevant, the terminology by itself (i.e. the elements of the ontology with their IDs, labels, and definitions) can still be used very much like a controlled vocabulary. These application aspects are complementary; they are not, IMO, in conflict with each other.

Jegelewicz commented 1 year ago

@cboelling

We need to separate stuff and tackle one thing at a time.

Me: First step - let's pick one of @cboelling choices and get behind it.

Jutta: We need to announce a decision date; let's do the next meeting. Teresa to email everyone and @cboelling will monitor this thread and answer questions.

baskaufs commented 1 year ago

Can someone please look up the governance procedures for BFO and link or summarize them here? I think that is important information that we are lacking. Based on our experience trying to get terms added to DCMI, we are pretty familiar with their procedures.

cboelling commented 1 year ago

Can someone please look up the governance procedures for BFO and link or summarize them here? I think that is important information that we are lacking. Based on our experience trying to get terms added to DCMI, we are pretty familiar with their procedures.

The following is available in the BFO GitHub repository wiki:

https://github.com/BFO-ontology/BFO/wiki/How-we-record-issues-and-resolutions

https://github.com/BFO-ontology/BFO/wiki/Proposal-for-how-we-manage-OWL-Reference-coordination

baskaufs commented 1 year ago

Thanks, @cboelling

smrgeoinfo commented 1 year ago

I'd vote for option 3-- dwc:MaterialEntity

tucotuco commented 1 year ago

I agree with the reasoning of @smrgeoinfo and prefer option 3. Thank you @cboelling for laying these options out so clearly.

dagendresen commented 1 year ago

The bridge to the biomedical world makes me lean towards option 2 (importing BFO_0000040).

baskaufs commented 1 year ago

@smrgeoinfo What exactly do you find confusing about the definition of dcterms:PhysicalResource ? I also don't get what the problem is with ranges and domains. It seems to me irrelevant how DCMI describes dcterms:medium since we are not proposing importing that term into Darwin Core.

deepreef commented 1 year ago

First, MANY thanks to @cboelling for the clear/thorough explanation of options. Also to @Jegelewicz for capturing comments from others and aggregating them here.

I see pros & cons for all three options, and it's telling that each seems to be supported by at least one person so far (if I correctly interpret @baskaufs as supporting Option 1).

I guess I would say that the problem with Option 1 is not that it's confusing, but rather that it appears to be very "thin". It certainly has relevance to what we are trying to represent, but I think the main reason to adopt terms from other standards is to inherit a robust definition/use-case/etc. already in play that meets our needs. I gather that's not really the case for dcterms:PhysicalResource.

I guess I slightly favor option 3, for the reasons mentioned by @smrgeoinfo -- but I don't feel sufficiently qualified in the subtleties/implications to have a strong opinion about that.

But I am curious: what would become of dwc:MaterialSample? Would it be deprecated in favor of this new term, or would it be treated as a subclass of the new term (i.e., the subset of material resource instances that are the result of a sampling event)?

I agree with @stanblum that within DwC-land, we will eventually need clarity on where dwc:Organism falls in this realm (e.g., as a subclass of the new Material Entity term, or as its own separate thing)? But I also agree that discussion can come later, in the context of a future task group focused on dwc:Organism.

jbstatgen commented 1 year ago

This is a rationale that W3C PROV-O provides for their choice and decision:

Relationship to other vocabularies

In general, the working group decided not to adopt existing vocabularies directly, in preference for a mappings based approach, so they could be absolutely precise about their semantics.

see the PROV-FAQ at https://www.w3.org/2001/sw/wiki/PROV-FAQ

baskaufs commented 1 year ago

A technical note: the CURIE for BFO material entity would not be bfo:MaterialEntity. It would be bfo:0000040, where the prefix bfo: is defined as http://purl.obolibrary.org/obo/BFO_ and the local name is 0000040. There is no such thing as bfo:MaterialEntity. "material entity" is the label for bfo:0000040.

dr-shorthair commented 1 year ago

The URI is http://purl.obolibrary.org/obo/BFO_0000040. Assuming you adopt the usual prefix obo: for http://purl.obolibrary.org/obo/ then the CURIE would be obo:BFO_0000040.
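To make the two CURIE conventions above concrete, here is a minimal Python sketch of prefix expansion. The two prefix definitions are the ones given in the comments above; the function name `expand_curie` is an illustrative assumption, not part of any library.

```python
# Prefix definitions as given in the two comments above.
PREFIXES = {
    "bfo": "http://purl.obolibrary.org/obo/BFO_",
    "obo": "http://purl.obolibrary.org/obo/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full IRI by plain string concatenation."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


# Both conventions resolve to the same term IRI.
print(expand_curie("bfo:0000040"))
print(expand_curie("obo:BFO_0000040"))

# The pseudo-CURIE used in this thread expands to a different IRI,
# which is not the IRI of the BFO term.
print(expand_curie("bfo:MaterialEntity"))
```

Because expansion is nothing more than concatenation, the label "material entity" never enters into it, which is why bfo:MaterialEntity can only serve as conversational shorthand.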

dr-shorthair commented 1 year ago

@deepreef asked:

But I am curious: what would become of dwc:MaterialSample?

I suggest that 'sample' is a role, or relationship. Entities can be samples-of other entities. If the primary entity is a material thing, then the sample will often (usually?) also be a material thing. If every MaterialSample is a MaterialEntity then it is a sub-class, but in order for an individual entity to be a sample, then there should be some notion of what other entity it is a sample of.

Specimen is also a role relating to curation - the thing has been accessioned into a collection.

deepreef commented 1 year ago

Dontcha hate it when you carefully craft a multi-paragraph, deeply insightful, highly arm-wavey post to GitHub...then close the browser before clicking the "Comment" button? Yeah...me too.

So to save everyone the arm-waving, I'll summarize the text that I just lost by saying: 1) I agree that "sample" is a role related to an event, not an intrinsic property of a material thing that has been sampled from another material thing. 2) Because of this, I think that any effort to maintain a MaterialSample class that is a subclass of a new MaterialEntity would create more problems than it solves.

The paragraphs of lofty language I wrote in my previous attempt to reply essentially represented a robust rationale for treating "MaterialEntity" as a replacement term for dwc:MaterialSample, rather than a new, broader-scope "parent" class. Keeping them both maintains the same level of confusion and uncertainty for data managers, and probably adds even more layers of confusion and uncertainty.

So... my vote is to tweak both the term and the definition to eliminate the "Sample" aspect, and just encourage the community to embrace the new term instead of MaterialSample.

Now... where's that Comment button.... [found it!]

Jegelewicz commented 1 year ago

From TDWG TAG meeting notes on Best practices for borrowing terms from non-TDWG vocabularies.

Can we come up with a short set of conditions under which borrowing is recommended over minting? Can we ask MaterialSample to look at the problem this way?

TAG members who have a long view of how this issue has played out in the past should attend the MaterialSample meeting and contribute to the discussion.

Jegelewicz commented 1 year ago

Also from TDWG TAG meeting notes on Best practices for borrowing terms from non-TDWG vocabularies. Specifically related to BFO terms:

In one TAG member's opinion, BFO has terrible labels (ed: local names) that are counter-productive for humans that use them (e.g. for mapping in IPT). He tried using BFO for agent roles in an AgentActions extension to DwC-A and it did not work well.

A second TAG member noted that on the IPT, the problem could be overcome by using labels rather than the term's short name (i.e. local name). Thus issues with the IPT shouldn't necessarily form the basis of a technical decision.

Response: Fair enough, but are there examples in our community where mapping is done prior to publishing data where labels are used as opposed to a short name? I’m not aware of one. In any case, the technical recommendation here will incur a significant development hit across the board.

dshorthouse commented 1 year ago

The paragraphs of lofty language I wrote in my previous attempt to reply essentially represented a robust rationale for treating "MaterialEntity" as a replacement term for dwc:MaterialSample, rather than a new, broader-scope "parent" class. Keeping them both maintains the same level of confusion and uncertainty for data managers, and probably adds even more layers of confusion and uncertainty.

At the risk of throwing spaghetti at the wall to see what does or does not stick, I've been quietly perplexed by the term MaterialEntity. It looks like a tautological obfuscation. What is wrong with plain Material? Stated another way, what exactly does the "Entity" part of MaterialEntity convey that "Material" alone does not? I get the impression that it's meant to convey some form of cohesion, commonality, theme, or purpose if that "material" consists of many distinct physical objects. At the very least, I hope we acknowledge that there's some confusion introduced about what is an "entity". Can an instance of an entity be counted? Is it divisible? Is water in a vial or air in a can each a MaterialEntity, or is what manifests their existences & identities merely the vessels in which they are stored? I'd still be inclined to call these "material" should they escape (a stretch of the imagination to be sure), but I'd no longer call them "entities".

And so, the definition...if "Material" alone is sufficient as a class-level, new term:

A dwc:Material is an entity that persists through some period of time, maintaining its identity, and has some portion of matter as a part at any given moment of its existence.

UPDATE:

You may wish to ignore the above. Upon looking in a dictionary, there are certainly cases where "Material" has no physical manifestation. Examples: music, a comedian's repertoire, the collection of knowledge used in course work, etc. Sigh.

tucotuco commented 1 year ago

I think the more important reason for Material[Something] is that material alone is also an adjective, thus capable of obfuscating in a way that Material[Something] does not.

"MaterialEntity" was borrowed from BFO, where all of the existential issues you raise above are treated.

dshorthouse commented 1 year ago

I think the more important reason for Material[Something] is that material alone is also an adjective, thus capable of obfuscating in a way that Material[Something] does not.

Fair enough, though BFO is "materially" broader in scope than DwC would ever be. Hard to imagine a case where a Class term such as "Material" in the context of a biodiversity standard could ever be interpreted as an adjective. However, I could see it used to represent a collection of field notes – a biologist's unpublished "material" (despite it having physical manifestation and used as a noun) – whereas we'd perhaps be less inclined to have such a category for instances if "Entity" were slapped on the end.

cboelling commented 1 year ago

In this post I will try to provide an update regarding the issues raised above in view of the upcoming Task Group meeting.

CURIE / namespace names for MaterialEntity

In addition to what @baskaufs and @dr-shorthair correctly describe w.r.t. the CURIE / URI of the BFO term, I would like to add that the BioRegistry specifies the namespace name "bfo" and a pattern for creating URIs for BFO elements.

I adapted the original post above to clarify that the denomination "bfo:MaterialEntity" is used here as a pseudo-CURIE to ease our current conversation.

While the TDWG design patterns distinguish between local names and labels and, as I understand, grant the flexibility to add arbitrary labels to a given term (with a fixed local name), there is a standard pattern of term local names and labels in DwC, as @baskaufs has described here. Importing bfo:MaterialEntity would likely require breaking this pattern by adding a label that does not correspond to the local name. Whether we should specify a custom combination of namespace name and local name to reconstitute the original term URI is yet another question - probably this isn't advisable (though it would probably be permissible).

Relation of dwc:MaterialSample to the top-level term

I side with the notion that dwc:MaterialSample, when used to denote a class of objects, can and should be treated as a subclass of the new top-level term (when that is used to denote a class - which it usually will be). This means that any particular item in the real world that is a Material Sample also is a Material Entity / Physical Resource, but not the other way round. It is a safe conclusion to make, because of the generality of the new top-level term. The new top-level term is intended as a catch-all category for physical / material artifacts about which we want to store information using the Darwin Core vocabulary. As the notion of sample clearly is important for a number of use cases, I think it is likely that the notion of dwc:MaterialSample is worth maintaining - possibly with an updated and better definition of what it is that distinguishes a sample from other physical entities in scope for Darwin Core (possibly implying roles to describe this).

As far as I could gather from the DwC documentation, there is no formal or standardized way to directly assert or employ subclass relationships when using Darwin Core (neither in plain text nor in XML-based data representations). Instead, a property dwc:XYZType can be used with a set of literal values that acts as a standardized vocabulary to describe subclasses. So in this case, as was also discussed in the Task Group, a property dwc:MaterialEntityType could be used with the literal values "MaterialSample", "LivingSpecimen", "FossilSpecimen", etc. to indicate the specific kind of Material Entity / Physical Resource, as far as DwC distinguishes them. This list could be expanded over time.
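The type-property pattern described above could look roughly like the following Python sketch. Everything here is an assumption for illustration: the field name `materialEntityType` mirrors the property floated in the Task Group and is not an existing Darwin Core term, the value list is taken from the classes named in this thread, and the records are made up.

```python
# Hypothetical field name; not an existing Darwin Core term.
TYPE_FIELD = "materialEntityType"

# Literal values acting as a small controlled vocabulary (could grow over time).
ALLOWED_TYPES = {
    "MaterialSample",
    "PreservedSpecimen",
    "LivingSpecimen",
    "FossilSpecimen",
}


def check_record(record, allowed=ALLOWED_TYPES):
    """Classify a flat record by the state of its type field."""
    value = record.get(TYPE_FIELD)
    if value is None:
        return "missing"
    return "ok" if value in allowed else "unrecognized"


# Made-up records, for illustration only.
records = [
    {"id": "r1", TYPE_FIELD: "FossilSpecimen"},
    {"id": "r2", TYPE_FIELD: "RockCore"},
    {"id": "r3"},
]

for rec in records:
    print(rec["id"], check_record(rec))
```

The subclass relationship is thus carried by a literal value in each record rather than by a formal class assertion, which fits the flat, text-based way Darwin Core data are usually exchanged.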

Usability of labels

It is a design feature of many knowledge representation schemes that they define one or more labels for the same element, which is identified by a persistent and unique identifier. This has a number of advantages for computational use of the data, among them accommodating multiple languages. In evaluating the options for adding a high-level term, the features of the standard vocabulary should be seen as separate from the features of downstream applications using the vocabulary. It is difficult to understand from the minutes of the TAG meeting what limitations there are when using BFO terms (or terms from other ontologies) out of the box with current design patterns in Darwin Core and current features of downstream applications, and whether those limitations are best overcome by adapting one or the other. In general, the design of having one identifier and potentially many associated labels provides a lot of useful flexibility which downstream applications can and probably should use to their advantage.

Material vs. MaterialEntity

I assume that the terminological choice of "MaterialEntity" is a consequence of preferring class labels that indicate how an individual instance of the class could be identified and separated from others as a unit.

| particular instance | class |
| --- | --- |
| Charles Darwin | Person |
| Katrina | Hurricane |
| this portion of water | portion of water |

In the last example, the class label "water" would, under this mindset, be less instructive with respect to referring to particular identifiable instances as units. Compare:

1, 2, 10 water / this water / that water

1, 2, 10 portions of water / this portion of water / that portion of water

Classes in BFO are intended as representations of universals which have particular instances. Using singular countable nouns as class labels is intended to facilitate creating and transmitting corresponding statements (i.e. relating particular instances to classes, specifying relations between classes (a cat is a mammal)) using the ontology. For that purpose, nouns are preferred which ordinarily are used to denote individual objects rather than qualities. Even if the word "material" is used as a (countable) noun it still conveys more of the qualitative aspect of an object ("the three materials used to build the nest were leaves, feathers and twigs").

So the label "MaterialEntity" seems to correspond to the English language element "material entity" where "material" is used as an adjective and which lends itself to refer to one, two, many instances of material entity as identifiable units.

The particular collection of water molecules in a vial is an instance of bfo:MaterialEntity, as is the assembly of that portion of water together with the containing vial. The former is a Material Entity in its own right - but without the vessel anyone would be hard-pressed to identify or handle it in a productive way. The water in that particular vial with number A-275 is identifiable due to its containment in the vessel and the effort taken to keep the vessel identifiable, and is relevant as it is used as base for a buffer for a particular PCR assay. Hence one might be interested in keeping track of it in an information system and recording data about it. The water that collected as a drop of rain right now at my window sill is an instance of bfo:MaterialEntity alright, but I have no further interest in it, no way to track its whereabouts a moment later or establish its properties.

baskaufs commented 1 year ago

Thanks, @cboelling for your summary. Here are a few notes about current rules/precedents/practices for labels and term names in TDWG. (ALL CAPS represents an RFC 2119 keyword.)

Section 3.3.3.1 of the Standards Documentation Specification says that a label is RECOMMENDED. This section does not say that the RECOMMENDED label must be in English, but that can be inferred from the specification that the standards documents are, in general, to be written in English. In Section 4.5 it is noted that values for rdfs:label SHOULD be English language-tagged plain literals -- they correspond to the label described in section 3.3.3.1 . Values of labels in other languages MAY be provided in ancillary documents outside of the standard.

In practice, the idea is to maintain stability of the English labels (by including them within the standard) with more flexibility since they are usually not declared to be normative (vs. definitions, which are normative and difficult to change). Thus far, this has been handled consistently in List of Terms documents (example) in a "Status of the content of this document" section where labels are declared to be non-normative.

Non-English labels are a work in progress and making them be outside of the standard is to make it easier to create their translations. Some of the controlled vocabularies have relatively good non-English label coverage (example). See this for translations status.

Term names are also defined in Section 3.3.3.1 of the SDS:

Term name (REQUIRED) - The term name is a controlled value that represents the class, property, or concept described by the term definition. The term name is composed of the local name part of the term IRI, with a prepended namespace abbreviation (QName) that is defined in the header section of the vocabulary list document [RECIPES]. The term name is often related to the meaning of the term, but users MUST NOT attempt to understand the meaning of the term by interpreting its name. Rather, the term definition MUST be consulted.

Term names are declared as non-normative in the "Status of the content of this document" section simply because they contain a namespace abbreviation that is understood by common use, but that is not necessarily strictly defined (e.g. the DCMI namespaces are inconsistently abbreviated as dc:, dcterms:, dcterm:, dct:, etc.). However, the local name part of the term name is fixed based on the definition in the SDS: it is the local name part of the term IRI, which is normative given that it is a persistent identifier for the term.
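To make the relationship between term IRI, local name, and term name concrete, here is a minimal Python sketch. The function name `term_name` and the prefix table are my own illustration and not part of the SDS; the prefix table simply records abbreviations in common use, which is exactly the non-normative part.

```python
# Hypothetical prefix table: abbreviations are conventions, not normative.
NAMESPACES = {
    "dwc": "http://rs.tdwg.org/dwc/terms/",
    "dcterms": "http://purl.org/dc/terms/",
}


def term_name(term_iri: str, namespaces: dict[str, str] = NAMESPACES) -> str:
    """Build a term name (QName) from a term IRI.

    The local name is whatever follows the namespace IRI; it is fixed by the
    term IRI itself. Only the prefix depends on the table that was passed in.
    """
    for prefix, base in namespaces.items():
        if term_iri.startswith(base):
            local_name = term_iri[len(base):]
            return f"{prefix}:{local_name}"
    raise ValueError(f"No known namespace for {term_iri}")


print(term_name("http://rs.tdwg.org/dwc/terms/MaterialSample"))
# A different abbreviation changes the term name but not the local name.
print(term_name("http://purl.org/dc/terms/modified",
                {"dct": "http://purl.org/dc/terms/"}))
```

Swapping the prefix table changes only the part before the colon, which is why the local name can be treated as stable while the full term name is not.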

These rules and practices determine what we can and can't do related to "term names" and labels.

baskaufs commented 1 year ago

I did not discuss controlled value strings, which are a feature of controlled vocabularies but not of class and property terms. They are a stable string (usually an English phrase in lowerCamelCase) that can be used in lieu of the term IRI to denote the concept. They differ from labels in that they are normative and that there is a single string that is used by everyone regardless of the language they may prefer to use as a term label. Examples are shown here

smrgeoinfo commented 1 year ago

Here's a UML view of what I understand the conceptual model to be. I hope the notation is clear enough.

image

smrgeoinfo commented 1 year ago

I think one important conceptual aspect is to recognize that 'MaterialEntity' includes both substance and object. MaterialSample is an object.

smrgeoinfo commented 1 year ago

comments on @cboelling discussion above:

Even if the word "material" is used as a (countable) noun it still conveys more of the qualitative aspect of an object ("the three materials used to build the nest were leafs, feathers and twigs").

leaf, feather and twig are objects, not substance. they are countable, and composed of some material (e.g. at a high level 'organic material'). So [nest hasPart {leaf, feather, twig} ], but [nest isComposedOf {organicMaterial, anthropogenicMaterial}] (or something like that.... :) )
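A rough Python sketch of the distinction, just to illustrate the two relations. The class names `Substance` and `MaterialObject` and the attribute names are hypothetical and are not proposed Darwin Core terms.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Substance:
    """A kind of material; not countable (hypothetical class)."""
    name: str


@dataclass
class MaterialObject:
    """A countable, identifiable thing (hypothetical class)."""
    label: str
    composed_of: set[Substance] = field(default_factory=set)
    has_part: list["MaterialObject"] = field(default_factory=list)


organic = Substance("organicMaterial")
anthropogenic = Substance("anthropogenicMaterial")

# Leaf, feather and twig are objects, each composed of some substance.
leaf = MaterialObject("leaf", {organic})
feather = MaterialObject("feather", {organic})
twig = MaterialObject("twig", {organic})

# The nest has the objects as parts and is composed of substances.
nest = MaterialObject("nest", {organic, anthropogenic}, [leaf, feather, twig])

print([part.label for part in nest.has_part])
print(sorted(s.name for s in nest.composed_of))
```

Parts can be counted and identified individually; the substances can only be listed as kinds.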

smrgeoinfo commented 1 year ago

In the last example, the class label "water" under this mindset would be less instructive with respect to referring to particular identifiable instances as units. Compare: 1, 2, 10 water / this water / that water versus 1, 2, 10 portions of water (this portion of water / that portion of water).

Obviously '1, 2, 10 water' doesn't make sense: water is substance, not countable. 'This water' / 'that water' inherently denotes some portion of water; that's what 'this' and 'that' mean. A 'portion of water' would appear to denote some object: a particular quantity of substance that has some boundaries and identity.

cboelling commented 1 year ago

comments on @cboelling discussion above:

Even if the word "material" is used as a (countable) noun it still conveys more of the qualitative aspect of an object ("the three materials used to build the nest were leafs, feathers and twigs").

leaf, feather and twig are objects, not substance. they are countable, and composed of some material (e.g. at a high level 'organic material'). So [nest hasPart {leaf, feather, twig} ], but [nest isComposedOf {organicMaterial, anthropogenicMaterial}] (or something like that.... :) )

The sentence The three materials used to build the nest were leafs, feathers and twigs.

is similar to the sentence The three colors occurring in this bird's plumage are yellow, red and blue.

in that it expresses qualitative aspects of the subject of inquiry (a particular nest) and I think the word "materials" is usually used in this way. This is the reason why "MaterialEntity" might be preferred as a label rather than "Material". Of course, the reality is that feathers, leafs and twigs are individual countable objects (but the sentence above doesn't make reference to individual objects, only to the object types in general).

Using @smrgeoinfo's notions of substance and object I see instances of MaterialEntity, just like MaterialSample as objects. What you refer to as substance, e.g. water or air in general, would not be instances of MaterialEntity, as I see it. A given portion of water would be, and there is no limit on the size or complexity of it: this drop of water, the water in this pond, the water in this sample, all the water on planet Earth.

The words "water" or "air", even though they are grammatically nouns, are IMO uncountable nouns that signal qualitative aspects (unless they are used in an elliptical form, standing for a glass of water etc., which again would be instances of MaterialEntity).

Qualities have a different place in BFO and I find the view that MaterialEntities have Qualities quite compelling. "consisting of water" would in this sense be a quality.

smrgeoinfo commented 1 year ago
The sentence
The three materials used to build the nest were leafs, feathers and twigs.

Should read "three kinds of object were used to build the nest: leafs, feathers, and twigs. Leafs are composed of x (one or more substances, substance is a kind of materialEntity), feathers are composed of Y (same...), twigs are composed of Z (same...). "

The issue @cboelling raises is whether materialEntity subsumes substance and object, or is only object. I think the definition proposed in option 3 is consistent with substance or object.

cboelling commented 1 year ago

The current draft for the term request for dwc:MaterialEntity including all incremental contributions ("commits") is always visible for inspection at this URL: https://github.com/tdwg/material-sample/blob/ntr-material-entity/primary_deliverable/materialentity.md

Thank you for the contributions so far - these go nicely together in my view. Please add any further comments regarding the proposal for the New Term Request for dwc:MaterialEntity by tomorrow 1pm UTC to either the dedicated discussion thread for the draft ("the pull request" in git parlance) or here. Unless something controversial comes up, I will then pull the draft into the canonical version of this Task Group's repo. I also understand that the consensus view is to then use the draft and formulate an NTR in DwC's main repo, which I offer to do right after merging. If you prefer a different course of action, please come forward and let us know also by tomorrow 1pm UTC.

Jegelewicz commented 1 year ago

reopening so as to have all review package issues open - https://github.com/tdwg/material-sample/blob/main/review%20package/MaterialEntity.md

cboelling commented 1 year ago

This issue has been re-opened as the companion issue for the draft New Term Request (NTR) for dwc:MaterialEntity as part of the current review package [1]. Please note that as a result of the TG discussion about dwc:MaterialEntity (see the discussion above, #32 and #34) an NTR has already been formulated in the DWC-repo [2]. The definition and other metadata proposed in [2] differ from the draft in the current review package [1].

I believe that the draft NTR in the current review package [1] could be updated by the existing NTR [2]. If you confirm this, @Jegelewicz, I can perform this update - or should the review package simply directly link to [2]?

[1] https://github.com/tdwg/material-sample/blob/main/review%20package/MaterialEntity.md
[2] https://github.com/tdwg/dwc/issues/426

Jegelewicz commented 1 year ago

@cboelling thank you for noting this - perhaps the review package should link directly to [2]

Jegelewicz commented 6 months ago

change complete - https://github.com/tdwg/dwc/issues/426