Closed baskaufs closed 1 year ago
Post it here before moving to DwC discussion, please, @stanblum. I’d like to know if it comes closer to our existing model or drifts away.
I am in favor of the term as given in the original proposal. For me it defines the box we need for "things". The "Human Resources" argument is the best I've heard if someone should oppose the term for the ethics of using "resources" in reference to humans.
I can certainly bring this before @tdwg/material-sample but I also see no reason they couldn't just comment here. Participation in meetings has dropped off and I don't want to have this decision be made by three or five people. I will email the task group members.
While it sometimes may seem frivolous, edge cases are most helpful in challenging our means for information exchange and sharpening our thinking. It's a good thing to continue to bring them up.
Agreed! Indeed, it's the edge cases I always go to first to finesse where the boundary is. But there is often a point of diminishing returns where further scrutiny of rare edge cases impedes progress more than it enhances clarity and precision. In the age-old battle between the "perfect" and the "good enough" (which are often mutual enemies), I am often rooting for the "perfect" more than most people. But I also understand that this can be counterproductive.
From the two main points of feedback (I think):
Key feature (a reveal to me; duh!) is that an Occurrence is then the intersection (association entity; M:M) between Event and Organism.
[Revised to correct foreign keys in Token subclasses to be OccurrenceID instead of OrganismID.]
Which looks almost the same as @jbstatgen's first diagram, except for the use of arrows, I think.
Only modification I'd make is that the Token box should also have an arrow pointing to Identification. Also, I'm a little fuzzy in my own mind about the exact relationship between Organism and Token. E.g., must it always pass through (at least) one Occurrence? So far, that's how we do it, and haven't found a need to change it. But does require expanding the scope of "Occurrence" to things like subsampling in a lab, or photo sessions not related to the time and place of the organism in nature.
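To make the question about Organism and Token concrete, here is a minimal sketch of the structure as discussed: Occurrence as the many-to-many association between Event and Organism, with each Token carrying a foreign key to an Occurrence rather than to an Organism. All table names, column names, and sample identifiers are illustrative assumptions and not part of Darwin Core or any published schema.

```python
import sqlite3

# Hypothetical schema; names are illustrative, not Darwin Core terms.
SCHEMA = """
CREATE TABLE event (event_id TEXT PRIMARY KEY);
CREATE TABLE organism (organism_id TEXT PRIMARY KEY);
CREATE TABLE occurrence (
    occurrence_id TEXT PRIMARY KEY,
    event_id TEXT NOT NULL REFERENCES event(event_id),
    organism_id TEXT NOT NULL REFERENCES organism(organism_id)
);
CREATE TABLE token (
    token_id TEXT PRIMARY KEY,
    occurrence_id TEXT NOT NULL REFERENCES occurrence(occurrence_id)
);
"""


def build_example() -> sqlite3.Connection:
    """Create an in-memory database with one organism seen in two events."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO event VALUES (?)",
        [("field-trip",), ("lab-subsample",)],
    )
    conn.execute("INSERT INTO organism VALUES ('org1')")
    # Occurrence rows associate an Event with an Organism (M:M).
    conn.executemany(
        "INSERT INTO occurrence VALUES (?, ?, ?)",
        [("occ1", "field-trip", "org1"), ("occ2", "lab-subsample", "org1")],
    )
    # Each Token references an Occurrence, never an Organism directly.
    conn.executemany(
        "INSERT INTO token VALUES (?, ?)",
        [("specimen1", "occ1"), ("tissue1", "occ2")],
    )
    return conn


def tokens_for_organism(conn: sqlite3.Connection, organism_id: str) -> list:
    """Reach the Tokens of an Organism by passing through Occurrence."""
    rows = conn.execute(
        "SELECT t.token_id FROM token t "
        "JOIN occurrence o ON t.occurrence_id = o.occurrence_id "
        "WHERE o.organism_id = ? ORDER BY t.token_id",
        (organism_id,),
    )
    return [row[0] for row in rows]
```

In this sketch the lab subsample only reaches its Organism because "lab-subsample" is itself modeled as an Event with its own Occurrence, which is the expansion of scope mentioned above.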
Ok, and like @Jegelewicz, I am likewise in favor of the term as given in the original proposal (to stay on topic...)
My preference would be to name the new class term simply Material.
And make a new property term for materialType.
And move to a controlled materialType vocabulary: FossilSpecimen, LivingSpecimen, PreservedSpecimen, and MaterialSample (the latter for tissue and environment samples, etc. as today).
Would we in such a case want to rename materialSampleID to materialID?
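As a rough illustration of what a controlled materialType vocabulary would mean for data validation, here is a minimal sketch. The four values come from the suggestion above; the `MaterialType` enum and the `parse_material_type` helper are hypothetical names invented for this example, not anything defined by Darwin Core.

```python
from enum import Enum


class MaterialType(Enum):
    """Hypothetical controlled vocabulary for a materialType property."""

    FOSSIL_SPECIMEN = "FossilSpecimen"
    LIVING_SPECIMEN = "LivingSpecimen"
    PRESERVED_SPECIMEN = "PreservedSpecimen"
    MATERIAL_SAMPLE = "MaterialSample"


def parse_material_type(value: str) -> MaterialType:
    """Return the matching vocabulary member or raise a descriptive error."""
    try:
        return MaterialType(value.strip())
    except ValueError:
        allowed = ", ".join(member.value for member in MaterialType)
        raise ValueError(
            f"{value!r} is not a controlled materialType value; "
            f"expected one of: {allowed}"
        ) from None
```

A value such as "HumanObservation" would be rejected here, since it describes how a record was made rather than a kind of material.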
I do prefer an explicit strong link to bfo:MaterialEntity, and I do not mind a link to dcterms:PhysicalResource (to be captured in the term comments).
I agree with @deepreef that the "essence" of Organism we want to capture here is different from the material component of it.
(I also think we do need a new class Evidence, Token, or simply Record (for recorded evidence)! And that it would be within the mandate of our task group to propose this).
I realized that I have misused the dcterms: namespace abbreviation when writing my earlier comments in this issue. I think this has unnecessarily complicated the discussion for which I'm sorry.
I have amended my earlier posts accordingly, especially here. I think this now partly resolves the concerns mentioned by @baskaufs (Apologies for stealing your time).
Nonetheless, I stand by my proposal to formally link to bfo:MaterialEntity for the same reasons stated earlier. Applying the same argument as in the proposal made in the top comment, entailments shouldn't be an issue because they are not formally declared in the bag of terms approach.
@cboelling Since you are essentially suggesting a counter-proposal, I think it would be helpful if you would create a new issue using the new term template and fill out exactly what you are suggesting as the term's metadata properties. In particular, how would you propose to make the link to the external terms? In the efficacy justifications you can reference this proposal and succinctly summarize your arguments as to why your proposal is an improvement over importing dcterms:PhysicalResource.
If you are able to do that, I would like to request the members of the TAG to discuss the two proposals. In the past, if there was a well-known external term that captured the essence of what we wanted, we have imported it in preference to creating our own. In cases where our suggested use of the imported term was different or more specific than the use described by the minting organization, we have used non-normative "Notes" (dcterms:description in RDF) or normative "Usage" (skos:scopeNote in RDF) along with non-normative "Examples" (skos:example in RDF) to clarify. Your proposal is somewhat of a departure from this practice and I would like to hear what the TAG thinks about the approach since it would be setting a new precedent.
@tucotuco Can we add a field in the new term form for English label? It's not there presently and probably should be.
@baskaufs The two issue templates have been updated to include "* Term label (English, not normative): ".
Thanks @tucotuco
Sorry to be late in much of what will follow here - not enough time to keep up consistently. There has been a lot of good discussion. Though there is a lot I would say about lots of comments in this issue, I feel compelled to answer questions specifically addressed to me.
@deepreef from https://github.com/tdwg/dwc/issues/421#issuecomment-1333467017
(@tucotuco said) The class dcmitype:PhysicalObject would clearly not work as a broader term (superclass) for dwc:Organism, as an Organism need not be inanimate.
I'm not sure if @tucotuco meant to suggest that dwc:Organism would be treated as a subclass of this new class (whatever the label ends up being), but I would strongly advise against representing it that way.
I would not suggest any subclassing in Darwin Core itself. This is something we once had with the values of basisOfRecord as a formal type vocabulary (similar to what Dublin Core has for the dcmitype: terms as a type vocabulary for dcterms:type) and we abandoned it in favor of deferring any kind of over-arching modeling for a later exercise when we were mature enough technically. We are still not engaged in that exercise in Darwin Core.
@jbstatgen from https://github.com/tdwg/dwc/issues/421#issuecomment-1331364896:
@tucotuco Following your links and looking at your figure 2 (see above) of the README.md with fresh eyes and the discussion of the past days in the back of my mind, I have several questions.
The starting point for my inquiries is that I would like to understand what you need "Organism" for in the GBIF model to have it as subcategory of bfo:Entity? What is the role of the class in the GBIF model? What is it used for?
First, a point of information. Entity in the Unified Model is currently our own fabrication. It is inspired by bfo:entity, prov:Entity, sosa:FeatureOfInterest and dsw:Token, but we have no formal ontological declarations of any kind at this point. By experience, doing that in an early stage of modeling constrains one's world view. We are only now beginning to consider the benefits of aligning formally with specific world views.
In your figure 2 you placed "Organism" in the same column as bfo:Entity and you also describe it as subcategory to bfo:Entity in your last post. However, the definition of dwc:Organism starts out with "Instances of the dwc:Organism class are intended to facilitate linking one or more dwc:Identification instances to one or more dwc:Occurrence instances. ..." This sounds more like the class being an abstract construct, maybe even "only" a tool and not something "real", tangible. It seems thus quite different from bfo:Entity.
In the diagram for version 4.5 of the Unified Model, there is no implied significance of the "columns" of tables. Their arrangement is primarily a practical one to minimize clutter. The meaning is instead captured in the cardinality indicators in the connections between tables. Reading that from the diagram says that Organism (here we do mean sensu Darwin Core) is a subtype of MaterialEntity, which is a subtype of Entity.
That quote above about dwc:Organism is from the (non-normative) Comments for the term, not from the (normative) Definition. The Comments apply specifically in the Darwin Core context, which is flat with respect to describing Occurrences (the Event, the Location, the evidence, the Organism, the Identification, the Taxon are all part of one wide undifferentiated row). For anything fancier, an extension attached within the confines of the star schema constraint is required.
Just because an Organism is a MaterialEntity doesn't mean it can't be more than that. In fact, it must be, or we wouldn't bother using a separate class for it. We believe Organisms can also be Agents, for example. In the Unified Model, Organisms are not just part of a flat Occurrence with inferred links to other concepts. Some or all of the material remains of an Organism can also provide the evidence for an Identification. An Organism is also the entity/Entity/featureOfInterest of an Occurrence and its material remains can provide the evidence for that (digital evidence via DigitalEntities can also).
At the same time, bfo:Entity is the top category for bfo:Continuant and bfo:Occurrent. Yet, you place "Occurrence" to the side in a new column by itself, suggesting that it is something quite different.
Again, the arrangement in columns is not meant to have special significance and formal alignment with BFO is not established. The significance is in cardinality of the relationships provided. In the Unified Model, an Occurrence is a subtype of Event. It's a special kind of Event in which there was evidence of an Organism having been within a Location during some period of time.
Finally, I have become a fan of the PROV model, since I like to see the DES and the future of data modeling as fundamentally transactional with an event-based data model at their heart. The colored boxes denote the three foundational elements of the PROV standard: entities, agents and activities. In PROV they are all directly connected in a kind of triangle. In the GBIF model you are placing "Occurrences" in between the "Entities" and the "Activities". I don't understand why you need them there. I would appreciate it if you could explain and maybe provide an example.
In the Unified Model, as a subtype of Event, an Occurrence is a prov:Activity where the prov:Entity is the Organism and there are various possibilities of prov:Agents associated with that connection, such as observer, collector, photographer, etc.
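A small sketch may help readers follow the subtype relationships and the PROV reading described in these replies. It is only an illustration under stated assumptions: the class names mirror the concepts named in the thread, but the field names, the `prov_view` helper, and the sample identifiers are hypothetical, and nothing here reflects an actual Unified Model implementation or a formal alignment with PROV or BFO.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entity:
    entity_id: str


@dataclass
class MaterialEntity(Entity):
    """A subtype of Entity, as read from the diagram's cardinalities."""


@dataclass
class Organism(MaterialEntity):
    """A subtype of MaterialEntity (sensu Darwin Core)."""


@dataclass
class Event:
    event_id: str
    location: str
    time_period: str


@dataclass
class Occurrence(Event):
    """An Event with evidence of an Organism within a Location and period."""

    organism: Organism
    # Hypothetical field: maps a role (observer, collector, ...) to an agent id.
    agents_by_role: Dict[str, str] = field(default_factory=dict)


def prov_view(occurrence: Occurrence) -> dict:
    """Read an Occurrence in PROV terms: activity, entity, and agents."""
    return {
        "prov:Activity": occurrence.event_id,
        "prov:Entity": occurrence.organism.entity_id,
        "prov:Agent": sorted(occurrence.agents_by_role.values()),
    }
```

For example, an `Occurrence` built with a collector and a photographer yields one activity, one entity, and two agents, which is the PROV triangle with the Occurrence itself playing the activity role.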
For me, reality at some place and time results in an occurrence. In our perception we might focus on the rock facies component of reality (gneiss and basalt as igneous rocks, not sandstone) and not the organism(s) growing on its surface (e.g. algae and lichens). The rocks and organisms can be classified according to some relationship/similarity/ancestry scheme. Up to now, the rock facies and organisms are abstract, general (universal? there is an expression for this) concepts. Once we move in an activity/event from the concepts to the specific instances, we have an empirical fact to collect, preserve and share, an entity.
Thus, I would at this point argue that dwc:Organisms are of a different "quality" than bfo:Entity, though will be happy to better understand your perspective on this.
I hope my responses help. Let me know if they leave any doubts.
@stanblum from https://github.com/tdwg/dwc/issues/421#issuecomment-1333467017:
I have to apologize, too. I modified my previous diagram a bit to accommodate some of the comments, but I hesitate to post it here and continue this thread about Occurrence, Event, Token, etc., because it's essentially peripheral or even irrelevant to the primary issue here -- the proposal to import(?) dcterms:PhysicalResource. @tucotuco, should we move these comments about Occurrence to a discussion in DwC (not an issue)?
It might have been a great idea to start this conversation in a Github Discussion in this repository, but given that all of this commentary has been transmitted via email to anyone who is watching the Darwin Core issues (with links to specific comments and such), it would be problematic to move them. I think that if we can periodically provide a summary-so-far comment of the stuff directly relevant to the proposal we should be fine staying here in this issue with the diverse connected conversations. If we come to any concrete conclusions about how to save the (modeling) world on related topics, we should probably do our best to make that happen in other appropriate places as well. For the Unified Model stuff, that would be GBIF's Discourse forum.
@baskaufs:
@cboelling Since you are essentially suggesting a counter-proposal, I think it would be helpful if you would create a new issue using the new term template and fill out exactly what you are suggesting as the term's metadata properties. In particular, how would you propose to make the link to the external terms? In the efficacy justifications you can reference this proposal and succinctly summarize your arguments as to why your proposal is an improvement over importing dcterms:PhysicalResource.
In short, I see 3 alternatives:
1. dcterms:PhysicalResource (this proposal)
2. bfo:MaterialEntity
3. dwc:MaterialEntity as a separate resource under control of DwC but terminologically (through the term label) and conceptually (through the (possibly adapted) definition) informally linked to bfo:MaterialEntity
I can do as you suggest but I would like to run this by the Material Sample Task Group. In my opinion, incorporating a top level term for material entities with accompanying metadata and documentation is the key result of this chartered task group, whichever alternative the TG gets behind.
In cases where our suggested use of the imported term was different or more specific than the use described by the minting organization, we have used non-normative "Notes" (dcterms:description in RDF) or normative "Usage" (skos:scopeNote in RDF) along with non-normative "Examples" (skos:example in RDF) to clarify.
These don't seem to be part of the new term form or is there a mapping?
@cboelling The new term template is a little unclear about this, sorry. "Usage comments (recommendations regarding content, etc., not normative)" in the template is what ends up in the Notes field in the Darwin Core List of Terms (and List of Terms documents for other vocabularies) and in the Comments field in the Darwin Core Quick Reference Guide. It is the dcterms:description value in the RDF. Darwin Core does not (yet) have a usage field (skos:scopeNote) for any terms because few of its terms are imported from other vocabularies. It has been used commonly in Audubon Core, which borrows many terms whose definitions are set outside of TDWG and therefore sometimes needs to provide normative guidance on how these terms should be used in the TDWG context. So "usage comments" in the Darwin Core template does not correspond to the "Usage" field as it appears in the Audubon Core list of terms.
These patterns are historical artifacts and we probably should get our act together to make the terminology more consistent across documents. Hope this helps clarify.
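Since the labels are easy to confuse, the correspondence described above can be written down as a small lookup. This is only a sketch: the label-to-property pairs and their normative status are taken from the comments in this thread, while the `to_statements` helper, the simplified Turtle-like output, and the sample note text are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Document label -> (RDF property, is_normative), as stated in this thread.
FIELD_TO_RDF: Dict[str, Tuple[str, bool]] = {
    "Notes": ("dcterms:description", False),
    "Usage": ("skos:scopeNote", True),
    "Examples": ("skos:example", False),
}

# Label in the Darwin Core new term template -> label in the List of Terms.
TEMPLATE_TO_FIELD: Dict[str, str] = {"Usage comments": "Notes"}


def to_statements(term: str, metadata: Dict[str, str]) -> List[str]:
    """Render term metadata as simplified Turtle-like statements."""
    statements = []
    for label, text in metadata.items():
        # Translate template labels first so "Usage comments" lands in Notes.
        document_label = TEMPLATE_TO_FIELD.get(label, label)
        rdf_property, _is_normative = FIELD_TO_RDF[document_label]
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        statements.append(f'{term} {rdf_property} "{escaped}" .')
    return statements
```

The point of the lookup is that "Usage comments" from the Darwin Core template resolves to dcterms:description, not to skos:scopeNote, which is the mismatch with the Audubon Core "Usage" field noted above.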
This was discussed at length today in the @tdwg/material-sample meeting. At this time, we plan to review the very detailed proposal made by @cboelling and make a decision on which of the three choices he proposed we prefer as a committee. That meeting is scheduled for January 18. We request that this proposal be held at least until then. For more information and to join the discussion see https://github.com/tdwg/material-sample/issues/31
Following the discussion at the January 18 meeting, I would like to request that this proposal be withdrawn (closed) in favor of the proposal for dwc:MaterialEntity soon to be submitted by the MaterialSample task group.
New term
dwc:MaterialSample. Because that term requires an aspect of sampling, it conflates the role of the material (to serve as a sample) and the fundamental type of the resource (that it's a material rather than digital or information resource). This complication has hindered the progress of the group, whose work is now at a critical phase with the need to harmonize its work with that of Latimer Core (currently under review). This proposal would improve the situation by importing into Darwin Core a term from what is probably the most well-known metadata vocabulary: Dublin Core. That term has exactly the scope (material things) that is being considered by the Material Sample Task Group and importing it into Darwin Core would allow the group to clarify its work by describing the kinds of material things, rather than material samples.
Proposed attributes of the new term: