tdwg / dwc

Darwin Core standard for sharing of information about biological diversity.
https://dwc.tdwg.org
Creative Commons Attribution 4.0 International

Change term - identifiedBy #492

Open tucotuco opened 10 months ago

tucotuco commented 10 months ago

Term change

Current Term definition: https://dwc.tdwg.org/list/#dwc_identifiedBy

Proposed attributes of the new term version (Please put actual changes to be implemented in bold and ~strikethrough~):

nielsklazenga commented 10 months ago

Can I take this opportunity to also propose a change to the definition? I think a better definition would be something like:

A list (concatenated and separated) of names of people, groups, or organizations who performed the Identification, or all identifications during an Event.

(this is just my first shot at it)

I think the problem with the old definition, which might make it seem that the term cannot be used in the context of an entire Event, is that it also redefines what an Identification is. This is not necessary, as Identification is already defined in Darwin Core. I think it is a bad (re)definition too; I just never noticed it before.

tucotuco commented 10 months ago

It would be great if we can get the definition of dwc:Identification cleaned up as well as part of this process. Proposals welcome.

qgroom commented 10 months ago

Examples (not normative): James L. Patton; Theodore Pappenfuss | Robert Macey

This might seem stupid, but it took me a while to figure out that the semi-colon was an example separator and the pipe was a separator in the example. Perhaps it would be best to just show the second example, to avoid confusion by the likes of me.

A list (concatenated and separated) of names of people, groups, or organizations who performed the Identification, or all identifications during an Event.

So this would just be a long list of unique names with no relevance to their order? Even though that would be a bit ugly, I totally understand why that is the best solution for now. However, perhaps it is best to say that there is no meaning to the order of names.

tucotuco commented 10 months ago

I modified how the examples appear in the first comment. Hopefully this will avoid confusion. It does not change the proposal in any way. In human-oriented documents the punctuation renders the distinct examples on separate lines. For example:

(image: the examples rendered on separate lines)

About order, the data publisher could provide the names in a meaningful order, but the data consumer would not be safe to assume an order for any given record. If there is a show of support for this, additional comments to that effect could be added.
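The point about order can be illustrated with a minimal sketch, assuming the pipe from the example is the separator. The function names `split_agents` and `same_agents` are hypothetical and not part of Darwin Core; the sketch only shows how a data consumer might split the value and compare two records without relying on the order of names.

```python
def split_agents(value: str, separator: str = "|") -> list[str]:
    """Split a concatenated identifiedBy value into trimmed, non-empty names."""
    return [name.strip() for name in value.split(separator) if name.strip()]


def same_agents(a: str, b: str) -> bool:
    """Compare two identifiedBy values while ignoring the order of names."""
    return set(split_agents(a)) == set(split_agents(b))


names = split_agents("Theodore Pappenfuss | Robert Macey")
print(names)  # ['Theodore Pappenfuss', 'Robert Macey']

# A consumer that cannot assume a meaningful order treats these as equal.
print(same_agents("Theodore Pappenfuss | Robert Macey",
                  "Robert Macey | Theodore Pappenfuss"))  # True
```

A publisher may still list the names in a meaningful order; the set comparison only reflects what a consumer can safely assume.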

nielsklazenga commented 10 months ago

Ordering of names is a general issue when dealing with groups of people and whatever is done here should also be done for recordedBy, georeferencedBy and measurementDeterminedBy. I think it would be better to get agreement on how to deal with group agents in general and then update all the relevant terms in Darwin Core at the same time (if an update is considered necessary).

cboelling commented 8 months ago

The current definition of dwc:identifiedBy conflates, as I see it, the semantics of the term and how the relation between instances of dwc:organism and instances of persons, groups of persons, and organizations is serialized, i.e. encoded (e.g., as lists of words in spreadsheets).

A definition which avoids this and is referencing the existing DwC terms, from my point of view, could be:

dwc:identifiedBy == A relation that relates an instance of dwc:organism to a person, group of persons, or corporate entity which has performed an instance of dwc:identification on the given instance of dwc:organism.

The usage notes could specify: If the relation between several instances of dwc:organism and an agent(*), or between several agents and an instance of dwc:organism, is to be expressed in spreadsheet-like documents, the individual instances of dwc:organism and of agents must be separated by a suitable separator. Such an expression is equivalent to expressing pairwise dwc:identifiedBy relations between the referenced organisms and agents.

(*) used as overarching term for person, group of persons, corporate entity
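The equivalence described in the proposed usage notes can be sketched as follows. This is an illustration only: the pipe separator, the organism identifiers `org-1` and `org-2`, and the function names are assumptions, not anything defined by Darwin Core.

```python
from itertools import product


def split_values(value: str, separator: str = "|") -> list[str]:
    """Split a separated cell value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(separator) if item.strip()]


def pairwise_identified_by(organisms: str, agents: str,
                           separator: str = "|") -> list[tuple[str, str]]:
    """Expand separated lists into pairwise (organism, agent) relations."""
    return list(product(split_values(organisms, separator),
                        split_values(agents, separator)))


# Hypothetical organism identifiers; agent names taken from the term examples.
pairs = pairwise_identified_by("org-1 | org-2",
                               "Theodore Pappenfuss | Robert Macey")
for organism, agent in pairs:
    print(organism, "identifiedBy", agent)
```

Two organisms and two agents yield four pairwise relations, which is what "equivalent to expressing pairwise dwc:identifiedBy relations" would amount to under this reading.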

baskaufs commented 8 months ago

One thing that I would add to @cboelling's definition is that it is possible that the identification might be done by a software agent, which I don't think would fall into "person, group of persons, or corporate entity", unless I'm misunderstanding "corporate entity". I'm not sure what the appropriate way of identifying such a software agent would be, but if we are fixing this term, we probably should work that out.

deepreef commented 8 months ago

I had a similar thought as @baskaufs. I think the term "agent" should be used consistently, and then qualified as needed, as something like "a person, defined group of persons, organization, electronic device, software, or other entity capable of asserting an organismal identification". I favor the more general term "organization" over "corporate entity".

Dare I suggest that there ought to be a DwC-defined term for "Agent" (or perhaps even a Class?). In the old days we used the foaf schema as a general model, but I don't know if that's a "thing", or if there is some other external entity that TDWG land has embraced (apologies for being out of the loop this past year, in case this is something already dealt with).

baskaufs commented 8 months ago

Dublin Core has a class dcterms:Agent, with the definition: "A resource that acts or has the power to act." It seems to me that this is a simple and straightforward definition that is uncluttered with the semantics associated with foaf:Agent (see this). Given that DwC already leans heavily on DC terms, it would make sense to me to refer to the DCMI term if necessary.
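As a rough illustration of how a general "agent" could cover the kinds listed earlier in the thread (person, group, organization, software), here is a minimal sketch. The `Agent` and `AgentKind` names, and the software agent name in the example, are hypothetical; nothing here is defined by Darwin Core or Dublin Core.

```python
from dataclasses import dataclass
from enum import Enum


class AgentKind(Enum):
    """Kinds of agent mentioned in the discussion (not a normative list)."""
    PERSON = "person"
    GROUP = "group"
    ORGANIZATION = "organization"
    SOFTWARE = "software"


@dataclass(frozen=True)
class Agent:
    """Something that acts or has the power to act, qualified by kind."""
    name: str
    kind: AgentKind


# Agent name from the term examples, plus a made-up software agent.
identifiers = [
    Agent("James L. Patton", AgentKind.PERSON),
    Agent("ExampleClassifier", AgentKind.SOFTWARE),  # hypothetical name
]
print([agent.kind.value for agent in identifiers])
```

Keeping the kind as a qualifier on one general type mirrors the suggestion to use "agent" consistently and qualify it only where needed.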

deepreef commented 8 months ago

I like it!

nielsklazenga commented 8 months ago

@cboelling's "definition" does not actually tell us what the term means but merely describes how a defined term fits in a particular data model. Also, as dwc:identifiedBy is a string (or its object is), it does not actually relate anything to anything.

@baskaufs's point about identifications by software is well taken and I agree with @deepreef's solution to the extent that we use 'agent' in the definition and describe what an 'agent' is elsewhere. I see now that @baskaufs and @deepreef already worked it out while I was writing this.