tdwg / dwc-qa

Public question and answer site for discussions about Darwin Core
Apache License 2.0

Agents (People) tables + Authentication via ORCID, from dwchour input form 10/5/2017 9:18:21 #104

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago

A user submitted this information via the Darwin Core Hour webform:

Timestamp: 10/5/2017 9:18:21
Please provide a topic of interest: Agent Activities extension
Are you capable of and interested in participating: Yes
Who else would you recommend to participate in the presentation: Any collection that currently has an agents table, or desires one, GBIF who desires authentication with ORCID
What resources can you point to: https://doi.org/10.3897/tdwgproceedings.1.19829, https://github.com/tdwg/dwc/issues/102, https://github.com/tdwg/dwc/issues/101, https://github.com/gbif/portal16/issues/334
Your name: David P. Shorthouse
Your email: davidpshorthouse@gmail.com
Your GitHub username: @dshorthouse

debpaul commented 6 years ago

Hello @dshorthouse. Would you be willing to present this at a DwC Hour? or other webinar where you could explain to the community what you are envisioning? Or perhaps as a blog post? Many at TDWG would understand what you are wanting. However, it might be good to share examples with the collections community - showing how this could work and benefit everyone. From this you might get some co-conspirators :-)

dennereed commented 6 years ago

Hi there. Curious if there's been any development on this topic. It is strange that neither dwc nor bco includes an Agent class. This would help disambiguate the instances referenced by dwc:identifiedBy and dwc:recordedBy, as well as higher-level entities such as dwc:rightsHolder. The dcmi has an Agent class that dwc could inherit from, or dwc could perhaps draw from foaf. Something that takes advantage of orcid would be great. I notice there is foaf:openid, and there is also an orcid initiative to link to this: https://members.orcid.org/api/news/testers-wanted-orcid-openid-connect

hollyel commented 5 years ago

I'm curious if this overlaps with the TDWG/RDA Attributions working group at all?

dagendresen commented 5 years ago

Maybe specifications for an Agent role is relevant for the TDWG NCD http://www.tdwg.org/activities/ncd/ - rather than including an Agent role inside Darwin Core...?

debpaul commented 5 years ago

@tdwg/dwc-qa @dennereed @dshorthouse there are several issues here that I can see, that make it complicated. Let's see.

  1. entry-level folks aren't going to know what this topic is about, so they will need a basic introduction.
  2. most (not all) collections mgmt software lumps all collectors in one text box, sometimes using a fixed separator, sometimes not. for example, botanists often collect in a group and all collectors present are listed on the label.
  3. very few collections mgmt software pkgs have a one-to-many way to store each of these multiple collectors related to the single collecting event.
  4. do we have a dwc extension that makes it possible to share one-to-many collectors for a record, not to mention other agents (data entry person, georeferencer, etc)?
  5. we want unique (ORCID-type) identifiers for "agents" but again we have more issues like 5a) no place in the software to store them, 5b) lack of community understanding of the issue, and 5c) not many authority files to use (Harvard Botany List is one).
  6. are we talking at the level of the entire collection (NCD), or at the individual record level (DwC) or both?
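To make points 2 to 4 concrete, here is a minimal sketch of how a lumped collectors string might be unpacked into one-to-many rows. Everything in it is an assumption for illustration: the `AgentAction` record, the `split_agents` helper, the `collected` action label, and the choice of `|` and `;` as separators are hypothetical and are not part of any existing Darwin Core extension. The comma is deliberately not treated as a separator, because names written as "Surname, Initials" would be split incorrectly.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentAction:
    """One row per agent per record (hypothetical structure, not a DwC term set)."""
    occurrence_id: str
    agent_name: str
    action: str
    position: int  # order in which the name appeared on the label


def split_agents(occurrence_id, raw, action="collected", separators=r"[|;]"):
    """Split a lumped agent string into one row per agent.

    Assumes a fixed separator was used; free-text strings without one
    come back as a single row and still need human review.
    """
    names = [part.strip() for part in re.split(separators, raw)]
    names = [name for name in names if name]  # drop empty fragments
    return [
        AgentAction(occurrence_id, name, action, position)
        for position, name in enumerate(names, start=1)
    ]


rows = split_agents("occ-1", "Smith, A. | Jones, B.;  Lee, C. ")
for row in rows:
    print(row.position, row.agent_name, row.action)
```

Each row could then carry its own identifier (an ORCID, for instance) once the software has somewhere to store it, which is the gap described in point 5.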

So where to begin? @dshorthouse it would be fantastic to have you present on this topic. When do you think you might be ready to do this? We're ready when you are.

dshorthouse commented 5 years ago

Apologies for the delay, back from vacation. @debpaul, you've done an excellent job of expressing the swirl of issues. The logical starting place in my mind is to establish a TDWG Interest Group (IG) whose goal is to knit these together in a logical way, drawing in more voices so that we get this right. Perhaps the easiest short-term deliverable for this IG is an Agent extension to DwC for use in the IPT, which I've puttered on in a very naive way here: https://github.com/dshorthouse/agents_actions. What is missing everywhere is a controlled vocabulary for the actions agents take throughout the lifespan of a collections object. This is very specific to our domain as collections experts and one we can get a handle on in short order. See https://github.com/dshorthouse/agents_actions/issues/1 as a first pass. The roles Agents have while executing these actions, while important for administrators, are of peripheral importance to us. That sentiment is echoed by RDA's Metadata Standards for attribution of physical and digital collections stewardship.

Where to start? Not having ever proposed an IG, I'd need someone here who has and is willing to help launch it + help establish what are the task groups & how to best coordinate efforts & deliverables.

debpaul commented 5 years ago

Hi @dshorthouse I believe proposing an IG is easier than a TG. There's a template - maybe @wouteraddink or @stanblum can advise on next steps.

  1. would it be worth it to do a DwC Hour to talk about this issue (the need and getting started)? this way we could introduce the issues and the need, but also include the process (how do we begin to move forward)

  2. @dshorthouse you've been looking for collaborators to work on this with you. Have you found any? maybe a dwc-hour would help bring them out.

jmacklin commented 5 years ago

I know Paul Morris has also spent some time thinking about this, based on the Harvard Botanist list. This topic is also very relevant to GRBio, or to whoever takes up this important initiative again, as it is currently non-funded and uncurated :-( This resource should surely be underpinned by NCD, an argument that was difficult to make when it was being conceived because NCD had not been ratified, which led to issues... FYI, the DINA Consortium, which is currently developing a CMS, has gathered similar use cases and is considering a module for agents with many of the properties and relations that have been discussed earlier.

I would discourage an Interest Group and think more about a Task Group under a relevant IG or perhaps a joint effort across several IGs...? The tie to the efforts of RDA and attribution are very relevant. We have to get this nailed down to have any real ability to achieve attribution. Paul may also have thoughts about this as the lead of the TAG.

Best, James

baskaufs commented 5 years ago

I should mention that I've been struggling with some issues related to this in my attempts to collect the metadata necessary to make all of the old TDWG standards conform to the Standards Documentation Specification. In particular, I wanted to describe all of the authors who contributed to the various standards documents. That included finding a well-known URI for each one, which was a struggle. You can see what I ended up with in this table.

Aside from the URI issue, I also struggled with describing the contributors' roles (lead author, author, editor, etc.) and affiliations. An issue with affiliations is that a contributor's affiliation at the time of the contribution may since have changed, and it wasn't clear to me how to express that time-sensitive aspect using well-known properties. More struggles with roles can be seen in another table.

Although these examples relate to contributions to document creation, they are closely analogous to roles in recording occurrences or rendering taxonomic determinations: more than one person may be involved, one person may have a leading role, institutional affiliations may be important but change over time, etc. I suspect that these similar problems have similar solutions, which would be worth considering if a task group works on a vocabulary for agents.

dshorthouse commented 5 years ago

Contributor roles expressed as nouns instead of verbs will pose problems. When expressed as verbs, the scope of the role is immediately apparent and limited to the particular work/item in question. However, the explicit (or implicit) ranks within a role might be a casualty and need additional thought (e.g. lead author = authored_lead ??; primary collector = collected_lead ??). Affiliations are red herrings imho if we're focussed on mechanisms to record identity and attribution. Crossref, DataCite, ORCID & others are working on OrgID, so perhaps we can leave that aspect to others to solve.
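As a rough illustration of the verb-based approach, the sketch below records each attribution as an (agent, action, target) statement and keeps rank as a separate optional field rather than baking it into the verb. This is only one possible way to handle the rank question raised above, not a proposal from this thread. The `ACTIONS` set, the `Attribution` record, the `lead_agents` helper, and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical controlled vocabulary of actions, expressed as verbs.
ACTIONS = frozenset({"collected", "identified", "georeferenced", "transcribed"})


@dataclass(frozen=True)
class Attribution:
    """An agent performed an action on a target (all identifiers are placeholders)."""
    agent_id: str
    action: str
    target_id: str
    rank: Optional[int] = None  # 1 = lead; None = no rank recorded

    def __post_init__(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


def lead_agents(attributions, target_id, action):
    """Return the agents recorded with rank 1 for one action on one target."""
    return [
        a.agent_id
        for a in attributions
        if a.target_id == target_id and a.action == action and a.rank == 1
    ]


statements = [
    Attribution("agent:1", "collected", "specimen:42", rank=1),
    Attribution("agent:2", "collected", "specimen:42", rank=2),
    Attribution("agent:2", "identified", "specimen:42"),
]
print(lead_agents(statements, "specimen:42", "collected"))
```

Because the action is scoped to a single target, the same agent can appear with different verbs on the same object without any noun-style role having to be assigned to the person.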

baskaufs commented 5 years ago

ORCID has pretty rich metadata for people's work and education histories which can be obtained by requesting application/xml when dereferencing an ORCID ID (e.g. https://orcid.org/0000-0003-3127-2722). The rather complicated XML structure allows for capturing the range of dates over which the status applied, source of the record, etc. So that kind of pattern could be followed for any kind of time-sensitive roles like author, collector, determiner, etc. But what would be the sweet spot between overkill on complexity and not capturing adequate information? Identifying the use cases that are relevant would be important to determining that.
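For anyone wanting to try the content negotiation described above, here is a minimal sketch using only the Python standard library. It builds the request without sending it, so nothing here depends on how the ORCID service responds. The helper names `orcid_checksum_ok` and `build_orcid_request` are made up for this example; the check-digit routine follows the ISO 7064 mod 11-2 scheme that ORCID documents for its identifiers.

```python
import re
import urllib.request

ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_checksum_ok(orcid):
    """Check the final character of an ORCID iD (ISO 7064 mod 11-2)."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected


def build_orcid_request(orcid, media_type="application/xml"):
    """Build (but do not send) a request asking for a given representation."""
    if not orcid_checksum_ok(orcid):
        raise ValueError(f"not a valid ORCID iD: {orcid!r}")
    return urllib.request.Request(
        f"https://orcid.org/{orcid}",
        headers={"Accept": media_type},
    )


request = build_orcid_request("0000-0003-3127-2722")
print(request.full_url, request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would actually dereference the identifier; parsing the returned XML is left out because its structure is not described in this thread.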

I'm not so concerned about the exact wording used in controlled vocabulary or metadata property terms. Optimally, one would mint a URI for the term and then provide a clear definition that's associated with the URI, rather than getting hung up on the exact label itself.

debpaul commented 5 years ago

@dshorthouse will the upcoming webinar by you and @diatomsRcool address some / any of this?

dshorthouse commented 5 years ago

> @dshorthouse will the upcoming webinar by you and Anne Thessen address some / any of this?

It might. Anne and I have to scope out what we want to get across in the time we have. What those "Activities" are and how we express them are likely to be discussed.

debpaul commented 5 years ago

Hi @dennereed @baskaufs @jmacklin @dagendresen @hollyel - note that a dwc hour coming up 29 October 2018 addresses needs related to this particular ticket. https://www.idigbio.org/content/darwin-core-hour-attribution-darwin-core-extension-what-would-look

To @dshorthouse I'd like to suggest a dwc hour on bloodhound.shorthouse.net to show / explain the need for / the power of identifiers. Very elegant - what you've set up ;-) and done so that people can easily grasp!

debpaul commented 5 years ago

@mswoodburn and @dkoureas will you be able to join us for this webinar on 29 Oct 2018? https://www.idigbio.org/content/darwin-core-hour-attribution-darwin-core-extension-what-would-look

dkoureas commented 5 years ago

Hi Deb,

Many thanks but unfortunately, 29 Oct is not a good day for me. I chair a session and speak at an IEEE science meeting in Amsterdam.

Kind regards, Dimitris

--

Dr Dimitris Koureas, FLS Programme Director - Department Head International Biodiversity Infrastructures Naturalis Biodiversity Center, P.O. Box 9517, 2300 RA Leiden, NL

Coordinator, Distributed System of Scientific Collections (DiSSCo http://dissco.eu/) Chair, Biodiversity Information Standards Organisation (TDWG http://tdwg.org/) Research Data Alliance (RDA http://www.rd-alliance.org/) - Technical Advisory Board member

ORCID: 0000-0002-4842-6487 | Linkedin: linkedin.com/in/dkoureas Twitter: @DimitrisKoureas | Tel: +31 (0) 71 751 9251
