tdwg / dwc

Darwin Core standard for sharing of information about biological diversity.
Creative Commons Attribution 4.0 International

Establish Task Group for MaterialSample #358

Closed tucotuco closed 2 years ago

tucotuco commented 3 years ago

@Jegelewicz has tentatively agreed to convene a task group around the subject matter of MaterialSample as a result of discussions in issue #314. One of the first orders of business is to figure out who will be the charter core members of the task group. These should all be people who are willing to contribute effort to establishing and achieving the goals of the task group. This message can serve as a call for such help. I will certainly participate.

Somewhat hand-in-hand with core membership is establishing a reasonable draft of the scope of the task group. This will help people to decide their level of interest and whether they can commit to providing effort. It will also help in determining the Interest Group under which the Task Group should be chartered. The Executive Committee with the guidance of the Technical Architecture Group can help determine this definitively when the charter is presented or before. Ideas for the scope and tasks can be mined from issue #314 and other issues referenced there and under the Task Group - MaterialSample label. This message also serves as a call for help to define the scope.

The Task Group is an ephemeral entity. Its existence and duration are meant to accomplish something tangible. The scope can adapt over time as necessary, but should be established with a reasonably achievable set of goals and deliverables for which a timeline can at least be estimated.

I would recommend that one ingredient of a successful task group is to delegate known tasks among core members from the outset. That way people who are responsible, for example, for reporting, can have their objective in mind from the outset and keep on top of it rather than having to scramble with the onerous task of trying to pull together information post-facto.

The details of task groups from the standards perspective can be found in the TDWG Process By-laws.

deepreef commented 3 years ago

EXCELLENT!!! Thank you, @Jegelewicz !!!!

I am happy to volunteer as a charter core member (if others agree, of course).

baskaufs commented 3 years ago

I would like to have a little clearer picture about the scope of this proposed group. It seems to me that the preliminary name of the group (MaterialSample) implies that the scope of the group's task is narrower than what would probably be necessary to actually address all of the issues raised in #314. In my view, what would really be required is essentially working out the data/graph model near the center of the TDWG universe. That would therefore overlap with issues related to that:

  1. Dealing with the limitations of the star schema of DwC-A.
  2. Other groups who are interested in an overarching model for TDWG (e.g. ABCD).
  3. Differing needs/requirements of Linked Data approaches vs. relational database approaches vs. spreadsheet approaches.
  4. Concerns of the museum-centric contingent of TDWG vs. concerns of the human/machine observation contingent.

There are probably others that are not popping into my head at the moment.

That isn't to say that the task isn't important and useful, but if it is really as broad as I'm thinking, then the number of stakeholders who should be involved and the amount of time to finish could potentially be large.

This is a topic that I'm interested in, and therefore would like to participate. However, I have a leadership role in two other TDWG interest/task groups that are trying to wrap up their work on major initiatives in the next 6 months. So my ability to participate would be somewhat limited during that timeframe.

robgur commented 3 years ago

Steve, Teresa, John, Rich, et al. ---

I was involved in the original proposal for a MaterialSample term and have been quietly following the discussions. I agree with Steve about the challenge of the scope of this group, and whether and how to define narrower, achievable goals. Still, that is one remit of the Task Group and if it turns out that all paths require a much deeper dive, that is important information as well. So, I guess I am "volunteering" to be involved somehow, in whatever way seems best.

I will remind people (see below for the gory details) that the original intent of the addition of the materialSample terms was to provide a means to represent samples more broadly than conceived in the context of Darwin Core in 2013, when the proposal was made. I do think it is high time to come back to some of these discussions.

Best, Rob

New Term Request: Material Sample

This is a proposal for two new terms in Darwin Core, relating to the addition of the concept, “Material Sample”, described by the identifier The two terms are:

1) A new BasisOfRecord term MaterialSample with label “Material Sample” that references

2) A new Darwin Core property term, MaterialSampleID.

Submitters: John Deck, Rob Guralnick and Ramona Walls


The current values in the DwC Type Vocabulary ( do well in representing some types of biocollections and observations. However, the more general notion of a sample is not well represented, because the existing terms are too specific. For example, the DwC terms “Preserved Specimen”, “Fossil Specimen”, and “Living Specimen” are appropriate for use in the museum community but assume particular properties pertaining to museum collections, which “material samples” may or may not have. Examples of “material samples” we are considering (beyond the examples above) are surveys that involve soil and water sampling, bulk sampling of specimens from, e.g., trawls, microbiological sampling, metagenomics, etc. These sampling approaches often rely on field sub-sampling processes and laboratory techniques (e.g., DNA extraction and sequencing) which transform the physical material and produce distinct information content and thus represent a type of information that is distinct from what DwC has typically dealt with. The proposal for adding “Material Sample” as a DwC class is to maintain consistency with the way Darwin Core terms are managed and organized. This term comes from the Ontology for Biomedical Investigations (OBI) class OBI:specimen. We use the class concept definition directly from OBI but provide the more familiar label “Material Sample” for use within the biodiversity community and annotate how that definition applies in the domain of biological collections.

A “material sample” can pertain to general matter in which organisms may exist, in whole, in part, or in conjunction with many other organisms. The “material sample” may exist for a brief period, such as a tissue that is converted to extracted DNA. It may also represent a collection of multiple taxa, such as a soil or water sample that is used with the intention of describing the diversity of organisms, whether the actual organisms are later recovered from such a sample, or whether that sample is processed in order to generate a set of derivatives from organisms (e.g. 16S sequences from a metagenomics run). A “material sample” may also yield connections to other indicators of biodiversity aside from taxa, such as a transcriptome, indicating which DNA is actively being expressed at a particular point in time.

For the purposes of biological collections, we can think of “material sample” as any type of matter that we can use in order to derive further evidence needed for identification of taxa, whether it is taxonomically homogeneous, heterogeneous, a single individual, sets of individuals, or populations. However, the definition of the term does not exclude its use in broader contexts outside the scope of biological collections.

How is the term “Material Sample” different from “Individual”? The intent of individualID is fairly clear: since an Occurrence represents an organism at a place and time, the individualID term allows us to assign an instance identifier for a particular organism that can be present at multiple events. MaterialSampleID, on the other hand, is intended to allow users to say that the basis of an occurrence is a material entity (i.e. matter) that has been sampled according to some particular method. Whether or not this material entity is an individual (sensu individualID in DwC) represents an independent axis of classification. There is no restriction on specifying that an occurrence is associated with more than one type, so any occurrence can have both an individualID and a materialSampleID.
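As a rough illustration of the two independent axes described above, a single occurrence record could carry both identifiers. This is only a sketch using a plain Python dictionary, not an official Darwin Core serialization; the helper `classification_axes` and every identifier value are invented for illustration.

```python
# Hypothetical occurrence record; every value here is made up for illustration.
occurrence = {
    "occurrenceID": "urn:example:occ:1",
    "basisOfRecord": "MaterialSample",
    "individualID": "urn:example:ind:42",        # the organism, which may recur across events
    "materialSampleID": "urn:example:ms:42-T1",  # the sampled matter, e.g. a tissue subsample
}


def classification_axes(record):
    """Report which of the two independent identifiers a record carries."""
    return {
        "is_individual": bool(record.get("individualID")),
        "is_material_sample": bool(record.get("materialSampleID")),
    }


axes = classification_axes(occurrence)
print(axes)
```

Because the two checks are independent, a record may satisfy either one, both, or neither.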

Adding this term will help align DwC to two other significant projects: the Ontology for Biomedical Investigations (OBI), from which we will be adapting this term, and the MIxS family of checklists.

The MIxS vocabulary is proposing to adopt MaterialSampleID by clarifying the existing term source_mat_id to read:

“A unique identifier assigned to a material sample (as defined by, and as opposed to a particular digital record of a material sample) used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples. The INSDC qualifiers /specimen_voucher, /bio_material, or /culture_collection provide additional context and suggested syntax for this identifier for data submitted to INSDC databases.”

The MIxS source_mat_id term clarification proposal is pending based on the outcome of this proposal.

Connecting a DwC Record to a MIxS record would have the advantage of aligning DwC terminology (geospatial, taxonomic) with sequencing terminology (investigation, environment, nucleic acid sequence source, sequencing) and with OBI (investigation, roles, processes), using “Material Sample” as the pivot point between the standards.
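The "pivot point" idea can be sketched as a simple join: a DwC record and a MIxS record are linked when the DwC materialSampleID equals the MIxS source_mat_id. The function `link_by_material_sample` and all record values below are assumptions made for illustration, and real records carry many more fields.

```python
# Hypothetical records; all values are invented for illustration.
dwc_records = [
    {"materialSampleID": "urn:example:ms:1", "scientificName": "Puma concolor"},
    {"materialSampleID": "urn:example:ms:2", "scientificName": "Lynx rufus"},
]
mixs_records = [
    {"source_mat_id": "urn:example:ms:1", "seq_meth": "Illumina MiSeq"},
]


def link_by_material_sample(dwc, mixs):
    """Merge each DwC record with the MIxS record that shares its sample identifier."""
    mixs_by_id = {rec["source_mat_id"]: rec for rec in mixs}
    linked = []
    for rec in dwc:
        match = mixs_by_id.get(rec["materialSampleID"])
        if match is not None:
            linked.append({**rec, **match})
    return linked


linked = link_by_material_sample(dwc_records, mixs_records)
print(linked)
```

Only the first DwC record has a sequencing counterpart here, so only it appears in the linked result.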


From OBI (( “A material entity that has the specimen role.”

A specimen role ( in OBI is defined as “a role borne by a material entity that is gained during a specimen creation process and that can be realized by use of the specimen in an investigation”. The operative word is “can”. That is, the specimen is not required to be realized by use in an investigation. However, it is worth noting that deposition into a museum or biobank can fulfill the criteria of “use in an investigation”, if necessary (for discussion, see

We have chosen to use the label “Material Sample” instead of using the OBI label “Specimen” for this definition. This allows us to distinguish this term from other types containing the word “Specimen” currently in use in the Darwin Core vocabulary, which have their own meaning, distinct from the concept we are proposing. In the natural history community, biological specimens have a colloquial meaning, typically referring to a voucher held by a biorepository for research. We intend a more inclusive definition, and thus, when we refer to “DwC Material Sample” here, we are actually referring to the class of entities defined by “OBI Specimen”.

In order to clarify how this definition may be considered in a biological collections context, we wish to include a -schema#comment annotation within the DwC vocabulary which would read: “In biological collections, the material sample is typically collected, and either preserved, transformed by some process, or destructively processed” Further clarification on the use of this term, including this document, would be provided in the supplementary documentation and the Darwin Core wiki.

Comment: N/A

Refines: N/A

Has Domain: N/A

Has Range: N/A

Replaces: N/A


Term Name: MaterialSample


Namespace: http:/

Label: Material Sample

Definition: A resource describing the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed.

Comment: For discussion see .com/p/darwincore/wiki/DwCTypeVocabulary (there will be no further documentation here until the term is ratified)

Type of Term:


Status: proposed

Date Issued: 2013-03-28

Date Modified: 2013-05-25

Has Domain:

Has Range:


Version: MaterialSample-2013-06-24




ABCD 2.0.6: not in ABCD (someone please confirm or deny this)

Term Name: materialSampleID



Label: Material Sample ID

Definition: An identifier for the MaterialSample (as opposed to a particular digital record of the material sample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the materialSampleID globally unique.

Comment: For discussion see .com/p/darwincore/wiki/MaterialSample (this page will not exist until the term is ratified).

Type of Term:


Status: proposed

Date Issued: 2013-03-28

Date Modified: 2013-05-25

Has Domain:

Has Range:

Version: materialSampleID-2013-05-25



Class: ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
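The materialSampleID definition above suggests constructing an identifier from other identifiers in the record when no persistent global one exists. A minimal sketch of that fallback follows, assuming the combination institutionCode, collectionCode, catalogNumber and a ":" separator; the proposal does not prescribe either choice, and the function name and sample values are hypothetical.

```python
def build_material_sample_id(record, fields=("institutionCode", "collectionCode", "catalogNumber")):
    """Join the chosen identifier fields with ':'; fail if any is missing or blank."""
    parts = []
    for field in fields:
        value = record.get(field)
        text = "" if value is None else str(value).strip()
        if not text:
            raise ValueError(f"cannot build materialSampleID: missing {field}")
        parts.append(text)
    return ":".join(parts)


# Invented example values.
record = {"institutionCode": "XYZ", "collectionCode": "Tissue", "catalogNumber": "12345"}
material_sample_id = build_material_sample_id(record)
print(material_sample_id)
```

Raising an error on a missing part, rather than emitting a shorter identifier, avoids silently producing values that are less likely to be globally unique.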

Jegelewicz commented 3 years ago

It seems to me that the preliminary name of the group (MaterialSample) implies that the scope of the group's task is narrower than what would probably be necessary to actually address all of the issues raised in #314. In my view, what would really be required is essentially working out the data/graph model near the center of the TDWG universe.

I agree with this sentiment as well as:

the number of stakeholders who should be involved and the amount of time to finish could potentially be large.

Given that, perhaps a good path to take would be to task ourselves with proposing a DwC schema that addresses the limitations of the star schema of DwC-A while keeping in mind the differing needs/requirements of Linked Data approaches vs. relational database approaches vs. spreadsheet approaches and the concerns of the museum-centric contingent of TDWG vs. concerns of the human/machine observation contingent.

Is this doable or are we setting ourselves up for endless discussion and debate? Is there some smaller task we could take on that would start the ball rolling in the desired direction or should we ask for wholesale upheaval?

In part, I think this would create a revision to the definition of Darwin Core which includes:

Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, samples, and related information.

Because I think a lot of what was discussed in #314 operates under a different understanding. Maybe more like

Darwin Core is primarily based on evidence for taxa, as documented by observations, samples, and related information.


rondlg commented 3 years ago

I'd very much like to be included also.

RogerBurkhalter commented 3 years ago

I would also like to be included in this MaterialSample Task Group.

timrobertson100 commented 3 years ago

Thanks, @Jegelewicz. I'd also like to participate, particularly when it comes to issues around the arrangement of data (e.g. star schema limitations).

wouteraddink commented 3 years ago

Hi @Jegelewicz, from the DiSSCo technical team, Matt Woodburn is interested in participating, for alignment with the digital extended specimens infrastructure and the work on TDWG CD.

cboelling commented 3 years ago

Hi Teresa,

On behalf of DINA, where we have identified issues similar to those raised in #314 and in related issues, I'd like to offer to participate in the proposed task group. We agree with @baskaufs that the issue may have implications that extend well beyond dwc:MaterialSample itself and that therefore the scope and deliverables of the group need to be made concrete.

baskaufs commented 3 years ago

Since it looks like this is going to happen, you can put me down on the core member list. Also, @Jegelewicz, I would be happy to advise you on the technical details of task group formation, requirements, etc. since I've been involved in chartering/running a number of them. Just ping me off list if you want to set up a call to talk about what it would involve.

gdadade commented 3 years ago

I'd like to be included on behalf of GGBN and CD.

qgroom commented 3 years ago

I hope this comes under the Observations & Specimens Interest Group, though as has been said, the scope needs further definition. I'd be very happy to be included.

Jegelewicz commented 3 years ago

It seems like this would also be part of

dr-shorthair commented 3 years ago

@baskaufs has drawn my attention to this proposal. I'm not a regular member of this community, but I was a primary designer of related work in OGC (O&M) and W3C (SSN/SOSA) and also have some vision of what is going on in IGSN (originally geology samples, now being used a bit in some adjacent disciplines). So I think I could contribute here.

Jegelewicz commented 3 years ago

Let's get this party started! I have started a draft charter for this group as a Google Document and sent everyone who expressed an interest an email sharing the folder. If you don't get an invitation, please let me know.

afuchs1 commented 3 years ago

Hi Teresa

I would like to participate in this task group if it is not too late to nominate

Cheers Anne Fuchs

Jegelewicz commented 3 years ago

@afuchs1 just sent an invite to the draft charter and a poll for meeting days/times. Welcome!

datadavev commented 3 years ago

@Jegelewicz - I too would very much like to participate in this working group.

Jegelewicz commented 3 years ago

@datadavev can I get your email so that I can share the draft charter with you?

datadavev commented 3 years ago

@datadavev can I get your email so that I can share the draft charter with you?

Yes of course - Thanks

dagendresen commented 3 years ago

I am interested in taking part in this MaterialSample task group.

Jegelewicz commented 3 years ago

@dagendresen just sent an invite to the draft charter and a poll for meeting days/times. Welcome!

ghwhitbread commented 3 years ago

@Jegelewicz Following @baskaufs' logic I would also like to join, at least until the direction is clear.

Jegelewicz commented 3 years ago

@ghwhitbread just sent an invite to the draft charter and a poll for meeting days/times. Welcome!

smrgeoinfo commented 3 years ago

I'd be interested in participating in the task group; my interest would be alignment with wider concept of physical sample we're working on in the iSamples project.

Jegelewicz commented 3 years ago

@smrgeoinfo just sent an invite to the draft charter and a poll for meeting days/times. Welcome!

Jegelewicz commented 3 years ago

First meeting is set for July 21, 2021 at 10AM MDT. Invitation sent to all members today.

dr-shorthair commented 3 years ago

Unfortunately 2am local time for me 👎

ghwhitbread commented 3 years ago

2am! Is this becoming a TDWG thing?



Jegelewicz commented 3 years ago

Apologies, but that was when the most people could attend. If this group goes forward, I think we will need to alternate monthly meetings so that we can schedule reasonable times for people globally with overlap. The scheduling might just be the most difficult part of the whole thing. I will look over the weekend to see if we can schedule a second "initial" meeting that will include those down under!

deepreef commented 3 years ago

MDT = UTC−06:00? So... that would be 16:00 (4pm) UTC?

BTW, a couple of other international meetings that I regularly attend use a system of offsetting the time by 6 hours each consecutive meeting. So, if the first meeting is at 16:00 UTC, the next would be 22:00 UTC, then 04:00 UTC, then 10:00 UTC, then loops back around.

I think that's the best that we in the Pacific/Aus/Asia part of the world can hope for.
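The rotation described above is simple modular arithmetic on the hour of day. A small sketch, with a hypothetical function name and the 6-hour offset taken from the comment:

```python
def rotating_start_hours(first_hour_utc, meetings, offset_hours=6):
    """Return the UTC start hour of each consecutive meeting."""
    return [(first_hour_utc + i * offset_hours) % 24 for i in range(meetings)]


hours = rotating_start_hours(16, 5)
print([f"{h:02d}:00 UTC" for h in hours])
# ['16:00 UTC', '22:00 UTC', '04:00 UTC', '10:00 UTC', '16:00 UTC']
```

With a 6-hour offset the schedule repeats every four meetings, so each region gets an inconvenient slot at most once per cycle.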

dr-shorthair commented 3 years ago

that was when the most people could attend.

Yes, of course. That's democracy.

I love Doodle but it doesn't give visibility of "absolutely impossible" :-D

Jegelewicz commented 3 years ago

Looking at the poll results, we can get @albenson-usgs, @datadavev , @ghwhitbread and @dr-shorthair together at 4PM MDT. I would be happy to meet up with you all then and try to drag in at least one other person who attends the first meeting. Would this work?

rondlg commented 3 years ago

I can be at both if that helps.

tucotuco commented 3 years ago

As can I.

deepreef commented 3 years ago

Assuming that's 22:00 UTC, that certainly works for me.

cboelling commented 3 years ago

See here for time zone comparison of the last proposal.

While I'm generally not available after 7 pm Berlin time, for the first meeting I would be able to meet up until 4 pm MDT (end of meeting), corresponding to midnight Berlin time, if that helps.
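For anyone checking the arithmetic, the conversion can be sketched with Python's standard library. The fixed offsets below are assumptions valid for July 2021 (MDT = UTC-6, Berlin on CEST = UTC+2); a production script would use a time zone database instead of hard-coded offsets.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed summer offsets for July 2021.
MDT = timezone(timedelta(hours=-6), "MDT")
CEST = timezone(timedelta(hours=2), "CEST")

meeting = datetime(2021, 7, 21, 16, 0, tzinfo=MDT)  # 4 pm MDT
in_utc = meeting.astimezone(timezone.utc)
in_berlin = meeting.astimezone(CEST)

print(in_utc.strftime("%Y-%m-%d %H:%M UTC"))
print(in_berlin.strftime("%Y-%m-%d %H:%M CEST"))
```

Under these assumptions, 4 pm MDT is 22:00 UTC, which is midnight at the start of July 22 in Berlin.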

Jegelewicz commented 3 years ago

@cboelling I understand, but the second time is meant for those who cannot make the first time. There would be two meetings.

afuchs1 commented 3 years ago

Hi Teresa

This time would also work best for me

Cheers Anne

Jegelewicz commented 3 years ago

A second meeting invite was sent this morning, intended primarily for those who cannot make the first, but all members were invited. @rondlg @tucotuco it would be great to have you guys at both just so we don't lose information and can keep the conversation focused. Thanks!

Jegelewicz commented 3 years ago

Also - sorry about all of the changes to the second invite - I think it should be correct now.

tucotuco commented 3 years ago

I intend to attend both. Thanks for setting these up.

deepreef commented 3 years ago

So... I got three invites this morning for the following dates/times:

  1. July 19 @ 1500 UTC
  2. July 21 @ 2200 UTC
  3. July 21 @ 1500 UTC

I'm assuming the first of these was in error, but the calendar invite said that both of the latter two were "changed", so I just wanted to confirm.

I am available to attend all three days/times, but want to make sure I've got all the correct dates/times in my calendar.

Jegelewicz commented 3 years ago

@deepreef Sorry for that. The final day/time is July 21 @ 2200 UTC

Jegelewicz commented 3 years ago

Sorry if this is a very dumb question, but can someone please explain the difference between DarwinCore and ?

timrobertson100 commented 3 years ago

Sorry if this is a very dumb question, but can someone please explain the difference between DarwinCore and ?

You will surely get different answers to this, but here is one part of the answer.

Darwin Core is a vocabulary of terms, grouped into categories (e.g. Occurrence) which allow us to build things. Darwin Core Archives are one of the things we've built which allows datasets to be packaged into the "star-based" data structure with all its known limitations, which we're looking into. Darwin Core has been used for nearly 2 decades, so has been embedded in 1000s of institutional systems and processes - which can of course limit ability to change things.
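To make the "star" shape concrete, here is a minimal sketch of one core table with two extension tables, each extension row pointing back at a core row. This is an in-memory illustration only, not how an archive is actually read; the table contents, the `rows_for` helper, and the identifier values are invented.

```python
# Core table: one row per occurrence, keyed by its identifier (values invented).
core = {
    "occ-1": {"scientificName": "Puma concolor", "basisOfRecord": "PreservedSpecimen"},
}

# Extension tables: every row refers to exactly one core row via "coreid".
extensions = {
    "measurement": [
        {"coreid": "occ-1", "measurementType": "length", "measurementValue": "12"},
    ],
    "multimedia": [
        {"coreid": "occ-1", "identifier": "https://example.org/img1.jpg"},
    ],
}


def rows_for(core_id):
    """Collect the extension rows that hang off one core record."""
    return {
        name: [row for row in rows if row["coreid"] == core_id]
        for name, rows in extensions.items()
    }


star = rows_for("occ-1")
print(star)
```

One limitation shows up directly in this shape: extension rows can point only at the core, so there is no direct way to state that the measurement was taken from the image, or to chain a subsample to its parent sample.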

openDS is an idea being built around the Digital Object Architecture. My understanding is that DOA has been around for about 20 years and was opened up around 2015 and put into the newly founded DONA Foundation. The idea is to have a single digital representation for each specimen (a digital twin) that is addressable through a handle (e.g. a DOI) and that can be edited by those with permission through the Digital Object Interface Protocol. The content of the object would be structured into both core (meta)data and also with links to other objects. The model is still fairly immature but looks like it will likely draw on Darwin Core for some of its properties.

Jegelewicz commented 3 years ago

Thank you! That's all gonna take me a while to process.

jbstatgen commented 3 years ago

@Jegelewicz Would you add me as a member of the task group, too?

This will be the first time that I will be active in DwC, thus, I will need to find out what is involved in being a task group member. My path to the question of MaterialSample starts in SPNHC's Biodiversity Crisis Response Committee and its Regional Diversity subgroup working towards a campaign for expanding a Global Collections Network based on GBIF's GRSciColl, collaborating with GBIF's Data Products group. Working on the input options for GRSciColl and reviewing the data currently recorded in GRSciColl for "a couple" of fields with predefined vocabularies, I proposed a potential solution for some of the fields ( to one of the open issues of TDWG's Collections Descriptions Interest Group. In a subsequent email exchange Matt Woodburn from the IG pointed me to the forming MaterialSample-Task Group.

In addition, earlier this week as a small ad hoc-subgroup of people active in the Alliance for Biodiversity Knowledge and the Consultation for the Digital Extended Specimen (DES) concept Phase 2 we were looking into finding a good visualization (and description) for the DES and ended up integrating into our discussion the concepts of "evidence" and "token" from #314. My interest in the DES is for it contributing to the implementation of the Convention on Biological Diversity's post-2020 Global Biodiversity Framework and its monitoring. All kinds of (ecosystem, ecology,) species and genetic diversity data are part of that context and will need to be integrated. With both backgrounds, it seems to make sense to join the task group and I would be happy to be able to do so.

Jegelewicz commented 3 years ago

@jbstatgen I just sent an invite to the draft charter and a poll for meeting days/times. Welcome!

I also sent invites to two meetings - you only need to attend one.

jbstatgen commented 3 years ago

@Jegelewicz Thanks!