BiodiversityOntologies / bco

Biological Collections Ontology
Creative Commons Zero v1.0 Universal

Define high level observing, sampling, and collecting processes #92

Open ramonawalls opened 5 years ago

ramonawalls commented 5 years ago

These terms will go into OBI (see obi-ontology/obi#969), but will be used extensively in BCO, so I would like to reach some consensus among this group before proposing the definitions to OBI.

The proposed (and I think mostly accepted) OBI hierarchy will be:

assay
-specimen collecting process (input material entity, output material entity)
--material sampling process (outputs a physical specimen that is representative of larger population)
-observing process (input material entity, output data)
--observing process based on sampling (input material entity, output data that is intended to be representative of a larger population)
--other kinds of observing processes
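As a rough illustration of the listing above, here is a minimal Python sketch that records each proposed class with its parent, input, and output. The `ProcessClass` type and its field names are hypothetical and are not taken from OBI or BCO. The listing does not settle where assay sits relative to the other terms, so the sketch leaves it as a standalone node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProcessClass:
    """One node in the proposed hierarchy (illustrative only, not an OBI/BCO artifact)."""
    label: str
    parent: Optional["ProcessClass"] = None
    input_kind: Optional[str] = None
    output_kind: Optional[str] = None
    representative: bool = False  # output is meant to represent a larger population

    def ancestors(self) -> List[str]:
        """Return parent labels, nearest first."""
        labels = []
        node = self.parent
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return labels


# Assumption: assay is kept as a standalone node because its placement is unclear here.
assay = ProcessClass("assay")

specimen_collecting = ProcessClass(
    "specimen collecting process",
    input_kind="material entity", output_kind="material entity")
material_sampling = ProcessClass(
    "material sampling process", parent=specimen_collecting,
    input_kind="material entity", output_kind="material entity",
    representative=True)

observing = ProcessClass(
    "observing process",
    input_kind="material entity", output_kind="data")
observing_by_sampling = ProcessClass(
    "observing process based on sampling", parent=observing,
    input_kind="material entity", output_kind="data",
    representative=True)

if __name__ == "__main__":
    for cls in (material_sampling, observing_by_sampling):
        print(cls.label, "->", cls.ancestors(), "| output:", cls.output_kind)
```

In this sketch the two sampling-based classes differ from their parents only in the `representative` flag, which is the distinction the definitions still need to pin down.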

We will then import these classes into BCO and make subclasses specific for biodiversity, ecology, evolution, etc. (i.e., non-biomedicine).

Assay and specimen collecting process have been discussed extensively, and their definitions are stable and useful to many researchers, so I don't want to change those.

The terms that need clearer definitions are:
-material sampling process
-observing process
-observing process based on sampling

I suggest that we use the STATO term statistical sampling process (http://purl.obolibrary.org/obo/STATO_0000502) -- a planned process which aims at assembling a population of observation units (samples) in as unbiased a manner as possible in order to obtain or infer information about the actual population from which these samples have been drawn -- to help define material sampling process and observing process based on sampling.

I will post strawman definitions for discussion in the comments.

robgur commented 5 years ago

I agree it's critical, but an observing process does not have to have an intent towards unbiased sampling, at least in my view. Is there a way to do this more efficiently on a teleconference than by sorting it out on GitHub (or email)?

dr-shorthair commented 5 years ago

@ramonawalls some suggested changes:
-specimen collecting process (input material entity, output material entity)
--material sampling process (outputs a physical specimen that is representative of larger population or entity)
-observing process (input material entity, output data)
--observing process based on sampling (input material entity, output data that is intended to be representative of a larger population)
--other kinds of observing processes

@robgur I was thinking about biased and unbiased sampling earlier. Biased sampling is commonly used in geochemistry, e.g. crushing and then taking all the dense or magnetic grains. So I initially bristled at the definition of sampling that @ramonawalls quoted from STATO, which says that sampling should be unbiased. But I think it's OK: the population being characterized is the heavy/magnetic part of the rock formation (in the geology case), so while the sub-sampling of the initial specimen is biased, it is intended to be an unbiased representation of something else. Does this apply to the cases that you have in mind?

ramonawalls commented 5 years ago

Will try to schedule a call for next week. Working on definitions now.

ramonawalls commented 5 years ago

Everyone who is interested, but at least @dr-shorthair @robgur @pbuttigieg @tucotuco please fill out the doodle poll at https://doodle.com/poll/6zibkfq6nww2spqn ASAP

ramonawalls commented 5 years ago

I did not realize the poll was going to use three-hour blocks, but please supply your general availability, then we can narrow it down.

ramonawalls commented 5 years ago

@dr-shorthair @robgur @pbuttigieg @tucotuco Sorry to make you all do this twice, but please fill out the Doodle poll again, now with times that (more or less) work for all time zones.

robgur commented 5 years ago

I have 12am-2am as one of my choices for the meeting time, and my time zone is set to New York. Are you sure that is right? I mean, that seems like "less" to me re: working for all time zones. -r

dr-shorthair commented 5 years ago

Are you in New York next week Rob? Indeed, that would be the problem. Essentially 19:00 UTC is the only option. https://www.timeanddate.com/worldclock/meetingtime.html?iso=20181016&p1=152&p2=37&p3=179&p4=197&p5=224

robgur commented 5 years ago

No, not in New York but my time zone here in Florida is the same - Eastern Daylight Time. -r

dr-shorthair commented 5 years ago

If Guru wants to join us from Brisbane it is essentially impossible https://www.timeanddate.com/worldclock/meetingtime.html?iso=20181016&p1=152&p2=37&p3=156&p4=197&p5=224&p6=47

ramonawalls commented 5 years ago

Right. I figured there was no time that would work for everyone, so I threw some options out there to see who can make it. Rob, if you can make the times on Monday or Wednesday that Simon has checked, we'll go for that, and I can meet with Pier later as needed.

ramonawalls commented 5 years ago

See the "processes" sheet at https://docs.google.com/spreadsheets/d/1_zrr5IOlOVtCFqTS8FX7dNLff-O_YVyEqGsH9CNH7U0/edit#gid=22595233

dr-shorthair commented 4 years ago

I see that specimens are back in the mix https://github.com/BiodiversityOntologies/bco/issues/94#issuecomment-606877749. So can I re-open the samples and specimens discussion?

AFAIK the collections (museums) community defines a Specimen as a material entity that is explicitly curated. To science and stats practitioners, a Sample is a (usually continuant) entity that is designed to be representative of a larger entity, such as a population or universe (which is also usually a continuant).

Samples are not always specimens - statistical samples in social science are not curated, for example. And samples are not necessarily material entities.

Specimens are not always samples, though I suspect most are, because why would you curate something if it were not representative of a larger truth? (Specimens in a fine-art museum or gallery are representative of 'things of beauty' or some related concept.) The key to its sample-ness is that we can explain that there is a larger entity, related through an isSampleOf relation.

FWIW - a type-specimen is a sample because it is representative of a taxon (?).
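To make the sample/specimen distinction above concrete, here is a small Python sketch. It assumes, as a simplification, that specimen-ness is "material and explicitly curated" and that sample-ness is "has an isSampleOf target". The `Entity` type, its fields, and the example instances are hypothetical, and the type-specimen case carries the same question mark as above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    name: str
    is_material: bool = True
    curated: bool = False               # explicitly curated by a collection
    is_sample_of: Optional[str] = None  # the larger entity it is meant to represent


def is_specimen(e: Entity) -> bool:
    """Collections view (assumed): a material entity that is explicitly curated."""
    return e.is_material and e.curated


def is_sample(e: Entity) -> bool:
    """Stats view (assumed): anything related to a larger entity via isSampleOf."""
    return e.is_sample_of is not None


# A statistical sample in social science: not material, not curated.
survey = Entity("survey responses", is_material=False,
                is_sample_of="city population")

# A type-specimen: curated, and (tentatively) representative of a taxon.
holotype = Entity("holotype", curated=True, is_sample_of="taxon")

# A curated object with no stated larger entity: specimen but not sample.
curio = Entity("uncatalogued curio", curated=True)

if __name__ == "__main__":
    for e in (survey, holotype, curio):
        print(f"{e.name}: sample={is_sample(e)}, specimen={is_specimen(e)}")
```

The two predicates are independent here, which matches the point that neither class subsumes the other.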

ramonawalls commented 4 years ago

For sure, Simon. My work plan is to first update the workflow for Darwin Core imports and do a release with them as modules, then dive fully into specimens, sampling, and observations. I won't make any permanent changes without your input! Will probably schedule a call about it in a few weeks.