ESIPFed / sweet

Official repository for Semantic Web for Earth and Environmental Terminology (SWEET) Ontologies

Act of Sampling #139

Closed: dr-shorthair closed this issue 3 years ago

dr-shorthair commented 5 years ago

Preferred term label

Sampling

Synonyms

Textual definition

Act to generate a material entity, or subset or population, designed to be representative of a larger entity or universe, usually for the purpose of making observations whose intention is to characterize the larger entity or universe.

Generates a Sample (#138)

bco:SpecimenCollectionProcess (from OBO) is a sub-class. stat:Sampling is a sub-class.

See https://www.w3.org/TR/vocab-ssn/#SOSASampling

Suggested parent term

Likely belongs in the reprSciModel module, as mod:Sampling. Probably rdfs:subClassOf proc:Process.

Attribution

https://orcid.org/0000-0002-3884-3420

dr-shorthair commented 5 years ago

See #107 and #127

kaiiam commented 5 years ago

Currently the BCO:specimen collection process has axioms:

has_specified_input some material entity
has_specified_output some specimen

@dr-shorthair am I correct in understanding that this Act of Sampling class is intended to be a superclass of specimen collection process, and that, instead of the axiom has_specified_input some material entity, it would have has_specified_input some thing?

Perhaps also a sister class to specimen collection process, which would have something like has_specified_input some IAO:information content entity?
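The hierarchy proposed in this comment can be sketched roughly as follows. This is only an illustration in plain Python, not BCO or SWEET source: the class names ActOfSampling and StatisticalSampling are hypothetical, and the inputs and outputs are just the ones floated above.

```python
# Hypothetical class names; not taken from the BCO or SWEET sources.
# Each class maps to (parent, specified input, specified output).
# None means "not specified in this thread".
CLASSES = {
    "ActOfSampling": (None, "thing", None),
    "SpecimenCollectionProcess": ("ActOfSampling", "material entity", "specimen"),
    "StatisticalSampling": ("ActOfSampling", "information content entity", None),
}


def ancestors(name):
    """Return the superclasses of `name`, nearest first."""
    chain = []
    parent = CLASSES[name][0]
    while parent is not None:
        chain.append(parent)
        parent = CLASSES[parent][0]
    return chain


def is_subclass_of(child, parent):
    """True when `child` is `parent` or has it among its superclasses."""
    return child == parent or parent in ancestors(child)
```

Under these assumptions, SpecimenCollectionProcess and StatisticalSampling are siblings: both are subclasses of ActOfSampling, and neither is a subclass of the other.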

dr-shorthair commented 5 years ago

@kaiiam indeed, that is pretty much how I see it.

There are siblings to specimen collection, like statistical sampling. I'm not entirely sure that the specified input is 'information', but you are certainly on the right track.

graybeal commented 5 years ago

Remind me again, what are we calling the act of sampling that is statistical, rather than physical?

Following from that question: as a label for this one, I strongly prefer "Act of Physical Sampling" so that (a) I don't constantly think it's the other one and use it incorrectly (where "I" can be replaced by "lots of people"), and (b) I don't have to look it up every time to see what it means.

(Larger concerns in my next comment about the different kinds of sampling…)

graybeal commented 5 years ago

@kaiiam @dr-shorthair Can I clarify with Kai the previous post, which says "Perhaps also a sister class to specimen collection process" but then describes something that I think is a statistical collection process and not a specimen collection process?

As a related observation, my sense of the terminology in oceanography is that 'sampling' is when a part is taken of some material entity. When I am taking a whole organism or a whole rock, as in picking it up off the sea floor or capturing it out of the water, it is often (usually?) called 'collection'. I thought the same applied in terrestrial archaeology (and art/book collection, for that matter).

In the case of whole things being collected, it is not precise to say that the collected thing is 'generated' (a word from the proposed definition). It existed before and it still exists. Its status has changed but not its existence.

So overall, I am not keen on the proposed class relations in the thread. (Or alternatively, the class relations might be great, but the labels and definitions of the superclass would need to be different for it to be clear to all users.) I'm not clear that 'grabbing a physical thing', 'grabbing a part of a physical thing', and 'grabbing part of some data' are sub-classes of the same concept. They feel like substantially different things to me.

dr-shorthair commented 5 years ago

OK - some statistical sampling is also physical sampling, resulting in a specimen that can be stored on a shelf. And some statistical sampling is not - such as selection of a population sample - in the sense that the output is not a physical thing that can be put on a shelf.
So 'statistical' is perhaps an orthogonal concern to 'physical'.

Nevertheless, the underlying intention is consistent - i.e. to be representative of a bigger thing for the purpose of making observations of some characteristic property.

kaiiam commented 5 years ago

I think the idea could be that, at a high level such as OBI's planned process, there could be a class like the one @dr-shorthair is proposing for the general-purpose process of sampling something, of which a class such as BCO:specimen collection process could be a subclass. What I'm saying is that there could be another subclass of the general-purpose sampling class which is purely statistical and whose output, as @dr-shorthair says, is not a physical object. I'd say the output would be something like an IAO:information content entity or one of its subclasses.

smrgeoinfo commented 5 years ago

possible revision of definition: Act to identify an entity intended to be representative of a larger entity or universe, usually for the purpose of making observations used to characterize the larger entity or universe.

Note: the identified entity might be a physical part (specimen) of a larger object (e.g. a water or rock specimen), or an individual or group of individuals from a population.

A rock that is collected from the ocean floor is collected and interpreted (identified) in the context of some larger feature (the oceanic crust). As just a piece of rock, it is not a specimen. What makes it a specimen is the intention that it represents sea-floor sediment, or MORB, or oceanic gabbro, or oceanic mantle.

An organism becomes a specimen when it is collected (identified) with the intention of representing the population of some species. Otherwise, it's a pet, a decoration, or maybe dinner.

A group of people (animals, trees, leaves, bacteria...) is a sample of a population if it is identified as representative of that population based on some intention. Otherwise, it's just a bunch of people (animals, trees, leaves, bacteria...).

graybeal commented 5 years ago

yes, I think you've nailed it @smrgeoinfo . The intention makes it a sample.

The only word in your definition I'm having trouble with is 'identify'. I'm thinking 'obtain', 'derive', or 'collect' is more like it, because 'identify' often means 'attach a unique identifier to', which is related but not the core idea IMHO.

dr-shorthair commented 5 years ago

@smrgeoinfo your explanation is good, but terminology could be better aligned with current practice.

In particular your use of the term 'specimen' in the above comment is problematic. Within the collections community a material object becomes a specimen when it enters a curation arrangement - gets a label, gets put in an archive, etc. I suggest we stick with that.

OTOH sample reflects the intention to be representative of something bigger.

So I would reword your contribution as follows:

Note: the identified entity might be a material part or element of a larger object (e.g. a water or rock specimen), or an individual or group of individuals from a population.

A rock that is collected from the ocean floor is collected and interpreted (identified) in the context of some larger feature (the oceanic crust). As just a piece of rock, it is not a specimen or sample. What makes it a specimen is that it is assigned an identifier and managed as part of a collection. What makes it a sample is the intention that it represents sea-floor sediment, or MORB, or oceanic gabbro, or oceanic mantle. Otherwise, it's a paperweight, doorstop, piece of pavement or potential weapon.

An organism becomes a sample when it is collected (identified) with the intention of representing the population of some species. Otherwise, it's a pet, a decoration, or maybe dinner.
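The distinction drawn here can be sketched as two independent tests on the same object. This is a minimal illustration, not SWEET or SOSA code: the MaterialObject class, its field names, and the identifier value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MaterialObject:
    """Hypothetical stand-in for a collected rock, organism, and so on."""
    label: str
    represents: Optional[str] = None     # larger entity it is intended to represent
    collection_id: Optional[str] = None  # identifier assigned when it is curated

    def is_sample(self) -> bool:
        # A sample is defined by the intention to represent something bigger.
        return self.represents is not None

    def is_specimen(self) -> bool:
        # A specimen is defined by being identified and managed in a collection.
        return self.collection_id is not None


doorstop = MaterialObject("rock")
sample = MaterialObject("rock", represents="oceanic crust")
curated_sample = MaterialObject(
    "rock", represents="oceanic crust", collection_id="hypothetical-0001"
)
```

In this sketch the two properties are orthogonal: doorstop is neither a sample nor a specimen, sample is a sample only, and curated_sample is both.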

gwemon commented 5 years ago

I have always found the term "specimen" problematic, and I think this is because it means something confusingly different for different communities (medicine, geology, biology, ecology, etc.). However, I think I am right in saying that, whichever way a community uses the term "specimen", the specimen will always have originated from an "Act of sampling", whether it is already considered a specimen at the time of collection or becomes one later in the curation process. Therefore I think it is wise to keep the term "specimen" out of the definition of the act of sampling. I believe that establishing relationships between "Act of Sampling" and terms that refer to "specimen" (e.g. BCO:specimen collection process) should be sufficient.

kaiiam commented 5 years ago

@gwemon agreed, hence my suggestion for BCO:specimen collection process (which has a specimen as output) to be a subclass of a term like "Act of Sampling". For the latter I presume we would want something very high level for the input and output, or perhaps none specified?

My understanding of how BCO:specimen collection process works (see figure 4 B of @ramonawalls' 2016 paper Semantics in Support of Biodiversity Knowledge Discovery; the terms have since been updated, so BCO:material sampling process is now BCO:specimen collection process and BCO:material sample is now OBI:specimen) is that the same BCO:specimen collection process can be applied recursively at every step of a scientific specimen collection workflow. In the example from the paper (about marine DNA sampling), it is applied first when the sea water itself is collected, yielding an ocean water sample as the 'specimen' (water in a container); again when the water is filtered (input: ocean water sample; output: filter with biological material on it); and a final time as DNA extraction (input: filter; output: DNA).

@ramonawalls is this correct? Also would there be scope for a general purpose "Act of Sampling" type class suggested by @dr-shorthair in BCO? Perhaps as superclass to BCO:specimen collection process?
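The recursive application described in this comment can be sketched as a chain of steps in which each step's output is the next step's input. This is only an illustration of that reading: the CollectionStep class, the step names, and the "sea water" input of the first step are assumptions, not taken from BCO or the paper.

```python
from dataclasses import dataclass


@dataclass
class CollectionStep:
    """One hypothetical application of a specimen collection process."""
    name: str
    specified_input: str
    specified_output: str


# The marine DNA example, as three applications of the same process.
steps = [
    CollectionStep("collect sea water", "sea water", "ocean water sample"),
    CollectionStep("filter water", "ocean water sample",
                   "filter with biological material"),
    CollectionStep("extract DNA", "filter with biological material", "DNA"),
]


def is_chained(steps):
    """True when every step's output is the following step's input."""
    return all(
        a.specified_output == b.specified_input
        for a, b in zip(steps, steps[1:])
    )
```

Here is_chained(steps) holds because each intermediate 'specimen' is consumed by the next application of the process.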

dr-shorthair commented 3 years ago

@gwemon not so sure. Some specimens in museums were acquired in a much less deliberate way. They do not become scientifically useful until their relationship with a bigger concept is understood (i.e. they become a 'sample'). For some specimens this never happens, and they remain dusty curated curiosities.

dr-shorthair commented 3 years ago

https://github.com/ESIPFed/sweet/blob/master/src/reprSciModel.ttl#L172 has generic Sampling concept