obi-ontology / obi

The Ontology for Biomedical Investigations
http://obi-ontology.org
Creative Commons Attribution 4.0 International

Attempting to understand the difference between sample and specimen #1013

Closed cmungall closed 4 years ago

cmungall commented 5 years ago

Context: trying to work out if these should be in obo-core, and if they can be generalized beyond biology, e.g. geosamples.

Both sample and specimen are textually and logically defined in terms of shadow "role" classes. I think this is fine logically, but in my experience this kind of thing is potentially confusing and frustrating to users. If they see that material sample is defined as "A material entity that has the material sample role" it sounds weirdly circular (I know technically it is not circular, but it sounds it). Compounding this is the fact that the role term is in free text, they can't click on it to explore further. As experts we know to look to the equivalence axioms and click there. But many tools may not even show the equivalence axiom. And if we MIREOT this in to obo-core we may lose the context (unless we also shadow the role).

I suggest as a general definitional pattern unrolling the role definition. This violates DRY a little (but then arguably the shadow hierarchy violates DRY).

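Also, the editor note on specimen (https://www.ebi.ac.uk/ols/ontologies/obi/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FOBI_0100051) says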

Note: definition is in specimen creation objective which is defined as an objective to obtain and store a material entity for potential use as an input during an investigation.

It appears that maybe "specimen creation objective" was once a term? I think this editor note is confusing as the language is different from the language in "specimen role" which is where the real definition lies.

So taking the roles as being primary we have this:

Note the single-child structure: to me this is often indicative of a broader issue

OK, so we need to follow "material sampling process". Note this class isn't linked in a logical axiom so I have to do a search in OLS to find "material sampling process"

The definition of which is:

A specimen gathering process with the objective to obtain a specimen that is representative of the input material entity

So I have to admit that after clicking lots of links, exploring lots of parallel hierarchies, and scribbling notes on pieces of paper, I'm still no clearer on what the intended difference between a specimen and a sample is. I think that a non-sample specimen is any specimen that is not representative of the input. But this is completely contrary to the definition of specimen.

Minor additional issues:

  • why is sample population a synonym of sample?
  • is "term editor: OBO workshop" a useful piece of metadata?
  • Example of usage for specimen says "Biobanking of blood taken and stored in a freezer for potential future investigations stores specimen." - I can't quite parse this, not sure if it's just me
  • The editor note on specimen role is a bit impenetrable
  • formally the examples of usage on specimen role are not examples of roles but examples of material entities that have the roles

I realize it's easy to be critical, I appreciate the careful work that has gone in here, but I am guessing that a lot of this was done a while ago when as a community we may have been overly influenced by certain overly formal ways of doing things. Some suggestions for things to make this easier for users:

  • reduce duplication over parallel hierarchies (role, IC, process).
  • where shadow hierarchies exist, unroll text definitions so users don't have to search and click and manually append definitions
  • if a logical equivalence axiom exists, consider if the text definition needs to be stated so formally, given you already have the formal definition
  • if there is no logical definition, but there is a formal-sounding text definition, consider if logical axioms are missing or if the text definition is "pseudo-formal"
  • test things on users

Hope this helps.. but back to my original question, what differentiates sample from specimen?

bpeters42 commented 5 years ago

specimen: things collected and stored in a way that they could be examined

sample: A specimen that can be considered an example of a broader whole

So a stone collected from a beach in San Diego in 1999 could be considered a 'San Diego beach stone sample', or also a 'meteorite sample' or a '1999 stone sample'. Specimens get collected and stored. Often the designation and purpose of them changes. Their identity doesn't. That is what we are trying to capture. 'specimen' has no intent. 'sample' does.

-- Bjoern Peters Professor La Jolla Institute for Allergy and Immunology 9420 Athena Circle La Jolla, CA 92037, USA Tel: 858/752-6914 Fax: 858/752-6987 http://www.liai.org/pages/faculty-peters

cmungall commented 5 years ago

Thanks, this helps.

I think when you say a specimen has no intent, you mean it doesn't necessarily have intent (since specimen is the superclass of sample)

But is this all in line with dictionary definitions of specimen? E.g. https://www.merriam-webster.com/dictionary/specimen "an individual, item, or part considered typical of a group, class, or whole"

I'm still having a hard time conceiving of a non-sample specimen.

ddooley commented 5 years ago

I agree "sample population" shouldn't be a synonym of sample.

Your point is that even typing a specimen to a class entails that it is a sample of the class? "The bacterial specimen" => "sample of bacteria". So something remains a specimen in the pure sense only while it is referenced by an identifier.

I had framed it like: A 'sample' references a thing with respect to some sample set which has membership criteria that define the population it is talking about. A 'specimen' references a thing in the context of its collection process, storage, retrieval or material processing. I was going to argue that these processes don't need to know anything about a specimen but then realized a process is only applied to compatible types of specimen.

cmungall commented 5 years ago

Your point is that even typing a specimen to a class entails that it is a sample of the class? "The bacterial specimen" => "sample of bacteria"

I'm just looking for differentia between these two, and an example of a bacterial specimen that is not a sample

So something remains a specimen in the pure sense only while it is referenced by an identifier.

Sorry, I'm not following

I had framed it like: A 'sample' references a thing with respect to some sample set which has membership criteria that define the population it is talking about. A 'specimen' references a thing in the context of its collection process, storage, retrieval or material processing. I was going to argue that these processes don't need to know anything about a specimen but then realized a process is only applied to compatible types of specimen.

If I follow this then I think there may be some lurking logical problems.

Proposal: merge specimen into sample. If there are particular nuanced distinctions people want to make about collection context, do this on a per subclass or per instance basis

ddooley commented 5 years ago

I mean, I tend now to agree. About the only time I can talk about a specimen without carrying sample semantics is if I talk about specimen #4234. The moment I say what kind of specimen it is, it becomes at the very least, a sample of that kind. I randomly collect specimens. They are now at the very least samples of things randomly collected somewhere... Use of "sample" does imply accompanying sample set membership criteria though. If people want to avoid that in a given conversation, they may be turning to "specimen".

bpeters42 commented 5 years ago

When an autopsy is performed and the liver is extracted to look for signs of damage, that is a specimen. When a surgical biopsy is performed on a suspicious lump in the lung to examine whether it is a metastasis, that is a specimen. Real life example: we are now using such lung specimens that ended up being granulomas rather than tumors to study TB.

cmungall commented 5 years ago

With the biopsy case, why does that not fit your definition of sample? Is not the specimen of tissue representative of the broader whole (e.g. a tumor)? I agree it is perhaps odd to call the whole extracted liver a sample but this just seems like a terminological quirk.

If this is an important distinction (and I still don't understand it, or why it's important to make this distinction, sorry), then I think there should be a class for the non-sample specimen, otherwise you have a single-isa-child situation:

  • specimen
    • sample

Barry Smith has long pointed this out as an anti-pattern, and I agree. For example, you can't explicitly assert that you have a non-sample specimen (without using an awkward ComplementOf construct) and consequently you can't query for these (due to the open world assumption, you can't just query for direct assertions to the class "specimen" since these may still be samples).
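
The query problem described above can be illustrated with a small Python sketch. This is a toy model, not OWL and not anything taken from OBI: the individuals (`liver_001`, `blood_002`, `mass_003`), the `KB` structure, and the `negated` field (standing in for an explicit ComplementOf-style assertion) are all hypothetical. It only shows why a direct assertion to "specimen" cannot be read as "not a sample" under the open world assumption.

```python
from typing import Optional

# Hypothetical toy knowledge base (illustrative names, not OBI identifiers).
# Each individual lists the classes it is asserted to belong to, and the
# classes it is explicitly asserted NOT to belong to (a stand-in for a
# ComplementOf-style axiom).
KB = {
    "liver_001": {"asserted": {"specimen"}, "negated": set()},
    "blood_002": {"asserted": {"sample"}, "negated": set()},
    "mass_003": {"asserted": {"specimen"}, "negated": {"sample"}},
}


def is_sample(individual: str) -> Optional[bool]:
    """Open-world answer: True, False, or None when the KB cannot tell."""
    facts = KB[individual]
    if "sample" in facts["asserted"]:
        return True
    if "sample" in facts["negated"]:
        return False
    # Asserted only as 'specimen': it may still turn out to be a sample.
    return None


def provable_non_sample_specimens() -> list:
    """Individuals that can be proven to be specimens but not samples."""
    return sorted(name for name in KB if is_sample(name) is False)


print(provable_non_sample_specimens())
```

In this sketch only `mass_003` is returned. `liver_001` is asserted directly to "specimen" but is left as unknown, which is the situation a named class for the non-sample specimen would avoid.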

And I am not convinced the terminology is right here. The dictionary definition of specimen seems to talk of what OBI calls samples, and descriptions of the biopsy situations routinely talk of samples. E.g. https://www.cancer.org/treatment/understanding-your-diagnosis/tests/testing-biopsy-and-cytology-specimens-for-cancer/what-happens-to-specimens.html

I appreciate there was likely a rich history of discussion back when these terms were created a while ago, but these are not easily available to users now.

Hypothetically, if specimen were merged into sample, is there a use case that could not be supported?

bpeters42 commented 5 years ago

The case I am describing is not a biopsy, but surgical resection of the entire mass discovered in a patient. The resulting material is called a 'surgical specimen'. It is not representative of any 'whole'.

The need for both 'sample' and 'specimen' comes from the desire to include in the definition of the sample what the whole is that is being sampled, because we want to model the process of inductive reasoning: how data from measurements on samples is used to draw conclusions about the whole. However, we then ran into the question of at what point that connection between sample and whole is made. In many cases, such as biobanking, specimens are collected and stored in order to make them available for future studies that are completely undefined when the specimens are collected. The physical collection processes (e.g. surgical extraction or blood draws into different tubes) are exactly the same whether or not there is a set intended sampling. We wanted the ability to talk about these collection processes in general, though, and found that everyone was comfortable with 'specimen collection' as a parent, with the sample role coming into being in the context of a study design (which may exist prior to or after the specimen collection).
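
One way to read the description above is as a data model in which identity is fixed at collection and the sample role is attached later by a study design. The following Python sketch is only an illustration under that reading; the class `BankedSpecimen`, its fields, and the study and population names are hypothetical and do not correspond to OBI terms or axioms.

```python
from dataclasses import dataclass, field


@dataclass
class BankedSpecimen:
    identifier: str          # stable identity, fixed when collected
    collection_process: str  # e.g. "blood draw"; same with or without a study
    # Each entry pairs a study design with the whole it samples.
    sample_roles: list = field(default_factory=list)

    def assign_sample_role(self, study: str, sampled_whole: str) -> None:
        """Attach a sample role in the context of a (hypothetical) study."""
        self.sample_roles.append((study, sampled_whole))

    @property
    def is_sample(self) -> bool:
        """True once at least one study design has given it a sample role."""
        return bool(self.sample_roles)


# Collected for biobanking, with no study defined yet.
banked = BankedSpecimen("specimen-0001", "blood draw")
was_sample_at_collection = banked.is_sample

# A later study design treats it as representative of some whole.
banked.assign_sample_role("hypothetical study A", "hypothetical donor population")
is_sample_after_study = banked.is_sample
```

The identifier and collection process never change; only the list of roles grows, which mirrors the point that the collection process is the same whether or not a sampling is intended.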

We are behind in all of our documentation efforts, such as this.

If you insist that specimen should be dropped for a less

ddooley commented 5 years ago

A paper on pooled sampling uses "individual sample" and "specimen" / "individual specimen" interchangeably, and talks about them as members of a pooled sample, but never as members of a "pooled specimen": https://www.researchgate.net/publication/258063177 (request from authors). A specimen, if not referenced as an individual sample, is the material from which a sample is ultimately derived.

The cost of analysing multiple individual specimens compared to fewer pooled samples is higher within this study, and if cost is the major restraint, than pooled samples should be considered. However, thought should be given to storing specimens individually, as this allows for specimen variance to be analysed regularly. These sampling strategies can be used as a basis for making decisions on sample storage for retrospective studies made at a later time.

... using only one pooled sample (material from 12 individual specimens). However, when CVt is as low as 16%, just one individual sample is needed to meet the set criteria.

Linguistically, reference to a specimen carries no implicit relation to some sample set, whereas reference to a sample always implies some sample set criteria. This could be formalized logically, but would require a new 'sample set' term:

sample: equivalentTo 'specimen' and 'member of' some 'sample set'.

This doesn't describe the sample set membership criteria, but that would be the stuff of further axioms.
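
As a rough illustration of the proposed equivalence, here is a minimal Python sketch. It is a closed-world simplification (an OWL reasoner would treat missing membership as unknown rather than false), and the `Entity` class, its fields, and the example names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    is_specimen: bool = False
    # Names of the sample sets this entity is a member of (hypothetical).
    sample_sets: set = field(default_factory=set)


def is_sample(entity: Entity) -> bool:
    """sample equivalentTo 'specimen' and 'member of' some 'sample set'."""
    return entity.is_specimen and len(entity.sample_sets) > 0


lone_specimen = Entity("specimen_a", is_specimen=True)
set_member = Entity("specimen_b", is_specimen=True, sample_sets={"sample_set_1"})
non_specimen = Entity("thing_c", is_specimen=False, sample_sets={"sample_set_1"})
```

Here `set_member` is classified as a sample, while `lone_specimen` (no sample set) and `non_specimen` (not a specimen) are not. The membership criteria of each sample set are deliberately left out, as in the proposal.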

johnwjudkins commented 4 years ago

Discussed 2019-10-28. Should be discussed in OBO-Core. Damion will work on documentation.

zhengj2007 commented 4 years ago

related issue: https://github.com/obi-ontology/obi/issues/620

ddooley commented 4 years ago

p.s. Here's a diagram I'll send out for comment that shows the distinction between specimen and sample: [image]