BioSchemas / specifications

Issue tracker, technical wiki, and example markup
https://bioschemas.org

BioChemEntity Definition #215

Closed AlasdairGray closed 5 years ago

AlasdairGray commented 6 years ago

Currently the definition for BioChemEntity states:

Specification (0.5):

A BioChemEntity is any object that exists in the physical world and cannot be better represented with any other existing type in schema.org. This includes theoretical objects that may potentially be invented/created, but do not currently exist. For example, synthetic possible chemicals. BioChemEntity is a flexible and extensible wrapper for Life Sciences entities.

Latest Draft (0.5-draft):

A BioChemEntity is any object that exists in the physical world representing biological, chemical and biochemical entities, and cannot be better represented with any other existing type in schema.org (for instance there is a full extension for the Medical field).

These definitions are problematic since the addition of a new type, e.g. a Protein type, would require existing markup to be updated.

AlasdairGray commented 6 years ago

Could the following phrasing work?

Any biological, chemical, or biochemical object. For example: a protein; a gene; a chemical; a synthetic chemical; a computationally generated protein annotation.

JervenBolleman commented 6 years ago

This is related to #214. The BioChemEntity is difficult to use for providers due to the way it is rooted singular in the physical world. Most of our databases are related to things in the physical world but are collections of intangibles themselves.

Therefore I would suggest phrasing along the lines of:

Any concept related to biology and/or chemistry. For example information about genes; proteins; chemicals; or organisms.

The modelling of real-world objects is a valid approach, but if that was the goal, why not just move e.g. SIO directly into schema.org? I would definitely enjoy working on that!

Then the question would be does BioChemEntity derive from schema:CreativeWork or schema:Intangible?

kyook commented 6 years ago

I am unclear whether we are defining BioChemEntities as the genes, variations, diseases, proteins, etc. themselves, or whether BioChemEntities include only the related information. In the latter case, how would the actual genes, variations, diseases, etc. be tagged?

If the former, perhaps the following slight tweak to Jerven's definition would work? "Any concept related to biology and/or chemistry. For example genes; proteins; chemicals; or organisms, and pertaining information"?

JervenBolleman commented 6 years ago

@kyook Well, diseases would be health:MedicalCondition, which is in a different schema extension. Chemicals could be health:Substance.

I would say we are describing the information about genes, not the genes themselves. I.e. we are talking about WBGene00012939, not one of the millions of billions of DNA sequences in the physical world that are instances described by that record.

In my opinion we need to describe what is in our databases, and leave it to the reader what can be inferred from that in the real world.

The key part is enabling the modelling of how WBGene00012939 and Q9XX03 relate.
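
A minimal sketch of what such record-level modelling could look like, written as JSON-LD built from Python. Several things here are assumptions and not part of this thread: the types `Gene` and `Protein` and the property `isEncodedByBioChemEntity` come from the present-day schema.org vocabulary rather than from the 0.5 draft under discussion, the helper `record` is hypothetical, and the choice of an "encoded by" relation between the two records is purely illustrative.

```python
import json


def record(schema_type, identifier):
    """Describe a database record (not the physical molecule) as a JSON-LD node."""
    return {"@type": schema_type, "identifier": identifier}


# The two database records mentioned in the comment above.
gene = record("Gene", "WBGene00012939")
protein = record("Protein", "Q9XX03")

# Assumption for illustration only: the protein record points at the gene record
# through schema.org's isEncodedByBioChemEntity property.
protein["isEncodedByBioChemEntity"] = gene

doc = {"@context": "https://schema.org", **protein}
print(json.dumps(doc, indent=2))
```

The markup describes what the two records say about each other and leaves any inference about physical molecules to the reader.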

AlasdairGray commented 6 years ago

@JervenBolleman wrote

the BioChemEntity is difficult to use for providers due to the way it is rooted singular in the physical world. Most of our databases are related to things in the physical world but are collections of intangibles themselves.

Is that due to my use of the word "object" in the definition? I do prefer your use of the word "concept".

Then the question would be does BioChemEntity derive from schema:CreativeWork or schema:Intangible?

BioChemEntity is proposed to extend directly from schema.org/Thing.

I entirely agree with your sentiment that we are using it to mark up the content of our databases rather than the real thing itself.

AlasdairGray commented 6 years ago

@kyook You can find an example of how BioChemEntity is used to mark up a database entry for a protein at https://github.com/BioSchemas/specifications/blob/master/Protein/examples/ProteinEntity-with-context_jsonld.json

JervenBolleman commented 6 years ago

@JervenBolleman wrote

the BioChemEntity is difficult to use for providers due to the way it is rooted singular in the physical world. Most of our databases are related to things in the physical world but are collections of intangibles themselves.

Is that due to my use of the word "object" in the definition? I do prefer your use of the word "concept".

Yes. Of course, in your new definition it is much weaker than before, so it is less of an issue. But I would like the guidance on what the core Bioschemas entity is to be strong.

Then the question would be does BioChemEntity derive from schema:CreativeWork or schema:Intangible?

BioChemEntity is proposed to extend directly from schema.org/Thing.

Which is a fine choice as well. I just feel (not strongly) that rooting it in Intangible or CreativeWork makes it clear that we are talking about a concept rather than a molecule.

JervenBolleman commented 6 years ago

@kyook You can find an example of how BioChemEntity is used to mark up a database entry for a protein at https://github.com/BioSchemas/specifications/blob/master/Protein/examples/ProteinEntity-with-context_jsonld.json

@AlasdairGray this example is riddled with errors and, once interpreted, does not match the current 0.5 Protein specification in key details. I opened issue #218 so someone can fix this.

AlasdairGray commented 5 years ago

There is now a new proposal for how Bioschemas will deal with types in the life sciences. This issue is likely to be resolved with the proposal.

AlasdairGray commented 5 years ago

BioChemEntity is proposed to extend directly from schema.org/Thing.

Please note that schema.org/Thing is the most general type in schema.org. All other types, including schema.org/Intangible, inherit from schema.org/Thing.
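
To make the inheritance point concrete, here is a small sketch of the relevant slice of the hierarchy. The `PARENT` table and the helpers `ancestors` and `is_a` are hypothetical names used only for illustration, and the BioChemEntity entry reflects the proposal in this thread, not a settled specification.

```python
# Child type -> parent type, for the handful of types discussed in this issue.
PARENT = {
    "Thing": None,             # root of the schema.org hierarchy
    "Intangible": "Thing",
    "CreativeWork": "Thing",
    "BioChemEntity": "Thing",  # as proposed in this thread
}


def ancestors(type_name):
    """Return the chain of supertypes, nearest first."""
    chain = []
    parent = PARENT[type_name]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def is_a(type_name, candidate):
    """True if type_name is candidate or inherits from it."""
    return candidate == type_name or candidate in ancestors(type_name)
```

Under this layout, `is_a("BioChemEntity", "Thing")` holds while `is_a("BioChemEntity", "Intangible")` does not, so rooting at Thing makes no commitment either way about tangibility.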

AlasdairGray commented 5 years ago

The most recent definition is:

Any biological, chemical, or biochemical thing. For example: a protein; a gene; a chemical; a synthetic chemical.

Can we close this issue?

AlasdairGray commented 5 years ago

With the outcomes of the May F2F meeting, I believe that this issue has now been resolved.