The current definition of the 'attribute' class (and its parent) in the Biolink Model is the following:
    abstract entity:
      description: >-
        Any thing that is not a process or a physical mass-bearing entity
    attribute:
      subclass_of: PATO:0000001
      mappings:
        - SIO:000614
      is_a: abstract entity
      mixins:
        - ontology class
      description: >-
        A property or characteristic of an entity.
        For example, an apple may have properties such as color, shape, age, crispiness.
        An environmental sample may have attributes such as depth, lat, long, material.
      slots:
        - has attribute type
        - has quantitative value
        - has qualitative value
      in_subset:
        - samples
There are a number of concerns with this definition. This issue reviews them to guide possible Biolink Model revisions of the handling of the attribute class.
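One concern is the semantic anchoring of the class with subclass_of: PATO:0000001. This is discussed in Biolink issue https://github.com/biolink/biolink-model/issues/501, so we won't elaborate on it here.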
Inheritance of 'attribute' semantics from ontology class
First, 'ontology class' is cited as a mixin. We won't discuss the general confusion about mixins, which is under review in Biolink issue https://github.com/biolink/biolink-model/issues/333 and elsewhere; we just note that the mixin is one route by which ontology class semantics are injected into the attribute class.
The slot "has attribute type" also has a range of 'ontology class', which seems to duplicate the semantic intent of the mixin.
Attribute identification and labelling
When an attribute is to be used in a practical context (e.g. in a knowledge graph KGX file, Neo4j or otherwise), one needs to ask how it is to be distinctly identified and given a (human readable) name.
We note that the parent class abstract entity has only a description and no slots. The intent of the abstract entity parent is to avoid injecting the full semantics of named thing into attribute. However, this leaves attribute bare of any identifier or label whatsoever.
Of course, ontology class (noted above) may inject an id, name and category into attribute by way of inheritance from named thing, although this is, ironically, at odds with the intent of inheriting from abstract entity.
Other attribute slots
As noted above, the has attribute type slot has overlapping semantics with the ontology class mixin but is, by default, required: false. The mixin is, in some sense, mandatory, although its slots (see previous section) also seem to default to required: false.
The two other slots:
- has quantitative value
- has qualitative value
seem quite useful as (optional) bindings for qualitative or quantitative attribute values. One could also use the ontology term bound by has attribute type on its own, as a simple boolean assertion of the given term concept as a tag.
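To make the three usage patterns concrete, here is a minimal Python sketch. It is not Biolink-generated code: the `Attribute` dataclass, the `is_tag` helper and the `EX:` CURIEs are all hypothetical, and the value slots are simplified to a bare number and a bare string. Note that the sketch, like the class definition at the top of this issue, has no id or name field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribute:
    """Hypothetical stand-in for the Biolink 'attribute' class (assumption, not generated code)."""

    has_attribute_type: Optional[str] = None        # CURIE of an ontology term
    has_quantitative_value: Optional[float] = None  # simplified to a bare number
    has_qualitative_value: Optional[str] = None     # simplified to a CURIE or label

    def is_tag(self) -> bool:
        """True when only the type term is set, i.e. a bare boolean assertion."""
        return (
            self.has_attribute_type is not None
            and self.has_quantitative_value is None
            and self.has_qualitative_value is None
        )


# Placeholder CURIEs (EX: is not a real prefix), one instance per usage pattern.
depth = Attribute(has_attribute_type="EX:depth", has_quantitative_value=12.5)
color = Attribute(has_attribute_type="EX:color", has_qualitative_value="EX:red")
frozen = Attribute(has_attribute_type="EX:frozen")
```

Since all three slots are optional, nothing in the sketch prevents an attribute with no content at all, which is the same looseness the required: false defaults allow in the model.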
In comparison, though, the TRAPI specification has a broader data model for 'Attribute' (OpenAPI 3 syntax, not Biolink YAML):
    Attribute:
      type: object
      description: Generic attribute for a node
      properties:
        name:
          type: string
          description: >-
            Human-readable name or label for the attribute. Should be the name
            of the semantic type term.
          example: PubMed Identifier
        value:
          example: 32529952
          description: >-
            Value of the attribute. May be any data type, including a list.
        type:
          type: string
          description: >-
            CURIE of the semantic type of the attribute, from the EDAM ontology
            if possible. If a suitable identifier does not exist, enter a
            descriptive phrase here and submit the new type for consideration
            by the appropriate authority.
          example: EDAM:data_1187
        url:
          type: string
          description: >-
            Human-consumable URL to link out and read about the attribute (not
            the node).
          example: https://pubmed.ncbi.nlm.nih.gov/32529952
        source:
          type: string
          description: Source of the attribute, as a CURIE prefix.
          example: UniProtKB
      required:
        - type
        - value
      additionalProperties: false
It may be worth formally comparing and possibly aligning TRAPI with the Biolink Model (or rather, vice versa?).
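As a starting point for such a comparison, the following Python sketch splits a TRAPI Attribute into the Biolink slots it could plausibly fill and the fields left over. The mapping is purely an assumption made for discussion (type to has attribute type, numeric value to has quantitative value, any other value to has qualitative value); the function name `trapi_to_biolink` and the constant `UNMAPPED_TRAPI_FIELDS` are hypothetical.

```python
from typing import Any, Dict, Tuple

# TRAPI Attribute properties with no obvious slot on the Biolink 'attribute' class.
UNMAPPED_TRAPI_FIELDS = ("name", "url", "source")


def trapi_to_biolink(trapi_attr: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a TRAPI Attribute into (Biolink-like slots, leftover fields).

    Hypothetical mapping, for discussion only:
      type  -> has attribute type
      value -> has quantitative value (numbers) or has qualitative value (anything else)
    """
    value = trapi_attr["value"]  # 'type' and 'value' are the two required TRAPI fields
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    value_slot = "has quantitative value" if is_number else "has qualitative value"
    mapped = {"has attribute type": trapi_attr["type"], value_slot: value}
    leftover = {k: trapi_attr[k] for k in UNMAPPED_TRAPI_FIELDS if k in trapi_attr}
    return mapped, leftover


# The example values from the TRAPI schema above.
mapped, leftover = trapi_to_biolink({
    "name": "PubMed Identifier",
    "value": 32529952,
    "type": "EDAM:data_1187",
    "url": "https://pubmed.ncbi.nlm.nih.gov/32529952",
    "source": "UniProtKB",
})
```

Under this mapping, name, url and source end up in `leftover`, and the PubMed identifier lands in has quantitative value only because it happens to be a number. Both outcomes suggest the two models do not align without some revision on one side or the other.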
Use of the attribute class
It seems that all instances of the attribute class are linked upstream to the entity they describe by the slot:
    has attribute:
      description: >-
        connects any named thing to an attribute
      range: attribute
      multivalued: true
Semantically, this is fine, but the question remains of how an attribute (as a discrete element of annotation on an entity) is to be identified and labelled (see previous sections).
In addition, the slot has no domain restriction, so in principle it could annotate any class in the model; however, only a single class in the model explicitly lists has attribute as a slot, namely:
    material sample:
      is_a: named thing
      ...
      slots:
        - has attribute
Given that the default status of a defined slot is required: false, would there be any harm in formally listing has attribute as an additional slot of the named thing category definition? Then, it could be removed from material sample but implicitly available to all other categories, for ad hoc node properties.
Note that TRAPI allows (arrays of) Attribute instances for both nodes and edges. This would be comparable to also formally listing has attribute as a slot under the association class.
One idea to consider here is to specify a common abstract parent class for both named thing and association (not sure what to call it) in which to declare common slots such as has attribute (and possibly id and name), to be shared by inheritance, something like:
    concept or relationship:
      slots:
        - id
        - name
        - has attribute
      slot_usage:
        id:
          required: true
    named thing:
      is_a: concept or relationship
      # note that 'id' and 'name' could be removed from 'named thing'
      # since they would now come in by parental inheritance
      slots:
        - category
      slot_usage:
        name:
          required: true
      ...
    association:
      is_a: concept or relationship
      # name slot isn't mandatory here...
      ...
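The effect of the common parent can be sketched with Python dataclasses. This is only an illustration of the inheritance pattern, not how Biolink classes are actually generated: the class names mirror the YAML above, attributes are reduced to plain dicts, the `EX:` identifiers are placeholders, and the slot_usage override on name is imitated with a `__post_init__` check.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptOrRelationship:
    """Placeholder parent; the name is as tentative here as in the YAML above."""

    id: str                                    # required: true
    name: Optional[str] = None                 # optional at this level
    has_attribute: List[dict] = field(default_factory=list)


@dataclass
class NamedThing(ConceptOrRelationship):
    category: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Imitates the slot_usage override: 'name' becomes required for named things.
        if not self.name:
            raise ValueError("named thing requires a name")


@dataclass
class Association(ConceptOrRelationship):
    """Inherits id, name and has_attribute; name stays optional."""


node = NamedThing(id="EX:sample1", name="sample 1",
                  has_attribute=[{"has attribute type": "EX:depth"}])
edge = Association(id="EX:assoc1",
                   has_attribute=[{"has attribute type": "EX:confidence"}])
```

Both nodes and edges now carry attributes through a single declaration, at the cost of inserting a new class above named thing and association in the is_a hierarchy.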
An alternative, perhaps better, approach would be to use a mixin to achieve the same effect of injecting shared semantics across the various model components discussed above.
    attribute mixin:
      mixin: true
      abstract: true
      slots:
        - has attribute
with
    named thing:
      ...
      mixins:
        - attribute mixin
and
    association:
      ...
      mixins:
        - attribute mixin
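In Python terms, the mixin variant might look like the sketch below. Everything in it is hypothetical: the `AttributeMixin` class, its `add_attribute` helper, the heavily simplified constructors for named thing and association, and the `EX:` identifiers. It only illustrates that the shared slot can be contributed without a common parent.

```python
from typing import Any, Dict, List


class AttributeMixin:
    """Hypothetical counterpart of 'attribute mixin': contributes only has_attribute."""

    has_attribute: List[Dict[str, Any]]  # annotation only; no class-level value is created

    def add_attribute(self, attribute: Dict[str, Any]) -> None:
        # Create the list lazily so the mixin needs no __init__ of its own.
        if not hasattr(self, "has_attribute"):
            self.has_attribute = []
        self.has_attribute.append(attribute)


class NamedThing(AttributeMixin):
    def __init__(self, id: str, name: str) -> None:
        self.id = id
        self.name = name


class Association(AttributeMixin):
    # Simplified edge: the real association class has more slots than these three.
    def __init__(self, subject: str, predicate: str, obj: str) -> None:
        self.subject = subject
        self.predicate = predicate
        self.object = obj


node = NamedThing("EX:sample1", "sample 1")
node.add_attribute({"has attribute type": "EX:depth"})

edge = Association("EX:sample1", "EX:related_to", "EX:site1")
edge.add_attribute({"has attribute type": "EX:confidence"})
```

Compared with a common parent class, the mixin leaves the existing is_a hierarchy untouched, which may be the less disruptive change to the model.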
Whatever approach is used, sharing attributes across NamedThing and Association classes could also facilitate efforts to align the Biolink Model with TRAPI, plus provide a helpful mechanism for more extensive modelling of knowledge graphs with, say, evidence, provenance and confidence annotations.
Note: See PR https://github.com/biolink/biolink-model/pull/539 for a significant resolution of this issue...