A project supporting the DRAO application ontology, a hierarchy of specific research domains and descriptors which imports subsets of terms from over 50 publicly-available ontologies.
What are you changing?
Sequence and its children are currently EDAM terms within the Data hierarchy as shown here: ![sequence](https://user-images.githubusercontent.com/143586/88674228-76593c00-d0e1-11ea-8f81-81b1d962b2de.png)
We plan to refactor these classes and move them to a different place in the hierarchy. Details of each mapping are in the Mapping section below. SO region is already present in DRAO, but it is not visible to FAIRsharing users as it does not have the inSubset="FAIRsharing" flag set. Here is the current view of that portion of DRAO: ![soseqfeat](https://user-images.githubusercontent.com/143586/88675346-c258b080-d0e2-11ea-9271-5da3ba67892f.png)
So we will need to:
- make region visible to FAIRsharing
- remove region as a label prior to releasing the ontology, instead using the FAIRsharing label "Sequence"
- adjust the hierarchy as follows
The overall hierarchy will be as follows, with detailed explanations of why each position/IRI was chosen further down.
- region (Sequence) - already in DRAO (will become visible)
  - biological_region - already in DRAO (not visible)
    - polypeptide_region - already in DRAO (will become visible, replace Protein sequence which has a FAIRsharing label of Amino acid sequence, and gain FAIRsharing label Amino acid sequence. We will manually add the alternative term Protein sequence to preserve expected annotation)
    - Nucleic acid sequence - keep EDAM term, move hierarchy only
      - DNA sequence - keep EDAM term, move hierarchy only
      - RNA sequence - keep EDAM term, move hierarchy only
Why are you suggesting this change?
As you can see, the sequence_feature hierarchy is much more detailed than what we have for EDAM Sequence, and so it makes ontological sense to align the outlier EDAM Sequence with the existing SO hierarchy, and pull everything together so that the terms are grouped nicely for searching FAIRsharing.
This ticket is also related to #61, which describes refactoring variant to align with the SO hierarchy already in place.
Sequence / region
Mapping
Label
Recommendation: Sequence, and delete region label
Reasoning: This is the label currently used in DRAO. Please note that region is the name SO uses for sequence (indeed, sequence is one of its synonyms). However, as DRAO is a more generic application ontology, we cannot use region, as it could imply a geographic region, an astronomical region or any other kind of region. Therefore the region label should be removed when the release files are created, by adding http://purl.obolibrary.org/obo/SO_0000001 to filter-labels.txt.
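As a rough illustration of the label filtering described above, the sketch below drops the imported label of any class whose IRI is listed in a filter file, leaving only the FAIRsharing label. The DRAO release tooling is not shown in this ticket, so the function names (`load_filter_iris`, `release_labels`), the one-IRI-per-line file format, and the in-memory label maps are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set

SO_REGION = "http://purl.obolibrary.org/obo/SO_0000001"
SO_POLYPEPTIDE_REGION = "http://purl.obolibrary.org/obo/SO_0000839"


def load_filter_iris(lines: Iterable[str]) -> Set[str]:
    """Parse filter-labels.txt content (assumed format: one IRI per line)."""
    iris: Set[str] = set()
    for line in lines:
        entry = line.strip()
        if entry:  # skip blank lines
            iris.add(entry)
    return iris


def release_labels(
    imported: Dict[str, str],
    fairsharing: Dict[str, str],
    filtered: Set[str],
) -> Dict[str, List[str]]:
    """Return the labels each class would carry in a release (hypothetical logic).

    imported:    IRI -> label from the source ontology (e.g. SO)
    fairsharing: IRI -> FAIRsharing label
    filtered:    IRIs whose imported label must be dropped
    """
    result: Dict[str, List[str]] = {}
    for iri in sorted(set(imported) | set(fairsharing)):
        labels: List[str] = []
        if iri in imported and iri not in filtered:
            labels.append(imported[iri])
        if iri in fairsharing:
            labels.append(fairsharing[iri])
        result[iri] = labels
    return result


# Example data taken from this ticket; only SO_0000001 is in the filter list.
imported = {SO_REGION: "region", SO_POLYPEPTIDE_REGION: "polypeptide_region"}
fairsharing = {SO_REGION: "Sequence", SO_POLYPEPTIDE_REGION: "Amino acid sequence"}
filtered = load_filter_iris([SO_REGION + "\n", "\n"])

labels = release_labels(imported, fairsharing, filtered)
print(labels[SO_REGION])
```

With this example data, the class for SO_0000001 keeps only the label "Sequence", which is the outcome the recommendation aims for.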
IRI
Recommendation: http://purl.obolibrary.org/obo/SO_0000001
Reasoning: Because we already use SO for most of our sequence-related terms.
Definition
Happy with SO definition.
Hierarchy
Recommendation: Retain existing hierarchy
polypeptide_region / Protein sequence / Amino acid sequence
Mapping
Label
Recommendation: Amino acid sequence
Reasoning: This is the label currently used in DRAO.
Please note we will retain the EDAM label Protein sequence via DRAO-manual.owl so that users will continue to be able to get to this term via that string.
IRI
Recommendation: http://purl.obolibrary.org/obo/SO_0000839
Reasoning: We already use SO for most of our sequence-related terms, and this class is already in DRAO (just invisible to FAIRsharing), so this is the simplest way to refactor the EDAM class.
Definition
Happy with SO definition, though it is a little formal.
Hierarchy
Recommendation: Retain existing hierarchy
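To illustrate why retaining Protein sequence as an alternative label matters, here is a minimal sketch of a term lookup that matches either the primary label or any alternative label. The record layout and the `find_term` function are hypothetical; how FAIRsharing actually searches labels is not described in this ticket.

```python
from typing import Dict, Optional

SO_POLYPEPTIDE_REGION = "http://purl.obolibrary.org/obo/SO_0000839"

# Hypothetical record: the FAIRsharing label plus the manually added alternative.
TERMS: Dict[str, dict] = {
    SO_POLYPEPTIDE_REGION: {
        "label": "Amino acid sequence",
        "alt_labels": ["Protein sequence"],
    },
}


def find_term(query: str, terms: Dict[str, dict]) -> Optional[str]:
    """Return the IRI whose label or alternative label equals the query (case-insensitive)."""
    wanted = query.strip().lower()
    for iri, record in terms.items():
        names = [record["label"], *record["alt_labels"]]
        if any(name.lower() == wanted for name in names):
            return iri
    return None


# Both strings resolve to the same SO class in this sketch.
print(find_term("Protein sequence", TERMS) == find_term("Amino acid sequence", TERMS))
```

Under these assumptions, users who search for the old EDAM string still reach the refactored class.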
Nucleic acid sequence, DNA sequence, and RNA sequence
Mapping
Retain existing IRIs, definitions and labels.
Hierarchy
Recommendation: Place Nucleic acid sequence as a child of biological_region
Reasoning: There are no exact matches to these three terms in SO, and they are very useful within FAIRsharing, so we should retain them, but within the proper sequence hierarchy that we have imported from SO.
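The target placement can be sketched as a simple child-to-parent map. This is only an illustration: it keys on the term names used in this ticket rather than IRIs (the EDAM IRIs are not listed here), it assumes DNA sequence and RNA sequence stay under Nucleic acid sequence, and it assumes the retained EDAM terms remain visible to FAIRsharing. The `ancestors` and `visible_ancestors` helpers are hypothetical names.

```python
from typing import Dict, List, Optional, Set

# child -> parent; region is the root of this sketch only
# (in SO it sits under sequence_feature).
PARENT: Dict[str, Optional[str]] = {
    "region": None,
    "biological_region": "region",
    "polypeptide_region": "biological_region",
    "Nucleic acid sequence": "biological_region",
    "DNA sequence": "Nucleic acid sequence",
    "RNA sequence": "Nucleic acid sequence",
}

# Terms assumed to carry the FAIRsharing subset flag after the refactor;
# biological_region stays in DRAO but is not visible.
VISIBLE: Set[str] = set(PARENT) - {"biological_region"}


def ancestors(term: str) -> List[str]:
    """Walk up the parent map, nearest ancestor first."""
    chain: List[str] = []
    current = PARENT[term]
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def visible_ancestors(term: str) -> List[str]:
    """Ancestors as a FAIRsharing user would see them (invisible classes skipped)."""
    return [name for name in ancestors(term) if name in VISIBLE]


print(ancestors("DNA sequence"))
print(visible_ancestors("DNA sequence"))
```

In this sketch the full ancestor chain of DNA sequence passes through biological_region, while the FAIRsharing view skips it and shows Nucleic acid sequence directly under region (labelled Sequence).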