Open cmungall opened 5 years ago
Example:
Also, something odd: the definitions differ but the definition source is the same. Surely if a definition is changed substantially, the definition source must change too.
OK, looking only at logical axioms, I see that more than 1k classes have at least one logical axiom changed. Some of these changes are not meaningful and may represent redundancies in one or the other of the hierarchies, but the majority seem meaningful; see attached.
Note that a number of IDs seem to have disappeared from SO_refactored but are still present in MSO:
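For anyone wanting to reproduce a count like this, here is a minimal sketch of one way to compare logical axioms per class between two OBO files. It is not necessarily how the figure above was produced. The function names, the choice of tags treated as "logical", and the `EX:` sample stanzas are all assumptions made for illustration.

```python
# Tags treated as logical axioms here (an assumption; adjust as needed).
LOGICAL_TAGS = ("is_a", "relationship", "intersection_of", "union_of", "disjoint_from")

def logical_axioms(obo_text):
    """Map each [Term] id to the set of its logical tag-value lines."""
    axioms = {}
    current = None
    in_term = False
    for raw in obo_text.splitlines():
        # Drop trailing "! label" comments so label edits don't count as changes.
        line = raw.split(" ! ")[0].strip()
        if line.startswith("["):
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id:"):
            current = line[3:].strip()
            axioms[current] = set()
        elif in_term and current and ":" in line:
            tag = line.split(":", 1)[0]
            if tag in LOGICAL_TAGS:
                axioms[current].add(line)
    return axioms

def changed_classes(old_text, new_text):
    """Ids present in both files whose logical axioms differ."""
    old, new = logical_axioms(old_text), logical_axioms(new_text)
    return sorted(t for t in old.keys() & new.keys() if old[t] != new[t])

# Made-up stanzas with placeholder EX: ids, for illustration only.
OLD = """
[Term]
id: EX:0000001
name: allele
is_a: EX:0000002 ! variant_collection

[Term]
id: EX:0000004
name: unchanged_example
is_a: EX:0000005 ! old label
"""

NEW = """
[Term]
id: EX:0000001
name: allele
is_a: EX:0000003 ! some_other_parent

[Term]
id: EX:0000004
name: unchanged_example
is_a: EX:0000005 ! new label
"""

print(changed_classes(OLD, NEW))
```

This compares only asserted axioms, so it cannot tell a redundant assertion from a meaningful change; separating those would need a reasoner.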
-id: SO:0000054 ! aneuploid
-id: SO:0000055 ! hyperploid
-id: SO:0000056 ! hypoploid
-id: SO:0000359 ! floxed
-id: SO:0000443 ! polymer attribute
-id: SO:0000628 ! chromosomal structural element
-id: SO:0000687 ! deletion junction
-id: SO:0000733 ! feature attribute
-id: SO:0000782 ! natural
-id: SO:0000784 ! foreign
-id: SO:0000814 ! rescue
-id: SO:0000817 ! wild type
-id: SO:0000831 ! gene member region
-id: SO:0000856 ! conserved
-id: SO:0000857 ! homologous
-id: SO:0000858 ! orthologous
-id: SO:0000859 ! paralogous
-id: SO:0000860 ! syntenic
-id: SO:0000976 ! cryptic
-id: SO:0001004 ! low complexity
-id: SO:0001079 ! polypeptide structural motif
-id: SO:0001234 ! mobile
-id: SO:0001409 ! biomaterial region
-id: SO:0001410 ! experimental feature
-id: SO:0001411 ! biological region
-id: SO:0001412 ! topologically defined region
-id: SO:0001761 ! variant quality
-id: SO:0001769 ! variant phenotype --> variant defined by phenotype
-id: SO:0001814 ! coding variant quality
-id: SO:0001815 ! synonymous
-id: SO:0001816 ! non synonymous
-id: SO:0001992 ! nonsynonymous variant
-id: SO:0100001 ! biochemical region of peptide
-id: SO:0100017 ! polypeptide conserved motif
-id: SO:1000160 ! unoriented insertional duplication --> insertional duplication of unspecified orientation
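A list like the one above can be regenerated with a small set difference over term IDs. The sketch below assumes both ontologies are available as OBO text. The helper names and the two sample stanzas are hypothetical, and the real files would be read from disk rather than from inline strings.

```python
def term_ids(obo_text):
    """Map each [Term] id to its name in a chunk of OBO-format text."""
    terms = {}
    in_term = False
    current_id = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Only [Term] stanzas are considered; [Typedef] etc. are skipped.
            in_term = line == "[Term]"
            current_id = None
        elif in_term and line.startswith("id:"):
            current_id = line[3:].strip()
            terms[current_id] = ""
        elif in_term and current_id and line.startswith("name:"):
            terms[current_id] = line[5:].strip()
    return terms

def missing_ids(reference_text, candidate_text):
    """Ids found in the reference file but absent from the candidate file."""
    ref = term_ids(reference_text)
    cand = term_ids(candidate_text)
    return [f"-id: {i} ! {ref[i]}" for i in sorted(set(ref) - set(cand))]

# Tiny inline samples standing in for the MSO and SO_refactored files.
MSO_SAMPLE = """
[Term]
id: SO:0000054
name: aneuploid

[Term]
id: SO:0000110
name: sequence_feature
"""

SO_REFACTORED_SAMPLE = """
[Term]
id: SO:0000110
name: sequence_feature
"""

for row in missing_ids(MSO_SAMPLE, SO_REFACTORED_SAMPLE):
    print(row)
```

Renamed terms (the `-->` rows above) would need a second pass comparing names for IDs present in both files.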
Re allele, the logical axioms in the refactored MSO/SO were created to properly place the allele class in the new upper-level structuring. I'd argue that asserting allele as a subclass of variant_collection (as it is in the current public SO) is erroneous--even the original natural language definition asserts that it's one of a set of coexisting sequence variants of a gene.
That being said, we do realize that there's more work still to be done, including automatically fixing IDs, synonyms, and natural-language definitions, about which I'm talking with @msinclair2. I'm also going to start another manual review this week to look for errors to be manually fixed.
All the ID annotations have been automatically fixed.
Re classes in the MSO but not SO_refactored: Most of those listed above are SDCs (specifically dependent continuants). As I understand BFO, dependent continuants (DCs) can't themselves bear DCs, so there's no need for these SDCs in the SO, as the SO sequence entity classes are GDCs (generically dependent continuants).
There are a few sequence entity classes that I recommended for obsoletion, as I thought they were difficult to situate in the refactored upper-level structuring, which I've discussed with Karen and Michael. There may be a few other classes that were inadvertently dropped.
By SDC, do you mean the quality branch? I can follow your reasoning from a strict BFO point of view, but you can't just toss out classes that are in use by many in the SO community because of BFO.
On Tue, Feb 26, 2019, 02:19 mikebada wrote: (the two paragraphs above, beginning "Re classes in the MSO but not SO_refactored", quoted in full)
Yes, I meant the qualities and realizable entities. I talked about these with Karen a while ago, who as I recall said that these were mostly created for the formal definitions and not used by the sequence annotators. That being said, another item on the list of remaining tasks is to make sure that all classes that have actually been used in annotations are accounted for.
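One way to approach that check, sketched under the assumption that the annotations in question are GFF3 files (whose third column holds the feature type): collect the types actually used and report any that the refactored ontology no longer provides. The function names and sample data are hypothetical, and a real check would also need to handle synonyms and accession-style types.

```python
def gff3_types(gff3_text):
    """Collect the distinct values of the type column (column 3) of a GFF3 file."""
    types = set()
    for line in gff3_text.splitlines():
        # Skip blank lines, directives ("##...") and comments ("#...").
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) >= 3:
            types.add(cols[2])
    return types

def unaccounted_types(gff3_text, ontology_names):
    """Types used in the annotations that are not among the ontology's term names."""
    return sorted(gff3_types(gff3_text) - set(ontology_names))

# Made-up annotation rows for illustration only.
SAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "\t".join(["chr1", "example", "gene", "100", "900", ".", "+", ".", "ID=gene1"]),
    "\t".join(["chr1", "example", "biological_region", "100", "200", ".", "+", ".", "ID=r1"]),
])

# Stand-in for the set of term names in the refactored ontology.
REFACTORED_NAMES = {"gene", "mRNA"}

print(unaccounted_types(SAMPLE_GFF3, REFACTORED_NAMES))
```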
I assume SO_refactored.owl/obo is the output of the compilation step (see #12)
If so, then it differs substantially from the current released version of SO. Some of the differences are technical and probably easy to fix, e.g. missing axiom annotations on synonyms. Others involve large changes to the hierarchy. Are all of these intentional, or are some bugs? What is the process for evaluating this and announcing any changes to the community?