SuLab / GeneWikiCentral

GeneWiki Organization

only create aliases for EXACT synonyms #130

Open andrewsu opened 4 years ago

andrewsu commented 4 years ago

per this comment:

The alias field should only contain exact synonyms, I hope we agree. GO's synonym field also contains NARROW, BROAD, and RELATED ones, besides EXACT. Unfortunately the bot adds all of them, also Wikipedia people added more unrelated stuff. I will now start purging all GO items of any alias that is not an EXACT GO synonym. But I ask you to please STOP re-adding any inexact ones again. --SCIdude (talk) 07:15, 11 December 2019 (UTC)

Seems like a good suggestion...

Confirmed that the different types of aliases are available in the source OWL file we use. For example, from the go.owl file:

    <!-- http://purl.obolibrary.org/obo/GO_0000390 -->

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0000390">
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0032988"/>
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0000398"/>
            </owl:Restriction>
        </rdfs:subClassOf>
        <obo:IAO_0000115 rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Disassembly of a spliceosomal complex with the ATP-dependent release of the product RNAs, one of which is composed of the joined exons. In cis splicing, the other product is the excised sequence, often a single intron, in a lariat structure.</obo:IAO_0000115>
        <oboInOwl:hasAlternativeId rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GO:0000391</oboInOwl:hasAlternativeId>
        <oboInOwl:hasAlternativeId rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GO:0000392</oboInOwl:hasAlternativeId>
        <oboInOwl:hasBroadSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">spliceosome disassembly</oboInOwl:hasBroadSynonym>
        <oboInOwl:hasExactSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">spliceosome complex disassembly</oboInOwl:hasExactSynonym>
        <oboInOwl:hasNarrowSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">U12-type spliceosome disassembly</oboInOwl:hasNarrowSynonym>
        <oboInOwl:hasNarrowSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">U2-type spliceosome disassembly</oboInOwl:hasNarrowSynonym>
        <oboInOwl:hasOBONamespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">biological_process</oboInOwl:hasOBONamespace>
        <oboInOwl:id rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GO:0000390</oboInOwl:id>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">spliceosomal complex disassembly</rdfs:label>
    </owl:Class>
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/GO_0000390"/>
        <owl:annotatedProperty rdf:resource="http://purl.obolibrary.org/obo/IAO_0000115"/>
        <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Disassembly of a spliceosomal complex with the ATP-dependent release of the product RNAs, one of which is composed of the joined exons. In cis splicing, the other product is the excised sequence, often a single intron, in a lariat structure.</owl:annotatedTarget>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GOC:krc</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ISBN:0879695897</oboInOwl:hasDbXref>
    </owl:Axiom>
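
A minimal sketch of how the EXACT synonyms could be separated from the others when reading the OWL file. This is not the bot's actual code: the `exact_aliases` helper and the trimmed `SAMPLE` document are hypothetical, and only Python's standard `xml.etree.ElementTree` is used. The sample keeps just the synonym, id, and label elements from the GO_0000390 class above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Standard namespace URIs used by go.owl.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
}

# Trimmed copy of the GO_0000390 class, wrapped in an rdf:RDF root
# so that it parses on its own (hypothetical test input).
SAMPLE = """\
<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0000390">
    <oboInOwl:hasBroadSynonym>spliceosome disassembly</oboInOwl:hasBroadSynonym>
    <oboInOwl:hasExactSynonym>spliceosome complex disassembly</oboInOwl:hasExactSynonym>
    <oboInOwl:hasNarrowSynonym>U12-type spliceosome disassembly</oboInOwl:hasNarrowSynonym>
    <oboInOwl:hasNarrowSynonym>U2-type spliceosome disassembly</oboInOwl:hasNarrowSynonym>
    <oboInOwl:id>GO:0000390</oboInOwl:id>
    <rdfs:label>spliceosomal complex disassembly</rdfs:label>
  </owl:Class>
</rdf:RDF>
"""


def exact_aliases(owl_xml: str) -> Dict[str, List[str]]:
    """Map each GO id to its hasExactSynonym values only.

    Broad, narrow, and related synonyms are ignored on purpose.
    """
    root = ET.fromstring(owl_xml)
    result: Dict[str, List[str]] = {}
    for cls in root.findall("owl:Class", NS):
        go_id = cls.findtext("oboInOwl:id", namespaces=NS)
        if go_id is None:
            continue  # skip classes without an oboInOwl:id
        result[go_id] = [
            el.text
            for el in cls.findall("oboInOwl:hasExactSynonym", NS)
            if el.text
        ]
    return result


print(exact_aliases(SAMPLE))
```

For the sample above, only "spliceosome complex disassembly" is kept as an alias candidate; the broad and narrow synonyms are dropped.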

rwst commented 4 years ago

Since I don't maintain a bot, I tried to remove the inexact aliases using QuickStatements (QS), but QS cannot remove aliases. Would it be possible to include the removal in one of your normal bot runs?

rwst commented 4 years ago

I now maintain a bot and am looking into removing the inexact aliases.
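
As a rough sketch of the removal step, assuming the bot already has an item's current aliases and the EXACT synonyms from GO: the `aliases_to_remove` helper is hypothetical, the alias list in the example is made up for illustration rather than read from Wikidata, and writing the change back to Wikidata is not shown.

```python
from typing import Iterable, List


def aliases_to_remove(current_aliases: Iterable[str],
                      exact_synonyms: Iterable[str]) -> List[str]:
    """Return the aliases that are not EXACT GO synonyms.

    Comparison is a plain string match after trimming whitespace
    (an assumption; a real bot may want a looser comparison).
    """
    exact = {s.strip() for s in exact_synonyms}
    return [a for a in current_aliases if a.strip() not in exact]


# Hypothetical alias list for the GO:0000390 item.
current = [
    "spliceosome disassembly",          # BROAD in GO
    "spliceosome complex disassembly",  # EXACT in GO
    "U2-type spliceosome disassembly",  # NARROW in GO
]
to_purge = aliases_to_remove(current, ["spliceosome complex disassembly"])
print(to_purge)
```

Only the aliases returned here would be candidates for removal; the EXACT synonym stays on the item.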