geneontology / noctua-models

This is the data repository for the models created and edited with the Noctua tool stack for GO.
http://noctua.geneontology.org/
Creative Commons Attribution 4.0 International

models with malformed taxon IRIs #277

Closed by balhoff 1 month ago

balhoff commented 1 month ago

I came across some malformed taxon IRIs while working on https://github.com/geneontology/minerva/issues/545. For example, model http://noctua.geneontology.org/editor/graph/gomodel:MGI_MGI_2429397 contains taxon IRIs like <taxon:196620> and <taxon:282459>. These are not CURIEs: inside angle brackets, Turtle reads them as full IRIs, so the taxon prefix is being used as the IRI scheme. They should be, e.g., <http://purl.obolibrary.org/obo/NCBITaxon_196620>.

We should query for any others like this and do a bulk repair.
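
The intended mapping can be sketched on a single line of Turtle. This is only an illustration: the function name fix_taxon_iri is hypothetical, and it assumes every malformed IRI has the form <taxon:NUMBER>, as in the examples above.

```shell
# Hypothetical helper: rewrite scheme-style taxon IRIs in one line of Turtle.
fix_taxon_iri() {
    # '|' is the sed delimiter, so the slashes in the URL need no escaping.
    printf '%s\n' "$1" |
        sed -E 's|<taxon:([0-9]+)>|<http://purl.obolibrary.org/obo/NCBITaxon_\1>|g'
}

# Example call:
# fix_taxon_iri 'a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:196620> ;'
```

Anything that does not match <taxon:NUMBER> is passed through unchanged.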

balhoff commented 1 month ago

grep -R '<taxon:[0-9]' models
models/MGI_MGI_1098776.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:588858> ;
models/MGI_MGI_2429397.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:282459> ;
models/MGI_MGI_2429397.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:196620> ;
models/MGI_MGI_2429397.ttl:<taxon:196620> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_2429397.ttl:<taxon:282459> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_96395.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96395.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:383379> ;
models/MGI_MGI_87883.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_88232.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96540.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:383379> ;
models/MGI_MGI_95499.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_95499.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_1346060.ttl:<taxon:1280> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_1346060.ttl:<taxon:282459> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_1346060.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_1346060.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_1346060.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:282459> ;
models/MGI_MGI_107899.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_107656.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:383379> ;
models/MGI_MGI_96446.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_104798.ttl:<taxon:1280> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_104798.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_108005.ttl:<taxon:1280> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_108005.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_2152213.ttl:<taxon:1280> a <http://www.w3.org/2002/07/owl#Class> .
models/MGI_MGI_2152213.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96539.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:383379> ;
models/MGI_MGI_96923.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96923.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_2679229.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5085> ;
models/MGI_MGI_2679229.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_2679229.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_2679229.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_2686979.ttl: a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_104797.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96924.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96924.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_98239.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_98239.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1314> ;
models/MGI_MGI_97283.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_97283.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_97283.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_97283.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_97283.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_97137.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_97137.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_97137.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5476> ;
models/MGI_MGI_108443.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:471876> ;
models/MGI_MGI_108443.ttl:  a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_96617.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_88563.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_88563.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:1280> ;
models/MGI_MGI_88563.ttl:   a <http://www.w3.org/2002/07/owl#NamedIndividual> , <taxon:5085> ;
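
To see which taxon IDs are involved and how often each occurs, the matches above could be tallied. A minimal sketch, assuming the models live under a directory passed as the first argument (defaulting to models); the function name count_malformed_taxa is hypothetical.

```shell
# Hypothetical helper: count occurrences of each scheme-style taxon IRI.
count_malformed_taxa() {
    dir="${1:-models}"
    # -h drops file names, -o prints only the matched IRI, -E enables '+'.
    grep -RhoE '<taxon:[0-9]+>' "$dir" |
        sort |
        uniq -c |      # prefix each distinct IRI with its count
        sort -rn       # most frequent first
}

# Example call:
# count_malformed_taxa models
```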
kltm commented 1 month ago

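From @balhoff: this may be fixed with the following (BSD/macOS sed syntax, where -i takes a separate, here empty, backup-suffix argument):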

find . -type f -name "*.ttl" -exec sed -i '' -E 's/<taxon:([[:digit:]]+)>/<http:\/\/purl.obolibrary.org\/obo\/NCBITaxon_\1>/g' {} +

kltm commented 1 month ago

Trying the GNU sed form, where -i takes no separate argument:

find . -type f -name "*.ttl" -exec sed -E -i "s/<taxon:([[:digit:]]+)>/<http:\/\/purl.obolibrary.org\/obo\/NCBITaxon_\1>/g" {} +
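
The two commands above differ only in how -i is spelled for BSD versus GNU sed. A variant that should run on both is sketched below; it is not what was actually run here. It assumes that an attached backup suffix (-i.bak) is acceptable, and the function name repair_taxon_iris is hypothetical.

```shell
# Hypothetical helper: rewrite <taxon:N> IRIs in every .ttl file under a directory.
repair_taxon_iris() {
    dir="${1:-.}"
    # -i.bak (suffix attached to -i) is accepted by both GNU and BSD sed;
    # each edited file keeps its original content in FILE.bak.
    find "$dir" -type f -name '*.ttl' -exec \
        sed -i.bak -E 's|<taxon:([0-9]+)>|<http://purl.obolibrary.org/obo/NCBITaxon_\1>|g' {} +
}

# Example call, from the repository root:
# repair_taxon_iris models
```

The .bak files are created for every .ttl file that sed visits, changed or not, so they would need to be deleted after review rather than committed.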
kltm commented 1 month ago

Models that will be repaired:

    modified:   MGI_MGI_104797.ttl
    modified:   MGI_MGI_104798.ttl
    modified:   MGI_MGI_107656.ttl
    modified:   MGI_MGI_107899.ttl
    modified:   MGI_MGI_108005.ttl
    modified:   MGI_MGI_108443.ttl
    modified:   MGI_MGI_1098776.ttl
    modified:   MGI_MGI_1346060.ttl
    modified:   MGI_MGI_2152213.ttl
    modified:   MGI_MGI_2429397.ttl
    modified:   MGI_MGI_2679229.ttl
    modified:   MGI_MGI_2686979.ttl
    modified:   MGI_MGI_87883.ttl
    modified:   MGI_MGI_88232.ttl
    modified:   MGI_MGI_88563.ttl
    modified:   MGI_MGI_95499.ttl
    modified:   MGI_MGI_96395.ttl
    modified:   MGI_MGI_96446.ttl
    modified:   MGI_MGI_96539.ttl
    modified:   MGI_MGI_96540.ttl
    modified:   MGI_MGI_96617.ttl
    modified:   MGI_MGI_96923.ttl
    modified:   MGI_MGI_96924.ttl
    modified:   MGI_MGI_97137.ttl
    modified:   MGI_MGI_97283.ttl
    modified:   MGI_MGI_98239.ttl
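
After a repair like this, a quick check that no scheme-style taxon IRIs remain could look like the following. This is a sketch under the same assumption about the <taxon:NUMBER> pattern; the function name check_no_malformed_taxa is hypothetical.

```shell
# Hypothetical helper: succeed only if no <taxon:N> IRIs remain in .ttl files.
check_no_malformed_taxa() {
    dir="${1:-models}"
    # -q suppresses output; --include limits the scan to Turtle files.
    if grep -RqE --include='*.ttl' '<taxon:[0-9]+>' "$dir"; then
        echo "malformed taxon IRIs remain in $dir" >&2
        return 1
    fi
    echo "no malformed taxon IRIs in $dir"
}

# Example call:
# check_no_malformed_taxa models
```

Because it returns a non-zero status when matches remain, the function could also serve as a guard in a pre-commit or CI step.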
kltm commented 1 month ago

At the very least, committed above.