geneontology / noctua-models

This is the data repository for the models created and edited with the Noctua tool stack for GO.
Creative Commons Attribution 4.0 International

models with malformed taxon IRIs #277

Closed balhoff closed 1 month ago

balhoff commented 1 month ago

I came across some malformed taxon IRIs while working on For example, in model there are taxon IRIs like <taxon:196620> and <taxon:282459>. These are not CURIEs; the taxon prefix is being used as the IRI scheme. They should be, e.g., <>.

We should query for any others like this and do a bulk repair.

balhoff commented 1 month ago

grep -R '<taxon:[0-9]' models
models/MGI_MGI_1098776.ttl: a <> , <taxon:588858> ;
models/MGI_MGI_2429397.ttl: a <> , <taxon:282459> ;
models/MGI_MGI_2429397.ttl: a <> , <taxon:196620> ;
models/MGI_MGI_2429397.ttl:<taxon:196620> a <> .
models/MGI_MGI_2429397.ttl:<taxon:282459> a <> .
models/MGI_MGI_96395.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_96395.ttl:   a <> , <taxon:383379> ;
models/MGI_MGI_87883.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_88232.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_96540.ttl:   a <> , <taxon:383379> ;
models/MGI_MGI_95499.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_95499.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_1346060.ttl:<taxon:1280> a <> .
models/MGI_MGI_1346060.ttl:<taxon:282459> a <> .
models/MGI_MGI_1346060.ttl: a <> , <taxon:1280> ;
models/MGI_MGI_1346060.ttl: a <> , <taxon:1280> ;
models/MGI_MGI_1346060.ttl: a <> , <taxon:282459> ;
models/MGI_MGI_107899.ttl:  a <> , <taxon:1280> ;
models/MGI_MGI_107656.ttl:  a <> , <taxon:383379> ;
models/MGI_MGI_96446.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_104798.ttl:<taxon:1280> a <> .
models/MGI_MGI_104798.ttl:  a <> , <taxon:1280> ;
models/MGI_MGI_108005.ttl:<taxon:1280> a <> .
models/MGI_MGI_108005.ttl:  a <> , <taxon:1280> ;
models/MGI_MGI_2152213.ttl:<taxon:1280> a <> .
models/MGI_MGI_2152213.ttl: a <> , <taxon:1280> ;
models/MGI_MGI_96539.ttl:   a <> , <taxon:383379> ;
models/MGI_MGI_96923.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_96923.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_2679229.ttl: a <> , <taxon:5085> ;
models/MGI_MGI_2679229.ttl: a <> , <taxon:5476> ;
models/MGI_MGI_2679229.ttl: a <> , <taxon:5476> ;
models/MGI_MGI_2679229.ttl: a <> , <taxon:5476> ;
models/MGI_MGI_2686979.ttl: a <> , <taxon:1280> ;
models/MGI_MGI_104797.ttl:  a <> , <taxon:1280> ;
models/MGI_MGI_96924.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_96924.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_98239.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_98239.ttl:   a <> , <taxon:1314> ;
models/MGI_MGI_97283.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_97283.ttl:   a <> , <taxon:5476> ;
models/MGI_MGI_97283.ttl:   a <> , <taxon:5476> ;
models/MGI_MGI_97283.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_97283.ttl:   a <> , <taxon:5476> ;
models/MGI_MGI_97137.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_97137.ttl:   a <> , <taxon:5476> ;
models/MGI_MGI_97137.ttl:   a <> , <taxon:5476> ;
models/MGI_MGI_108443.ttl:  a <> , <taxon:471876> ;
models/MGI_MGI_108443.ttl:  a <> , <taxon:1280> ;
models/MGI_MGI_96617.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_88563.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_88563.ttl:   a <> , <taxon:1280> ;
models/MGI_MGI_88563.ttl:   a <> , <taxon:5085> ;
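
To see at a glance which taxon IDs are affected and how often, the matches above can be condensed into a per-ID count. This is only a sketch, assuming a POSIX shell and a grep that supports `-r`, `-o`, and `--include` (both GNU and BSD grep do); the function name `list_malformed_taxa` is made up for illustration.

```shell
# list_malformed_taxa [DIR]
# Prints one "taxon_id count" line for every <taxon:NNN> pseudo-IRI
# found in *.ttl files under DIR (default: models).
list_malformed_taxa() {
  _dir="${1:-models}"
  # -h drops file names, -o prints only the matching part of each line
  grep -rhoE --include='*.ttl' '<taxon:[0-9]+>' "$_dir" \
    | sed -E 's/<taxon:([0-9]+)>/\1/' \
    | sort -n \
    | uniq -c \
    | awk '{ print $2, $1 }'
}
```

Running `list_malformed_taxa models` from the repository root would list each numeric ID once, followed by the number of occurrences.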

kltm commented 1 month ago

From @balhoff: this may be fixed by:

find . -type f -name "*.ttl" -exec sed -i '' -E 's/<taxon:([[:digit:]]+)>/<http:\/\/\/obo\/NCBITaxon_\1>/g' {} +

kltm commented 1 month ago

find . -type f -name "*.ttl" -exec sed -E -i "s/<taxon:([[:digit:]]+)>/<http:\/\/\/obo\/NCBITaxon_\1>/g" {} +
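
The two invocations above differ only in how in-place editing is spelled: BSD/macOS sed takes `-i ''`, GNU sed takes a bare `-i`. A sketch that avoids `-i` altogether, and so should behave the same on both, is given here. The names `repair_taxon_iris` and `NCBITAXON_BASE` are illustrative, and the `example.org` base is a placeholder assumption, not the real NCBITaxon IRI base; set it to the correct value before use.

```shell
# ASSUMPTION: placeholder base IRI. Replace with the real NCBITaxon base.
NCBITAXON_BASE="${NCBITAXON_BASE:-http://example.org/obo/NCBITaxon_}"

# repair_taxon_iris [DIR]
# Rewrites <taxon:NNN> to <${NCBITAXON_BASE}NNN> in every *.ttl file
# under DIR (default: current directory) that contains a match.
repair_taxon_iris() {
  _dir="${1:-.}"
  # -l lists only the files that contain a malformed IRI,
  # so untouched models are never rewritten.
  grep -rlE --include='*.ttl' '<taxon:[0-9]+>' "$_dir" |
  while IFS= read -r _file; do
    _tmp="$_file.tmp.$$"
    # '|' as the sed delimiter avoids escaping the slashes in the IRI;
    # the base must therefore not contain '|' or '&'.
    sed -E "s|<taxon:([0-9]+)>|<${NCBITAXON_BASE}\\1>|g" "$_file" > "$_tmp" &&
      mv "$_tmp" "$_file"
    printf 'repaired: %s\n' "$_file"
  done
}
```

After a run, `grep -rlE --include='*.ttl' '<taxon:[0-9]+>' .` should print nothing, and `git status` should list only the files that contained a malformed IRI. Note that replacing a file via `mv` may not preserve its original permissions.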

kltm commented 1 month ago

Models that will be repaired:

    modified:   MGI_MGI_104797.ttl
    modified:   MGI_MGI_104798.ttl
    modified:   MGI_MGI_107656.ttl
    modified:   MGI_MGI_107899.ttl
    modified:   MGI_MGI_108005.ttl
    modified:   MGI_MGI_108443.ttl
    modified:   MGI_MGI_1098776.ttl
    modified:   MGI_MGI_1346060.ttl
    modified:   MGI_MGI_2152213.ttl
    modified:   MGI_MGI_2429397.ttl
    modified:   MGI_MGI_2679229.ttl
    modified:   MGI_MGI_2686979.ttl
    modified:   MGI_MGI_87883.ttl
    modified:   MGI_MGI_88232.ttl
    modified:   MGI_MGI_88563.ttl
    modified:   MGI_MGI_95499.ttl
    modified:   MGI_MGI_96395.ttl
    modified:   MGI_MGI_96446.ttl
    modified:   MGI_MGI_96539.ttl
    modified:   MGI_MGI_96540.ttl
    modified:   MGI_MGI_96617.ttl
    modified:   MGI_MGI_96923.ttl
    modified:   MGI_MGI_96924.ttl
    modified:   MGI_MGI_97137.ttl
    modified:   MGI_MGI_97283.ttl
    modified:   MGI_MGI_98239.ttl

kltm commented 1 month ago

At the very least, committed above.