NCI-Thesaurus / thesaurus-obo-edition

OBO Library edition of NCIt

Malformed URIs exist in neoplasm-core.owl #21

Closed by yy20716 6 years ago

yy20716 commented 6 years ago

I found some CURIEs that are not converted into URIs properly; for example, there is no prefix named urn. Here is the result of running grep for those URIs:

yy20716@yy20716-Dell-Precision-M3800:~/eclipse-workspace$ grep "urn:swrl" neoplasm-core.owl -n
1005144:
1005147:
1005150:
1005153:
1005156:
1005159:
1005162:
1005165:
1005168:
1005171:
1005174:
...

balhoff commented 6 years ago

These are the IRIs that OWL API creates to stand for SWRL rule variables; they're not actually CURIEs. But you are right that their syntax is not correct. The spec for URN says:

<URN> ::= "urn:" <NID> ":" <NSS>

I'll file a bug with OWL API, because there's no reason these can't be something like urn:swrl:C. Are you using a tool that complained about these? Just curious because they hadn't caused a problem for me before.
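To make the syntax point concrete, here is a minimal Python sketch that checks whether a string has the overall `"urn:" <NID> ":" <NSS>` shape quoted above. The function name `is_urn_shaped` is hypothetical, and the pattern is a deliberate simplification (it only requires a short alphanumeric NID, a colon, and a non-empty NSS without whitespace or `#`), not a full implementation of the URN spec.

```python
import re

# Simplified shape check for "urn:" <NID> ":" <NSS>.
# Assumption: NID is 1-32 letters, digits, or hyphens (starting with a
# letter or digit), and NSS is any non-empty run without whitespace or "#".
URN_SHAPE = re.compile(r"urn:[A-Za-z0-9][A-Za-z0-9-]{0,31}:[^\s#]+", re.IGNORECASE)


def is_urn_shaped(iri: str) -> bool:
    """Return True if `iri` has the basic urn:<NID>:<NSS> shape."""
    return URN_SHAPE.fullmatch(iri) is not None


# The form currently in the file has no ":" <NSS> part after the NID.
print(is_urn_shaped("urn:swrl#y"))
# The proposed form has NID "swrl" and NSS "C".
print(is_urn_shaped("urn:swrl:C"))
```

Under these assumptions, `urn:swrl#y` is rejected because `#` appears where the colon separating NID and NSS is expected, while `urn:swrl:C` is accepted.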

yy20716 commented 6 years ago

Oh, I see. I found these IRIs while testing with DL-Learner, whose internal parser complains that they are malformed. This is not a critical issue for using DL-Learner right now, but I wanted to report it so that other people would not see such messages.

balhoff commented 6 years ago

I opened an issue at https://github.com/owlcs/owlapi/issues/732

@yy20716 just curious, could you try changing all these from e.g. urn:swrl#y to urn:swrl:y? I would like to know if that makes DL-Learner happy.

yy20716 commented 6 years ago

I am sorry, I completely forgot about this issue because of other tasks I am doing right now. Yes, I just tried it and confirmed that changing the IRIs works, i.e., the error messages are no longer produced. Thank you.
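For anyone who wants to apply the same workaround locally until updated ontology files are released, the rewrite from `urn:swrl#y` to `urn:swrl:y` could be sketched in Python roughly as follows. This is an illustrative sketch, not the exact procedure used in this thread: the function name `fix_swrl_variable_iris` is hypothetical, the variable names are assumed to consist only of letters, digits, and underscores, and the sample XML line is made up for the demonstration.

```python
import re

# Matches SWRL variable IRIs of the form urn:swrl#<name>.
# Assumption: <name> contains only letters, digits, and underscores.
SWRL_VARIABLE_IRI = re.compile(r"urn:swrl#(\w+)")


def fix_swrl_variable_iris(text: str) -> str:
    """Rewrite every urn:swrl#<name> in `text` to urn:swrl:<name>."""
    return SWRL_VARIABLE_IRI.sub(r"urn:swrl:\1", text)


# Hypothetical sample line; the actual markup in neoplasm-core.owl may differ.
sample = '<swrl:Variable rdf:about="urn:swrl#y"/>'
print(fix_swrl_variable_iris(sample))
# prints: <swrl:Variable rdf:about="urn:swrl:y"/>
```

To process a whole file, read its contents as text, pass them through `fix_swrl_variable_iris`, and write the result to a new file so the original stays untouched.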

balhoff commented 6 years ago

Also, in case you didn't see, this was changed in OWL API. But it may take a while for this change to propagate to the various ontologies.

yy20716 commented 6 years ago

I see. I will try again with updated datasets later. Thank you for your kind explanation.