Closed cmungall closed 7 years ago
Yes, you can create an IRI in OWL API like "IDA". It will then serialize it without complaint. But it will fail to parse it back in. I'm not sure where best to put a test for this in minerva. Going through JsonOrJsonpBatchHandler.m3Batch
seems to check whether the class IRI is known (regardless of whether it is a valid form). So I think on that path the nefarious class would need to already exist in the tbox.
Could these possibly come in through the "Function companion" tool?
The IDs through most paths should be pretty solid. This seems to happen with a couple of widgets where the user just hits return (or tab) immediately without selecting anything. While it would be nice if minerva would protect itself better in this case (not being able to re-read of disk is pretty bad), we could also implement checks in the client and force the UI to be more consistent in selection.
I think the FC tool should be safe. More likely the ability of curators to paste in their own classes.
Requiring a class be in the tbox is too strict (we have that turned off, as some curators need to paste CURIEs of uniprot classes not loaded). But this would be the right place in the code to check.
On 23 Sep 2016, at 7:33, Jim Balhoff wrote:
Yes, you can create an IRI in OWL API like "IDA". It will then serialize it without complaint. But it will fail to parse it back in. I'm not sure where best to put a test for this in minerva. Going through
JsonOrJsonpBatchHandler.m3Batch
seems to check whether the class IRI is known (regardless of whether it is a valid form). So I think on that path the nefarious class would need to already exist in the tbox.Could these possibly come in through the "Function companion" tool?
You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/geneontology/minerva/issues/58#issuecomment-249209089
@cmungall I guess you mean pasting a class into the "Edit Instance" panel. I see that you can paste in any text there and it will be accepted as an IRI. In file mode the OWL API will save this. In Blazegraph mode the bad ID can be entered, but when you save some Rio errors will be logged, e.g. "Subject URI was invalid". The model will not be saved, but nothing seems to be communicated to the front-end (the bad type will still appear until Minerva is restarted).
Aargh but the error is logged and swallowed in the OWL API so it is reported as a successful save. And if there are other changes it actually will be saved, partially...
IIRC, if you throw the error far enough, it should be picked up by the response handler and get the error back to the client.
Yes—in this case the error is only logged inside OWL API. It is swallowed before we can see it.
I think the right place to check incoming IRIs is at CurieHandler.getIRI
. At a minimum we can prevent relative IRIs completely by making sure the text passed to IRI.create
at least contains a colon (it should have been expanded to http://
... by the curie map). To be even more strict we could verify that the result is a valid java.net.URL
(it will throw an error for unknown protocols). However this would disallow valid identifiers that are not URLs like urn:uuid:12345
. An in-between solution would be to have a small list of valid protocols like http, urn, mailto, and throw an error if the expanded IRI doesn't start with one of these. Would this be missing something?
I'm pretty sure http is only only protocol, but famous last words and all that. I think extending to the protocols you mention should be safe.
General comment: is the CURIE handling worth spinning out into a separate project
On 3 Oct 2016, at 11:59, Jim Balhoff wrote:
I think the right place to check incoming IRIs is at
CurieHandler.getIRI
. At a minimum we can prevent relative IRIs completely by making sure the text passed toIRI.create
at least contains a colon (it should have been expanded tohttp://
... by the curie map). To be even more strict we could verify that the result is a validjava.net.URL
(it will throw an error for unknown protocols). However this would disallow valid identifiers that are not URLs likeurn:uuid:12345
. An in-between solution would be to have a small list of valid protocols like http, urn, mailto, and throw an error if the expanded IRI doesn't start with one of these. Would this be missing something?You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/geneontology/minerva/issues/58#issuecomment-251194286
@balhoff Okay, I'm now testing on minerva at, I believe, a corrected revision: https://github.com/geneontology/minerva/commit/cde48f8595ae06ee5f7fe111752c5d34d8a80f47
I can still add an item, say a gene product for an annoton, using the string "fooble". In the OWL, this comes up as:
Class: <fooble>
It then saves to disk just fine. A restart of minerva works, as well as viewing the model again.
Is this correct behavior?
@kltm I can't reproduce your test. Can you check on the latest version?
I apparently had versioning issues. I can no longer reproduce this when under adult supervision on the production server.
From https://github.com/geneontology/noctua-models/issues/18#issuecomment-248775569
It seems if the client tries to pass in something like
The OWLAPI unquestioningly accepts it, but when it saves it results in invalid OWL
Can we first have a unit test to test this hypothesis. If it turns out to be true we need to address. The best strategy is not clear and may be part of a general improvement in CURIE handling