geneontology / noctua-models

This is the data repository for the models created and edited with the Noctua tool stack for GO.
http://noctua.geneontology.org/
Creative Commons Attribution 4.0 International

Committed model (models?) has runtime exception due to "unregistered prefix" #23

Closed kltm closed 8 years ago

kltm commented 8 years ago

As with #18, a model with a prefix error has been committed; the error surfaces when attempting to read the model off of disk. The (not super helpful) output can be found here: http://build.berkeleybop.org/job/export-lego-to-legacy/166/console .

For this, until I get something in the client to help block these (and even then, it still may be possible to worm through), it would be good to have an SOP for figuring out what went wrong and the steps to fix it. In the current case, until it's fixed (or we roll back the repo), the legacy GAF pipeline will be down and we won't be able to restart Minerva.

kltm commented 8 years ago

Possibly @cmungall or @balhoff on this. I'll reference the client-side check to this ticket.

kltm commented 8 years ago

Obviously dependent on geneontology/minerva#58.

The pipelines are grinding to a halt: we've just had failures with http://build.berkeleybop.org/job/load-golr-noctua-neo/40 and http://build.berkeleybop.org/job/load-golr-tomodachi/297, which means that we may not be able to demo all of our features tomorrow.

ukemi commented 8 years ago

Is this happening because somehow curators are able to enter values that are essentially not valid in some of the fields? I notice that some fields make you pick an autocomplete term, but I don't think all of them do.

kltm commented 8 years ago

Yes. Specifically, it's more that while some widgets try to make you select an ID from the dropdown, it is possible to circumvent them. That will be addressed in geneontology/noctua#366. However, this issue here is to find a way of dealing with this once the damage has already been done, as in the current case. (You may recall that something similar happened during the Hinxton training.) In the future, with many different clients and ways of creating models, we need both a robust Minerva (issue geneontology/minerva#58) and a way of quickly and safely purging bad data.

cmungall commented 8 years ago

Travis should now detect these off the bat with #22. See also @balhoff's quick check for errors of this sort from #18:

$ grep '<[^h]' models/*
models/57c82fad00000897:Class: <carA-1>
models/57c82fad00000897:        <carA-1>
models/57ec3a7e00000027:Class: <wbbt:0007833>
models/57ec3a7e00000027:        <wbbt:0007833>
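The one-liner above can be wrapped into a small gating script that fails loudly when any model contains a non-http angle-bracket IRI (the signature of an unregistered prefix). This is only a sketch: the sample file names below are fabricated for illustration, and the real check would run over the repo's actual models/ directory.

```shell
# Sketch of a prefix gate. We create a throwaway models/ directory with
# one deliberately broken and one well-formed file, then use the same
# pattern as above: any '<' not followed by 'h' (i.e. not "<http...")
# is treated as an unexpanded/unregistered prefix.
workdir=$(mktemp -d)
mkdir -p "$workdir/models"
printf 'Class: <carA-1>\n' > "$workdir/models/57c82fad00000897"
printf 'Class: <http://purl.obolibrary.org/obo/GO_0003674>\n' > "$workdir/models/ok_model"

# grep -l prints only the names of files that contain a match.
bad=$(grep -l '<[^h]' "$workdir/models"/* || true)

if [ -n "$bad" ]; then
  echo "damaged models:"
  echo "$bad"
fi
```

In a CI context, the natural extension is to `exit 1` when `$bad` is non-empty so the build goes red instead of just printing.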

I'll fix these.

cmungall commented 8 years ago

@kltm - could the two Jenkins jobs be changed to do the pre-check that Travis CI is now doing? That way, even if a syntax error does creep in, the worst that should happen is that the GOLR index is not updated.
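What that might look like as an "Execute shell" step at the top of each Jenkins job is sketched below. The directory path, the `precheck` function name, and the placeholder load command are all assumptions for illustration, not the actual job configuration.

```shell
# Hypothetical Jenkins pre-check step: run the same grep gate Travis
# uses, and only proceed to the GOLR load if the models are clean.
MODELS_DIR="${MODELS_DIR:-noctua-models/models}"

precheck() {
  # Return non-zero if any model contains a non-http angle-bracket IRI.
  if grep -q '<[^h]' "$MODELS_DIR"/* 2>/dev/null; then
    echo "pre-check failed: unregistered prefix detected; skipping load"
    return 1
  fi
  echo "pre-check passed"
}

if precheck; then
  : # run-golr-load.sh  (placeholder for the real load command)
fi
```

The point of the guard is exactly what the comment asks for: a bad commit makes the job bail out before the load step, so the worst case is a stale index rather than a broken one.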

cmungall commented 8 years ago

Hurrah:

[Build Status badge]

kltm commented 8 years ago

Okay, great! Especially the grep for damaged models. In the future, if I can't figure out how to fix them, I'll yank them and put them into a FIXME folder or something.

I've tried adding the Travis test to the noctua load, and if it looks okay, I'll add it to the tomodachi one. However, it's a bit of an icky solution that we should treat as a temporary hack (especially as the jobs are based out of separate repos and require downloading yet other parts from elsewhere--too many moving parts, and we have to remember where we've applied it in the past in order to update it).

Ideally, what we need is one of two things in Jenkins. The first (likely undoable, as we have things running at different frequencies) is to redo the pipelines to run from a single clock/heartbeat job. The second is to have jobs test a "status" job for a green light whenever they want to start on their own separate frequencies. You'd think there would be a plugin for the second, but I haven't found one yet. Will keep looking.
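The second option described above (a "status" job that downstream jobs consult for a green light) can be sketched with nothing more than a flag file on the Jenkins host; a plugin or Groovy trigger would be more robust, but the control flow is the same. The flag path and function names here are made up for the sketch.

```shell
# Green-light pattern sketch: a validation/status job sets or clears a
# flag file, and downstream jobs check it at their own frequency,
# bailing out quietly when the light is off.
STATUS_FLAG="${STATUS_FLAG:-/tmp/noctua-pipeline-green}"

set_green()   { date -u +%s > "$STATUS_FLAG"; }  # status job, on success
clear_green() { rm -f "$STATUS_FLAG"; }          # status job, on failure

green_light() {
  # Downstream jobs call this before doing any work.
  [ -f "$STATUS_FLAG" ]
}

clear_green
if green_light; then echo "go"; else echo "waiting"; fi
set_green
if green_light; then echo "go"; else echo "waiting"; fi
```

The appeal of this shape is that each job keeps its own schedule; only the decision to proceed is centralized in the status check.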

cmungall commented 8 years ago

Shall we start a new ticket for the pipeline refactor?

kltm commented 8 years ago

Blargh. I'm unsure a refactor is really needed right now, unless we'd actually want to use a heartbeat/clock pattern. Really, if the right plugin exists (or a Groovy trigger for that matter, though a plugin is preferable), we could probably extend what we have in a fairly sane way.

It might be good to sketch out a desired flowchart of the jobs at some point, though. Ideally, there would be a direct or indirect control flow from probing for new ontology files all the way out to AmiGO and loading.