geneontology / noctua

Graph-based modeling environment for biology, including prototype editor and services
http://noctua.geneontology.org/
BSD 3-Clause "New" or "Revised" License

Disappearing URS #902

Open hattrill opened 3 months ago

hattrill commented 3 months ago

I have two models in which RNAcentral IDs were used - they are now showing "GP info not available" in the list view or in the pathway editor but I can see them in the form editor. Both IDs are current and they were 'present and stable' the last time I looked at these models a few weeks ago.

Model ID 654d809000000802 gp: URS000012F9EC_9606

Model ID 66187e4700001274 gp: URS000075C624_9606

kltm commented 3 months ago

Okay, to unpack this a little, this entity still seems to be showing up, at least in a test build I did locally and in NEO (http://noctua-amigo.berkeleybop.org/amigo/term/URS000012F9EC_9606). It could just be a sync issue where these temporarily dropped out and have been re-added (assuming that https://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/gpi/rnacentral.gpi.gz is the right upstream). I think it might be best to wait until the next noctua outage and see if this has "improved".

All that said, I think that there is still discussion on how to handle RNAcentral (even putting aside that the actual information present is scarcely more informative than null information, see https://github.com/geneontology/neo/issues/99). The Makefile is also a little unclear, and the more time I spend poking around in there the more I want to do some cleaning.

hattrill commented 3 months ago

Thanks @kltm will check in again after next outage.

hattrill commented 1 month ago

@kltm this is still not resolved. Could you give it a look over?

kltm commented 1 month ago

@hattrill Okay, looking at the link again, it does seem to "be there", but is missing a namespace. This means that either the GPI/GAF that this information is coming from is messed up, or that the processing of the same is messed up. I'll need to look into this a bit more.

kltm commented 1 month ago

The origin is likely target/neo-goa_human_rna.obo, built from mirror/goa_human_rna.gpi.gz, which comes from https://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/goa_human_rna.gpi.gz, specifically this line:

RNAcentral      URS000012F9EC_9606              Homo sapiens (human) hsa-miR-4691-3p            miRNA   taxon:9606                      

Hm...this looks right. Checking processing next.
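
A minimal sketch of that source check, assuming the file follows the GPI 1.2 layout in which column 1 is the namespace (DB) and column 2 is the object ID. The helper name gpi_curie and the sample row (including where its tabs fall) are illustrative only; against the real data one would pipe in the output of zcat mirror/goa_human_rna.gpi.gz instead.

```shell
#!/bin/sh
# Hypothetical helper: print "DB:ID" for every GPI row whose object ID matches.
# Assumes tab-separated GPI with DB in column 1 and the object ID in column 2;
# header lines starting with "!" are skipped.
gpi_curie() {
    awk -F '\t' -v id="$1" '!/^!/ && $2 == id { print $1 ":" $2 }'
}

# Illustrative sample row modeled on the line quoted above (not the real file).
sample=$(printf 'RNAcentral\tURS000012F9EC_9606\t\tHomo sapiens (human) hsa-miR-4691-3p\t\tmiRNA\ttaxon:9606\n')

# An empty result, or one without "RNAcentral:", would point at the source file.
result=$(printf '%s\n' "$sample" | gpi_curie URS000012F9EC_9606)
echo "$result"
```

If the namespace column is populated here, the problem is more likely in later processing than in the upstream file.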

Trying to duplicate with

make clean && TEST_SRCS=goa_human_rna make test_obo 2>&1 | tee /tmp/log.txt

and

make clean && PATH=$PATH:/home/sjcarbon/local/src/git/owltools/OWLTools-Runner/bin/ SRCS=goa_human_rna make all

but it looks good locally. Cleaning workspace and retrying.

kltm commented 1 month ago

Also checking target/neo-rnacentral.obo; it looks okay too.

Re-running in the side pipeline, I can confirm that the issue persists.

kltm commented 1 month ago

I can confirm that we have the prefixed ID in the per-source file:

target/neo-goa_human_rna.obo:id: RNAcentral:URS000012F9EC_9606

but the prefix disappears by the time that we get to the merged file:

neo.obo:id: URS000012F9EC_9606

This is beginning to look like maybe an owltools issue?
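
A minimal sketch of how the lost prefix could be spotted mechanically, assuming "id: " lines as in the grep output above. The helper name bare_ids is hypothetical, and the two files are tiny stand-ins for target/neo-goa_human_rna.obo and neo.obo, not the real build products.

```shell
#!/bin/sh
# Hypothetical helper: list IDs in an OBO file that have no "PREFIX:" part.
bare_ids() {
    grep '^id: ' "$1" | sed 's/^id: //' | grep -v ':' || true
}

workdir=$(mktemp -d)
# Illustrative stand-ins for the per-source file and the merged file.
printf '[Term]\nid: RNAcentral:URS000012F9EC_9606\n' > "$workdir/source.obo"
printf '[Term]\nid: URS000012F9EC_9606\n' > "$workdir/merged.obo"

source_bare=$(bare_ids "$workdir/source.obo")
merged_bare=$(bare_ids "$workdir/merged.obo")
echo "bare in source: ${source_bare:-none}"
echo "bare in merged: ${merged_bare:-none}"
rm -rf "$workdir"
```

Any ID reported for the merged file but not for the source file lost its namespace somewhere in the merge step.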

I believe I have traced it to this command:

owltools --create-ontology http://purl.obolibrary.org/obo/go/noctua/neo.owl target/neo-goa_human_rna.obo imports/pr_import.obo --merge-support-ontologies -o -f obo neo.obo.tmp && grep -v ^owl-axioms neo.obo.tmp > neo.obo

@balhoff I don't suppose you have an instinct for what may be going off the rails here, before I dig in? Could we maybe replace it with robot instead?

balhoff commented 1 month ago

@kltm I know what's going on. It's due to the OBO prefixes support, which has been a lot more trouble than I expected. It's especially bad here because the NEO build is full of prefix hacks. The new prefix support would actually work really well for this, but not sure how much revamping we want to do. We can either embrace it and use the newest ROBOT, or revert to an earlier build of owltools. Or actually an earlier build of ROBOT would work as well, just need to update the command lines.

kltm commented 1 month ago

@balhoff Hm. While I am game for any of the above that fixes the problem, my druthers would be starting with things that move us "forward", if not too spendy timewise. I'm happy to give some robot commands a try to see if they give us the results that we want. For context, the base command in the Makefile is:

neo.obo:  $(OBO_SRCS) $(IMPORTS)
        owltools --create-ontology http://purl.obolibrary.org/obo/go/noctua/neo.owl $^ --merge-support-ontologies  -o -f obo $@.tmp && grep -v ^owl-axioms $@.tmp > $@

with OBO_SRCS (and so $^) being defined as a loooooong list of obos.

kltm commented 1 month ago

@balhoff I tried to roll my own, but it didn't work like I'd expect:

ROBOT_JAVA_ARGS=-Xmx256G ~/local/src/git/robot/bin/robot annotate --ontology-iri http://purl.obolibrary.org/obo/go/noctua/neo.owl --input target/neo-goa_human_rna.obo --output /tmp/neo.owl
ROBOT_JAVA_ARGS=-Xmx256G ~/local/src/git/robot/bin/robot convert --input /tmp/neo.owl --output /tmp/neo.obo

I still get

[Term]
id: URS0000000096_9606

Is this because I haven't used the "prefix support" you mentioned above?

balhoff commented 1 month ago

@kltm here is what I had in mind: https://github.com/geneontology/neo/pull/119

kltm commented 1 month ago

@balhoff Great--I see what's going on there. I guess it's still WIP, but that's an exciting development!

hattrill commented 1 week ago

Can we get this fixed? Would be nice to show functional models to RNAcentral!

kltm commented 1 week ago

@hattrill this is in progress over here: https://github.com/geneontology/neo/pull/121

kltm commented 1 week ago

Changes from geneontology/neo#121 require robot in the pipeline.

kltm commented 1 week ago

@balhoff Will be looking at a little more work needed for handling some pipes in the data stream. This will hopefully be completed before the next outage.

kltm commented 1 week ago

@balhoff put in a couple more changes and, with a couple of pipeline adjustments, the build is running through again.

@vanaukenk We should consider what we think qualifies as testing and how we do it. The spectrum runs from "check when we roll out next time" to "make a new amigo instance and compare".
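
Somewhere in the middle of that spectrum, one lightweight option might be to diff the ID sets of two builds. The following is only a sketch under stated assumptions: the helper name obo_ids is hypothetical, old.obo and new.obo stand in for a previous and a fresh neo.obo, and UniProtKB:P12345 is a placeholder ID used purely for illustration.

```shell
#!/bin/sh
# Use a fixed collation so that sort and comm agree on ordering.
export LC_ALL=C

# Hypothetical helper: sorted, de-duplicated list of term IDs in an OBO file.
obo_ids() {
    sed -n 's/^id: //p' "$1" | sort -u
}

workdir=$(mktemp -d)
# Illustrative stand-ins for a previous build and a new build of neo.obo.
printf '[Term]\nid: RNAcentral:URS000012F9EC_9606\n\n[Term]\nid: UniProtKB:P12345\n' > "$workdir/old.obo"
printf '[Term]\nid: URS000012F9EC_9606\n\n[Term]\nid: UniProtKB:P12345\n' > "$workdir/new.obo"

obo_ids "$workdir/old.obo" > "$workdir/old.ids"
obo_ids "$workdir/new.obo" > "$workdir/new.ids"

# IDs present in the old build but missing from the new one.
lost=$(comm -23 "$workdir/old.ids" "$workdir/new.ids")
echo "lost: $lost"
rm -rf "$workdir"
```

A non-empty list of lost IDs would flag a regression like the one in this issue before rollout, without needing a second amigo instance.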