Closed joeflack4 closed 12 months ago
@matentzn The omim.sssom.tsv that has been released since I created this action was old and was never being recreated. I have just stopped including it in the release.
This is currently marked "Low" urgency. Can you increase it and let me know if it should be a higher priority?
Just always run everything in ODK - there is no need for custom install procedures! All dependencies you will need are in there.
@matentzn Alright, that makes sense. If you remember, we previously were not able to use ODK in GitHub actions, I think due to memory constraints. I'm going to reach out to @hrshdhgd for this though, as I believe he was somewhat recently able to get it to work.
@matentzn I thought Harshad was able to get ODK to work in a GitHub action, but he told me that is not the case. I don't know if you remember us speaking about how this was difficult / impossible before. Do you want me to try? I would imagine this is a lower priority than many of my other issues, though. In the interim, it would be much easier for me to implement solution 'a'.
ODK works in hundreds of GHAs! It's very, very easy. https://github.com/obophenotype/bio-attribute-ontology/blob/master/.github/workflows/qc.yml#L22
That is great to hear. I'll go ahead and try and use this.
You probably remember we spoke about this a few different times in the past year? Maybe what you meant was that not everything could be run due to memory issues (like NCIT), not that ODK / certain workflows couldn't be run. Because otherwise I'm sure we would use this in mondo-ingest as well.
Yes, that's right. You can't build mondo-ingest in GHA because of memory constraints.
@matentzn I got the ODK GitHub action working, but I'm getting a new OBO GRAPH ERROR. The short of it is that I'd like you to pick one of the possible solutions 'a', 'b', or 'c'. 'c' seems the easiest, and I think you were also fine with this approach many months ago (not because of an error, but because we didn't need it).
OBO GRAPH ERROR Could not convert ontology to OBO Graph (see https://github.com/geneontology/obographs)
For details see: http://robot.obolibrary.org/errors#obo-graph-error
These parts of the log are the only thing indicative of the problem:
ERROR Input ontology contains 7310 triple(s) that could not be parsed:
_:genid-nodeid-node1h3dl19e1x7863 <https://w3id.org/biolink/vocab/has_evidence> Evidence: (3) The molecular basis for the disorder is known; a mutation has been found in the gene..
...
Hint: you have undeclared predicates - try adding 'rdf:type' declarations to the following:
- https://w3id.org/biolink/vocab/has_evidence
Full log: log.txt
I don't understand why it says "undeclared predicates". I have its namespace declared in the header, and the turtle triple syntax seems correct. I do notice that the namespace https://w3id.org/biolink/vocab/ doesn't resolve to anything machine readable, so it makes sense that no declarations can be pulled from there if robot needs to do that, but I have other biolink predicates (e.g. biolink:category) that are not throwing this same warning.
Header declaration:
@prefix biolink: <https://w3id.org/biolink/vocab/> .
Example usage:
[] a owl:Axiom ;
    rdfs:comment "Evidence: (3) The molecular basis for the disorder is known; a mutation has been found in the gene." ;
    owl:annotatedProperty rdfs:subClassOf ;
    owl:annotatedSource OMIM:300644 ;
    owl:annotatedTarget [ a owl:Restriction ;
            owl:onProperty RO:0003302 ;
            owl:someValuesFrom OMIM:301500 ] ;
    biolink:has_evidence "Evidence: (3) The molecular basis for the disorder is known; a mutation has been found in the gene." .
I'm not sure how.
http://robot.obolibrary.org/errors#obo-graph-error suggests this:
This may be due to problematic annotations, so you can create a subset using filter or remove containing only the necessary annotations and save that as JSON, for example (keeping only labels and definitions):
robot remove --input ont.owl \
  --select annotation-properties \
  --exclude-term rdfs:label \
  --exclude-term IAO:0000115 \
  --output ont.json
biolink:has_evidence
I think at some point in the past you suggested we don't need it. Notice also that we are already including an additional evidence comment.
[] a owl:Axiom ;
    rdfs:comment "Evidence: (3) The molecular basis for the disorder is known; a mutation has been found in the gene." ;
    ...
    biolink:has_evidence "Evidence: (3) The molecular basis for the disorder is known; a mutation has been found in the gene." .
For every annotation property in your ontology, add a triple:
my:annotationProperty rdf:type owl:AnnotationProperty
and try again, e.g.:
biolink:has_evidence rdf:type owl:AnnotationProperty
This is called a "declaration".
Some serialisations need this. A hack that I think could work is this:
But it is better to inject the declarations into the TTL file from the start.
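To make the "inject declarations" idea concrete, here is a minimal Python sketch. It assumes the TTL file already declares the rdf: and owl: prefixes, and the function name add_annotation_property_declarations is hypothetical (it is not part of robot, ODK, or the OMIM pipeline). It only does a naive text check, so treat it as an illustration rather than the actual fix used in the repo.

```python
def add_annotation_property_declarations(ttl_text, curies):
    """Append an owl:AnnotationProperty declaration for each CURIE not yet declared.

    Assumption: the rdf: and owl: prefixes are already declared in ttl_text.
    """
    missing = []
    for curie in curies:
        declaration = f"{curie} rdf:type owl:AnnotationProperty ."
        # Naive textual check; a real pipeline might parse the graph instead.
        if declaration not in ttl_text and declaration not in missing:
            missing.append(declaration)
    if not missing:
        return ttl_text
    return ttl_text.rstrip("\n") + "\n\n" + "\n".join(missing) + "\n"


# Hypothetical header, shortened for illustration.
example_header = (
    "@prefix biolink: <https://w3id.org/biolink/vocab/> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
)
patched_header = add_annotation_property_declarations(
    example_header, ["biolink:has_evidence"]
)
```

Running the function a second time on its own output leaves the text unchanged, so it should be safe to call on every build.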
Ah, ok. Thanks for these solutions! I like the first proposal of simply adding the declaration.
I see now that the reason it was unhappy about biolink:has_evidence and not some of the other biolink preds is because it was used in an annotation property.
@matentzn Hmm... so I gave it a shot, but it's still giving an OBO GRAPH ERROR, except this time I can't find any useful information in the log or stacktrace.
I added biolink:has_evidence rdf:type owl:AnnotationProperty and re-ran, and I didn't get any errors/warnings about it anymore, so that's good. But I still got an OBO GRAPH ERROR, with no indication as to what's causing it.
robot remove
There's no indication in the error message that biolink:has_evidence has anything to do with it, but the only advice at http://robot.obolibrary.org/errors#obo-graph-error is in regards to problematic annotations, which it recommends removing, so I decided to try this anyway:
robot remove --input omim.ttl --exclude-term biolink:has_evidence -o omim.json
Again, I can't locate any helpful information in the log / stacktrace.
.ofn
I tried the OWL functional syntax idea you recommended:
robot convert --input omim.ttl --output omim.ofn && robot convert -vvv -i omim.ofn -o omim.json
Again, nothing helpful in the log.
biolink:has_evidence in initial .ttl
Just trying random stuff. Tried a run removing biolink:has_evidence completely from the initial ingest, but it is still failing with the same error and no apparent useful information in the log.
I tried opening omim.ttl in Protégé, and it initially loaded without error. But when I went to the "Classes" tab and clicked to expand owl:Thing to explore it, the app completely froze and crashed. Same for "Individuals by class". In the "Object properties" tab, I can expand and see 4 RO predicates there.
robot or obographs repos about this?
After 1 hour of debugging I found the problem: https://github.com/geneontology/obographs/issues/100
I am sure you would have wasted many hours if not days on this, so good I checked it. Here is my technique on how to deal with failures where we don't have an intelligible error message:
This way I stumbled across the fix. Terrible technique but works :D
( Solve issue by injecting this into the ontology:
<http://purl.obolibrary.org/obo/mondo/omim.owl> rdf:type owl:Ontology .
)
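For the record, a minimal Python sketch of that injection, under the assumption that the TTL file already declares the rdf: and owl: prefixes. The helper name ensure_ontology_declaration is hypothetical and the check is a plain substring test, so this is only an illustration of the idea, not the fix that was actually committed.

```python
def ensure_ontology_declaration(ttl_text, ontology_iri):
    """Append `<iri> rdf:type owl:Ontology .` if no ontology declaration is present.

    Assumption: the rdf: and owl: prefixes are already declared in ttl_text.
    """
    declaration = f"<{ontology_iri}> rdf:type owl:Ontology ."
    # Naive textual check that covers both `rdf:type owl:Ontology` and `a owl:Ontology`.
    if "owl:Ontology" in ttl_text:
        return ttl_text
    return ttl_text.rstrip("\n") + "\n\n" + declaration + "\n"


# Hypothetical, shortened file content without an ontology declaration.
headerless_ttl = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
)
fixed_ttl = ensure_ontology_declaration(
    headerless_ttl, "http://purl.obolibrary.org/obo/mondo/omim.owl"
)
```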
Those are some good steps. I'll file them in my personal notes.
Definitely would have taken me forever if I was to continue cracking at it.
A simple solution now that you found the problem! Will be a quick fix when I'm back from vacation.
Thanks for opening up an issue about it as well.
Overview
Currently, omim.sssom.tsv can't be recreated because this command won't work:
sssom parse omim.json -I obographs-json -m data/metadata.sssom.yml -o omim.sssom.tsv
This command fails because of a missing robot dependency. I need to figure out how to include arbitrary binaries in a GitHub action.
Sub-tasks list
1. Get robot to run #97
2. OBO GRAPH ERROR
Sub-task details
1. Get robot to run
Possible solutions
a. ~Maybe I can simply download robot/robot.jar and place them in the root of the repo, either committing them, or downloading them at the start of the action.~
b. ~But maybe there's a better practice.~
c. ~Personally, I've always thought the best thing would be to create some onto-robo-py package that simply is a wrapper around robot.jar and calls it via a subprocess.~
d. Run the action in ODK (#97) <--------
2. OBO GRAPH ERROR
When running robot convert -i omim.ttl -o omim.json as a pre-step to doing sssom parse omim.json..., I get an error, but the cause is not known. I imagine that it has to do with the version of robot being different in ODK than what I was previously using to convert on my local machine. Nothing has changed significantly with the OMIM pipeline in a while.
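As a rough illustration of the two-step flow described here (convert, then parse), the following Python sketch builds the two commands quoted in this issue and runs them in order via subprocess. The function names build_commands and run_pipeline are hypothetical, and it assumes that robot and sssom are both on the PATH, for example inside the ODK container.

```python
import subprocess


def build_commands(ttl_path="omim.ttl", json_path="omim.json",
                   metadata_path="data/metadata.sssom.yml",
                   sssom_path="omim.sssom.tsv"):
    """Return the convert and parse commands, in the order they need to run."""
    convert = ["robot", "convert", "-i", ttl_path, "-o", json_path]
    parse = ["sssom", "parse", json_path, "-I", "obographs-json",
             "-m", metadata_path, "-o", sssom_path]
    return [convert, parse]


def run_pipeline(commands):
    """Run each command in turn; check=True stops at the first failing step."""
    for command in commands:
        subprocess.run(command, check=True)


commands = build_commands()
# run_pipeline(commands)  # Uncomment where robot and sssom are installed.
```

Stopping at the first failure keeps an OBO GRAPH ERROR from robot convert from being masked by a later sssom parse failure on a missing omim.json.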