ncbo / BioPortal-to-KGX

Assemble a BioPortal Knowledge Graph
BSD 3-Clause "New" or "Revised" License

Error in relaxing Cell Cycle Ontology, BERO, and PR #79

Closed caufieldjh closed 1 year ago

caufieldjh commented 2 years ago

Transforming CCO (Cell Cycle Ontology) fails as follows:

Starting on ../Bioportal/4store-export-2022-07-20/data/f0/ff/fbb225ff0f97d7737e854fd2d48d
BioPortal metadata not found for CCO_6 - will retrieve.
Accessing https://data.bioontology.org/ontologies/CCO/...
<Response [200]>
Accessing https://data.bioontology.org/ontologies/CCO/latest_submission...
<Response [200]>
Retrieved metadata for CCO (Cell Cycle Ontology)
ROBOT: relax CCO_6
Relaxing /tmp/tmpxpybbnaf to transformed/ontologies/CCO/CCO_6_relaxed.json...
Traceback (most recent call last):
  File "/home/harry/BioPortal-to-KGX/run.py", line 138, in <module>
    run()
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/harry/BioPortal-to-KGX/run.py", line 113, in run
    transform_status = do_transforms(
  File "/home/harry/BioPortal-to-KGX/bioportal_to_kgx/functions.py", line 233, in do_transforms
    if relax_ontology(robot_path,
  File "/home/harry/BioPortal-to-KGX/bioportal_to_kgx/robot_utils.py", line 65, in relax_ontology
    robot_command(
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/sh.py", line 1524, in __call__
    return RunningCommand(cmd, call_args, stdin, stdout, stderr)
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/sh.py", line 788, in __init__
    self.wait()
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/sh.py", line 845, in wait
    self.handle_command_exit_code(exit_code)
  File "/home/harry/.cache/pypoetry/virtualenvs/bioportal-to-kgx-BG6p1jeu-py3.9/lib/python3.9/site-packages/sh.py", line 869, in handle_command_exit_code
    raise exc
sh.SignalException_SIGKILL: 

  RAN: /home/harry/BioPortal-to-KGX/robot relax --input /tmp/tmpxpybbnaf --output transformed/ontologies/CCO/CCO_6_relaxed.json --vvv

  STDOUT:
2022-10-25 19:26:23,373 DEBUG org.obolibrary.robot.IOHelper - Loading ontology /tmp/tmpxpybbnaf with catalog file null
2022-10-25 19:26:23,374 DEBUG org.semanticweb.owlapi.utilities.Injector - Loading file META-INF/services/org.semanticweb.owlapi.model.OWLOntologyManager
2022-10-25 19:26:23,374 DEBUG org.semanticweb.owlapi.utilities.Injector - Loading URL for service jar:file:/home/harry/BioPortal-to-KGX/robot.jar!/META-INF/services/org.semanticweb.owlapi.model.OWLOntologyManager
2022-10-25 19:26:23,374 DEBUG org.semanticweb.owlapi.utilities.Injector - Loading URL for service jar:file:/home/harry/BioPortal-to-KGX/robot.jar!/META-INF/services/org.semanticweb.owlapi.model.OWLOntologyManager
2022-10-25 19:26:23,374 DEBUG org.semanticweb.owlapi... (257202 more, please see e.stdout)

  STDERR:

An unrelated but still present issue: robot's verbosity flag is -vvv, not --vvv.
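
A minimal sketch of how the relax call could be assembled with the single-dash flag. The helper names `build_relax_args` and `run_relax` are hypothetical (not the project's actual code in robot_utils.py), and the standard library's `subprocess` is assumed here in place of the `sh` package that appears in the traceback.

```python
import subprocess
from typing import List


def build_relax_args(robot_path: str, input_path: str, output_path: str) -> List[str]:
    """Build the argument list for `robot relax` (hypothetical helper)."""
    return [
        robot_path,
        "relax",
        "--input", input_path,
        "--output", output_path,
        "-vvv",  # single dash, as noted above; "--vvv" is not the verbosity flag
    ]


def run_relax(robot_path: str, input_path: str, output_path: str) -> subprocess.CompletedProcess:
    """Run the command and capture its output without raising on failure."""
    args = build_relax_args(robot_path, input_path, output_path)
    return subprocess.run(args, capture_output=True, text=True, check=False)
```

Keeping the argument list in its own function makes the flag easy to check without actually invoking ROBOT.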

Running the robot command on its own produces the following:

[many debug lines later]
DEBUG Saving ontology as OboGraphs JSON Syntax with to IRI file:/home/harry/BioPortal-to-KGX/transformed/ontologies/CCO/CCO_6_relaxed.json
OBO GRAPH ERROR Could not convert ontology to OBO Graph (see https://github.com/geneontology/obographs)
For details see: http://robot.obolibrary.org/errors#obo-graph-error
java.io.IOException: errors#OBO GRAPH ERROR Could not convert ontology to OBO Graph (see https://github.com/geneontology/obographs)
        at org.obolibrary.robot.IOHelper.saveOntologyFile(IOHelper.java:1722)
        at org.obolibrary.robot.IOHelper.saveOntology(IOHelper.java:846)
        at org.obolibrary.robot.CommandLineHelper.maybeSaveOutput(CommandLineHelper.java:667)
        at org.obolibrary.robot.RelaxCommand.execute(RelaxCommand.java:113)
        at org.obolibrary.robot.CommandManager.executeCommand(CommandManager.java:244)
        at org.obolibrary.robot.CommandManager.execute(CommandManager.java:188)
        at org.obolibrary.robot.CommandManager.main(CommandManager.java:135)
        at org.obolibrary.robot.CommandLineInterface.main(CommandLineInterface.java:61)

This ontology hasn't been updated in 8 years, so it may be skippable.
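
If skipping is the route taken, one way to sketch it is to look for the error marker from the log above in ROBOT's captured output and leave out any ontology that reports it. The function names and the dictionary of captured outputs are assumptions for illustration, not part of BioPortal-to-KGX.

```python
from typing import Dict, List

# Marker text taken from the ROBOT log shown above.
OBO_GRAPH_ERROR_MARKER = "OBO GRAPH ERROR"


def failed_obograph_conversion(robot_output: str) -> bool:
    """Return True if ROBOT's output reports the OBO Graph conversion error."""
    return OBO_GRAPH_ERROR_MARKER in robot_output


def select_transformable(outputs_by_ontology: Dict[str, str]) -> List[str]:
    """Return the names of ontologies whose ROBOT output lacks the error marker."""
    return [
        name
        for name, output in outputs_by_ontology.items()
        if not failed_obograph_conversion(output)
    ]
```

This only detects the failure after the fact; it does not explain why the conversion fails.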

caufieldjh commented 2 years ago

A very similar error happens when relaxing BERO (Biological and Environmental Research Ontology).

caufieldjh commented 2 years ago

Same for PR, oddly enough, since it hasn't posed an issue in the past (besides #68).

caufieldjh commented 2 years ago

Also NCBITAXON, but that could be a memory issue?
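
The first traceback in this thread ends in sh.SignalException_SIGKILL, which means the process was killed by a signal and did not exit with an error of its own. That is consistent with, though not proof of, running out of memory. A rough sketch for telling the two cases apart, assuming a POSIX system and the standard library's `subprocess` return-code convention (the function name `describe_exit` is hypothetical):

```python
import signal


def describe_exit(returncode: int) -> str:
    """Describe a return code as reported by subprocess on POSIX."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        name = signal.Signals(-returncode).name
        if name == "SIGKILL":
            return "killed by SIGKILL (possibly out of memory)"
        return f"killed by {name}"
    return f"exited with error code {returncode}"
```

A positive code would point at an error raised by ROBOT itself, such as the OBO Graph conversion error, while a SIGKILL suggests looking at memory limits first.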

caufieldjh commented 1 year ago

There was a similar issue before (#15), but that was explicitly due to null values in comment strings. Something clearly isn't converting to OBO Graph cleanly. One possibility is that a more recent version of ROBOT handles the remove command differently, so the previously defined fix no longer works as expected. The following remove command still fails (same input ontology as in the first example above, for CCO):

robot remove --input /tmp/tmpxpybbnaf --select annotation-properties --exclude-term rdfs:label --exclude-term IAO:0000115 --output transformed/ontologies/CCO/CCO_6_fixed.json -vvv

It still yields the following:

DEBUG Saving ontology as OboGraphs JSON Syntax with to IRI file:/home/harry/BioPortal-to-KGX/transformed/ontologies/CCO/CCO_6_fixed.json
OBO GRAPH ERROR Could not convert ontology to OBO Graph (see https://github.com/geneontology/obographs)
For details see: http://robot.obolibrary.org/errors#obo-graph-error
java.io.IOException: errors#OBO GRAPH ERROR Could not convert ontology to OBO Graph (see https://github.com/geneontology/obographs)
        at org.obolibrary.robot.IOHelper.saveOntologyFile(IOHelper.java:1722)
        at org.obolibrary.robot.IOHelper.saveOntology(IOHelper.java:846)
        at org.obolibrary.robot.CommandLineHelper.maybeSaveOutput(CommandLineHelper.java:667)
        at org.obolibrary.robot.RemoveCommand.execute(RemoveCommand.java:202)
        at org.obolibrary.robot.CommandManager.executeCommand(CommandManager.java:244)
        at org.obolibrary.robot.CommandManager.execute(CommandManager.java:188)
        at org.obolibrary.robot.CommandManager.main(CommandManager.java:135)
        at org.obolibrary.robot.CommandLineInterface.main(CommandLineInterface.java:61)

caufieldjh commented 1 year ago

I now suspect that this is caused by malformed metadata. Performing the transform from scratch (i.e., removing the temp file and running the following) yields a successful CCO transform.

$ python run.py --input ~/Bioportal/4store-export-2022-07-20/data/ --kgx_validate --get_bioportal_metadata --ncbo_key [key] --write_curies --include_only fbb225ff0f97d7737e854fd2d48d
Looking for records in /home/harry/Bioportal/4store-export-2022-07-20/data/
Will only include the specified 1 file(s).
1 files found.
Setting up ROBOT...
ROBOT path: /home/harry/BioPortal-to-KGX/robot
ROBOT evironment variables: -Xmx12g -XX:+UseG1GC
Transforming all...
Starting on /home/harry/Bioportal/4store-export-2022-07-20/data/f0/ff/fbb225ff0f97d7737e854fd2d48d
BioPortal metadata not found for CCO_6 - will retrieve.
Accessing https://data.bioontology.org/ontologies/CCO/...
<Response [200]>
Accessing https://data.bioontology.org/ontologies/CCO/latest_submission...
<Response [200]>
Retrieved metadata for CCO (Cell Cycle Ontology)
ROBOT: relax CCO_6
Relaxing /tmp/tmp1pm8qzp3 to transformed/ontologies/CCO/CCO_6_relaxed.json...
Complete.
KGX transform CCO_6
...