RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

Issue in `Simplify` During `KG2.8.4pre` Build #328

Open ecwood opened 11 months ago

ecwood commented 11 months ago

While running #312, there was an error in Simplify:

[Sun Jul 23 07:24:42 2023]
Error in rule Simplify:
    jobid: 3
    output: /home/ubuntu/kg2-build/kg2-simplified.json
    log: /home/ubuntu/kg2-build/filter_kg_and_remap_predicates.log (check log file(s) for error message)
    shell:
        bash -x /home/ubuntu/kg2-code/run-simplify.sh /home/ubuntu/kg2-build/kg2.json /home/ubuntu/kg2-build/kg2-simplified.json /home/ubuntu/kg2-build/kg2-version.txt  > /home/ubuntu/kg2-build/filter_kg_and_remap_predicates.log 2>&1
        (exited with non-zero exit code)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/ubuntu/.snakemake/log/2023-07-22T174757.268369.snakemake.log
+ /home/ubuntu/kg2-venv/bin/python3 /home/ubuntu/kg2-code/update_version.py --increment_major /home/ubuntu/kg2-build/kg2-version.txt
KG2 version: 2.9.0
+ /home/ubuntu/kg2-venv/bin/python3 -u /home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py --dropNegated --dropSelfEdgesExcept interacts_with,regulates,inhibits,increase /home/ubuntu/kg2-code/predicate-remap.yaml /home/ubuntu/kg2-code/kg2-provided-by-curie-to-infores-curie.yaml /home/ubuntu/kg2-code/curies-to-urls-map.yaml /home/ubuntu/kg2-build/kg2.json /home/ubuntu/kg2-build/kg2-simplified.json /home/ubuntu/kg2-build/kg2-version.txt
/home/ubuntu/kg2-venv/lib/python3.7/site-packages/rdflib_jsonld/__init__.py:12: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.0.  Please remove rdflib-jsonld from your project's dependencies.
  DeprecationWarning,
Traceback (most recent call last):
  File "/home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py", line 319, in <module>
    nodes_dict, knowledge_source_curies_not_in_config_nodes = process_nodes(input_file_name, infores_remap_config)
  File "/home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py", line 137, in process_nodes
    infores_curie_dict = infores_remap_config.get(knowledge_source, None)
TypeError: unhashable type: 'list'

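There is also a problem with the version number (#216): the log above shows `update_version.py` being run with `--increment_major`, which bumped the version to 2.9.0.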

ecwood commented 11 months ago

New error:

+ /home/ubuntu/kg2-venv/bin/python3 -u /home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py --dropNegated --dropSelfEdgesExcept interacts_with,regulates,inhibits,increase /home/ubuntu/kg2-code/predicate-remap.yaml /home/ubuntu/kg2-code/kg2-provided-by-curie-to-infores-curie.yaml /home/ubuntu/kg2-code/curies-to-urls-map.yaml /home/ubuntu/kg2-build/kg2.json /home/ubuntu/kg2-build/kg2-simplified.json /home/ubuntu/kg2-build/kg2-version.txt
/home/ubuntu/kg2-venv/lib/python3.7/site-packages/rdflib_jsonld/__init__.py:12: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.0.  Please remove rdflib-jsonld from your project's dependencies.
  DeprecationWarning,
Traceback (most recent call last):
  File "/home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py", line 322, in <module>
    nodes_dict, knowledge_source_curies_not_in_config_nodes = process_nodes(input_file_name, infores_remap_config)
  File "/home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py", line 140, in process_nodes
    infores_curie = infores_curie_dict['infores_curie']
TypeError: 'NoneType' object is not subscriptable

ecwood commented 11 months ago

New error due to #280:

+ /home/ubuntu/kg2-venv/bin/python3 /home/ubuntu/kg2-code/update_version.py --increment_minor /home/ubuntu/kg2-build/kg2-version.txt
KG2 version: 2.8.4
+ /home/ubuntu/kg2-venv/bin/python3 -u /home/ubuntu/kg2-code/filter_kg_and_remap_predicates.py --dropNegated --dropSelfEdgesExcept interacts_with,regulates,inhibits,increase /home/ubuntu/kg2-code/predicate-remap.yaml /home/ubuntu/kg2-code/kg2-provided-by-curie-to-infores-curie.yaml /home/ubuntu/kg2-code/curies-to-urls-map.yaml /home/ubuntu/kg2-build/kg2.json /home/ubuntu/kg2-build/kg2-simplified.json /home/ubuntu/kg2-build/kg2-version.txt
/home/ubuntu/kg2-venv/lib/python3.7/site-packages/rdflib_jsonld/__init__.py:12: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.0.  Please remove rdflib-jsonld from your project's dependencies.
  DeprecationWarning,
{'umls_source:CHV'}

ecwood commented 11 months ago

Essentially, the main problem is that some nodes' `provided_by` field is a string rather than a list. This fix is not comprehensive, so we should investigate further at a later date.
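
For reference, here is a minimal sketch of one way to guard against both tracebacks above. It is not the actual code in `filter_kg_and_remap_predicates.py`. The helper names `as_list` and `map_provided_by` are hypothetical, and the sample config entry is made up. The only things taken from the tracebacks are that `infores_remap_config.get(...)` is looked up per knowledge source and that each entry has an `infores_curie` key.

```python
from typing import Any, Dict, List, Set


def as_list(value: Any) -> List[str]:
    """Normalize a provided_by value (None, a string, or a list) to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def map_provided_by(node: Dict[str, Any],
                    infores_remap_config: Dict[str, Dict[str, str]],
                    missing: Set[str]) -> List[str]:
    """Map each knowledge source on a node to its infores CURIE.

    Sources absent from the config are recorded in `missing` instead of
    raising, so they can all be reported at the end of the run.
    """
    infores_curies = []
    for source in as_list(node.get('provided_by')):
        entry = infores_remap_config.get(source)
        if entry is None:
            missing.add(source)
            continue
        infores_curies.append(entry['infores_curie'])
    return infores_curies


# Made-up config entry, for illustration only.
config = {'example:A': {'infores_curie': 'infores:example-a'}}
missing: Set[str] = set()
print(map_provided_by({'provided_by': 'example:A'}, config, missing))
print(map_provided_by({'provided_by': ['example:A', 'umls_source:CHV']}, config, missing))
print(missing)
```

Normalizing at the point of lookup keeps the change small, but it only hides the inconsistency. Finding where string-valued `provided_by` fields are produced upstream is still needed.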