RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

Build KG2.8.4 #312

Open ecwood opened 1 year ago

ecwood commented 1 year ago
1. Build and load KG2:

Example Cypher to get versions of many of the knowledge sources in a specific build of KG2pre:

match (n:`biolink:RetrievalSource`) where not n.id =~ 'umls_.*' and not n.id =~ 'OBO:.*' return n.id, n.name order by n.id;
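To sanity-check which ids the query keeps, the two regex filters can be mirrored outside the database. The following is a minimal sketch in Python, not part of the KG2 code: the function names are made up for illustration, and it relies on the fact that Cypher's `=~` has to match the whole string, for which `re.fullmatch` is the closest analogue.

```python
import re

# Patterns copied from the Cypher query above. Cypher's =~ must match the
# entire string, so re.fullmatch is used rather than re.search.
EXCLUDED_ID_PATTERNS = [re.compile(r"umls_.*"), re.compile(r"OBO:.*")]


def keep_retrieval_source(node_id: str) -> bool:
    """Return True if the id passes both `not n.id =~ ...` filters."""
    return not any(p.fullmatch(node_id) for p in EXCLUDED_ID_PATTERNS)


def filter_and_sort(node_ids):
    """Apply the filters, then sort by id as a stand-in for `order by n.id`."""
    return sorted(i for i in node_ids if keep_retrieval_source(i))
```

Because the match is anchored at both ends, only ids that start with `umls_` or `OBO:` are dropped; an id that merely contains one of those substrings is kept.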

We're not planning to start this build immediately; I just want an issue to start tagging things with.

ecwood commented 11 months ago

This error occurred during the build:

[Thu Jul 20 19:34:20 2023]
Error in rule SemMedDB:
    jobid: 30
    output: /home/ubuntu/kg2-build/semmeddb/kg2-semmeddb-tuplelist.json, /home/ubuntu/kg2-build/semmed-exclude-list.yaml
    log: /home/ubuntu/kg2-build/extract-semmeddb.log (check log file(s) for error message)
    shell:
        bash -x /home/ubuntu/kg2-code/extract-semmeddb.sh /home/ubuntu/kg2-build/semmeddb/kg2-semmeddb-tuplelist.json /home/ubuntu/kg2-build/semmed-exclude-list.yaml  > /home/ubuntu/kg2-build/extract-semmeddb.log 2>&1
        (exited with non-zero exit code)

It is covered in more detail here: https://github.com/RTXteam/RTX-KG2/issues/294#issuecomment-1644489313

ecwood commented 11 months ago

There was an error with SMPDB, which doesn't make sense since it was recently tested and worked the first time around:

[Thu Jul 20 21:02:58 2023]
Error in rule SMPDB:
    jobid: 38
    output: /home/ubuntu/kg2-build/smpdb/pathbank_pathways.csv
    log: /home/ubuntu/kg2-build/extract-smpdb.log (check log file(s) for error message)
    shell:
        bash -x /home/ubuntu/kg2-code/extract-smpdb.sh /home/ubuntu/kg2-build/smpdb > /home/ubuntu/kg2-build/extract-smpdb.log 2>&1
        (exited with non-zero exit code)
ubuntu@ip-172-31-62-73:~/kg2-build$ curl -L -f -k /home/ubuntu/kg2-build/smpdb/ https://pathbank.org/downloads/pathbank_all_pwml.zip
curl: (3) URL using bad/illegal format or missing URL
Warning: Binary output can mess up your terminal. Use "--output -" to tell 
Warning: curl to output it to your terminal anyway, or consider "--output 
Warning: <FILE>" to save to a file.

I am going to try restarting to see if that fixes it.
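The curl output above looks consistent with the directory being passed as a bare positional argument: curl treats every non-option argument as a URL, so the local path is rejected with error (3), and the real URL is then fetched with no output file, which triggers the binary-output warning. A minimal sketch of building the argument list so that the directory is attached through `--output` follows. It assumes the intent was to save the zip file inside the smpdb directory, which this thread does not confirm, and `build_curl_command` is a hypothetical helper, not something in extract-smpdb.sh.

```python
import os
from urllib.parse import urlparse


def build_curl_command(url: str, dest_dir: str) -> list:
    """Build a curl argument list that saves `url` into `dest_dir`.

    Hypothetical helper: assumes the download should be stored under its
    remote file name inside dest_dir.
    """
    file_name = os.path.basename(urlparse(url).path)
    if not file_name:
        raise ValueError(f"cannot derive a file name from {url!r}")
    # -L, -f and -k are the flags from the command in the log; --output names
    # the destination so the directory is not parsed as a second URL.
    return ["curl", "-L", "-f", "-k",
            "--output", os.path.join(dest_dir, file_name), url]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` so that a failed download raises instead of being silently ignored.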

ecwood commented 11 months ago

An error occurred in Ontologies and TTL, due to #303:

Traceback (most recent call last):
  File "/home/ubuntu/kg2-code/multi_ont_to_json_kg.py", line 1391, in <module>
    save_pickle)
  File "/home/ubuntu/kg2-code/multi_ont_to_json_kg.py", line 142, in make_kg2
    assert os.path.exists(ont_source_info_dict['file']), local_file_name
AssertionError: foodon.owl

The download setting in ont-load-inventory.yaml had to be set to true to trigger the if statement that uses the pickle file. This is a workaround; a better solution should be found in the future.
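One possible shape for a more explicit fix is to decide up front which input to use and to fail with a descriptive message when neither exists. The sketch below is only an illustration under assumptions: `resolve_ontology_input` and its two path arguments are hypothetical, and it does not reflect how multi_ont_to_json_kg.py is actually structured.

```python
import os


def resolve_ontology_input(owl_path: str, pickle_path: str):
    """Pick the ontology input to load, preferring the source file.

    Hypothetical helper: returns a (kind, path) tuple, falling back to a
    previously saved pickle when the ontology file is missing.
    """
    if os.path.exists(owl_path):
        return ("owl", owl_path)
    if os.path.exists(pickle_path):
        return ("pickle", pickle_path)
    # Neither input exists: report both paths instead of a bare assertion.
    raise FileNotFoundError(
        f"neither ontology file {owl_path!r} nor pickle {pickle_path!r} exists"
    )
```

Compared with the bare `assert` in the traceback, the error here names both candidate paths, which makes a missing file such as foodon.owl easier to diagnose.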

ecwood commented 11 months ago

Here are the major changes in the report:

There was a significant drop in edges from infores:fma-umls in this build. The count dropped from 368827 to 290475
There was a significant drop in edges from infores:go in this build. The count dropped from 202983 to 132359
There was a significant drop in edges from infores:hgnc in this build. The count dropped from 42515 to 23262
There are no edges from infores:loinc-umls in this build. There were 2690586 in the previous build.
There was a significant drop in edges from infores:ncbi-taxon in this build. The count dropped from 3971385 to 1393247
There are no edges from infores:vandf-umls in this build. There were 140078 in the previous build.
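A comparison of this kind can be sketched as a function over two per-source edge-count mappings. This is a minimal illustration, not the project's report script: the function name and the 20% threshold are assumptions, with the threshold chosen only so that every drop listed above would be flagged.

```python
def report_edge_count_changes(previous, current, drop_fraction=0.2):
    """Compare per-source edge counts between two builds.

    `previous` and `current` map a source id to its edge count.
    `drop_fraction` is an assumed threshold, not a value from the KG2 code.
    """
    messages = []
    for source in sorted(previous):
        old = previous[source]
        new = current.get(source, 0)
        if old > 0 and new == 0:
            messages.append(
                f"There are no edges from {source} in this build. "
                f"There were {old} in the previous build."
            )
        elif old > 0 and (old - new) / old >= drop_fraction:
            messages.append(
                f"There was a significant drop in edges from {source} in this "
                f"build. The count dropped from {old} to {new}"
            )
    return messages
```

With the counts above, infores:fma-umls has the smallest relative drop at roughly 21%, so any threshold much higher than 0.2 would leave it unreported.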