RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

KG2.9.0c rollout #2259

Closed. sundareswarpullela closed this issue 3 weeks ago

sundareswarpullela commented 3 months ago

THE BRANCH FOR THIS ROLLOUT IS: kg2.9.0c

THE ARAX-DATABASES.RTX.AI DIRECTORY FOR THIS ROLLOUT IS: /home/rtxconfig/KG2.9.0

Instance used to create the build: buildkg2c.rtx.ai

Prerequisites

ssh access

To complete this workflow, you will need ssh access to:

GitHub access
AWS access

You will need:

Slack workspaces

You will also need access to the following Slack workspaces:

Example ssh config for logging in to arax.ncats.io:

Host arax.ncats.io
    User stephenr
    ProxyCommand ssh -i ~/.ssh/id_rsa_long -W %h:%p stephenr@35.87.194.254
    IdentityFile ~/.ssh/id_rsa_long
    Hostname 172.31.53.16

1. Build and load KG2c:

2. Rebuild downstream databases:

The following databases should be rebuilt and copies of them should be put in /home/rtxconfig/KG2.X.Y on arax-databases.rtx.ai. Please use this kind of naming format: mydatabase_v1.0_KG2.X.Y.sqlite.
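As a minimal sketch of the naming convention above, the helper below assembles a destination path from a database name, a database version, and a KG2 version. The function name `versioned_db_path` and its parameters are hypothetical, not part of the ARAX codebase; only the directory and file name pattern come from the text.

```python
from pathlib import PurePosixPath


def versioned_db_path(db_name: str, db_version: str, kg2_version: str,
                      extension: str = "sqlite") -> PurePosixPath:
    """Build a path of the form /home/rtxconfig/KG2.X.Y/mydatabase_v1.0_KG2.X.Y.sqlite.

    Hypothetical helper: kg2_version is given as e.g. "2.9.0" (without the "KG" prefix).
    """
    kg2_label = f"KG{kg2_version}"
    file_name = f"{db_name}_v{db_version}_{kg2_label}.{extension}"
    return PurePosixPath("/home/rtxconfig") / kg2_label / file_name


print(versioned_db_path("mydatabase", "1.0", "2.9.0"))
# prints /home/rtxconfig/KG2.9.0/mydatabase_v1.0_KG2.9.0.sqlite
```

Using `PurePosixPath` keeps the forward slashes of the remote host regardless of the operating system the script runs on.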

NOTE: As databases are rebuilt, RTX/code/config_dbs.json will need to be updated to point to their new paths! Push these changes to the branch for this KG2 version, unless the rollout of this KG2 version has already occurred, in which case you should push to master (but first follow the steps described here).
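One possible way to draft the path updates, sketched under assumptions: the function below walks any JSON-like structure and swaps one KG2 version label for another in every string value. The layout of `example_config` is invented for illustration, and the real structure of config_dbs.json may differ; the function name `retarget_kg2_version` is also hypothetical.

```python
import json


def retarget_kg2_version(value, old_version: str, new_version: str):
    """Return a copy of a JSON-like value with old_version replaced in every string value."""
    if isinstance(value, str):
        return value.replace(old_version, new_version)
    if isinstance(value, list):
        return [retarget_kg2_version(item, old_version, new_version) for item in value]
    if isinstance(value, dict):
        return {key: retarget_kg2_version(item, old_version, new_version)
                for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


# Hypothetical config content; the real config_dbs.json layout may differ.
example_config = {
    "database_downloads": {
        "mydatabase": "/home/rtxconfig/KG2.8.4/mydatabase_v1.0_KG2.8.4.sqlite"
    }
}
updated = retarget_kg2_version(example_config, "KG2.8.4", "KG2.9.0")
print(json.dumps(updated, indent=2))
```

A blanket replacement like this is only a starting point: a database whose own version number changed, or one that was not rebuilt, still needs its entry checked by hand before the change is pushed.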

3. Update the ARAX codebase:

All code changes should go in the branch for this KG2 version!

4. Pre-upload databases:

Before rolling out, we need to pre-upload the new databases (referenced in config_dbs.json) to arax.ncats.io and the ITRB SFTP server. These steps can be done well in advance of the rollout; it doesn't hurt anything to do them early.

5. Roll out the new KG2c version to the arax.ncats.io development endpoints:

6. Final items/clean up:

7. Roll out to ITRB TEST:

8. Roll out to ITRB PRODUCTION:

sundareswarpullela commented 3 months ago

@acevedol, it looks like the build node is missing from the nodes.tsv file. Can you please look into it?

sundareswarpullela commented 2 months ago

Fixed a bug in the code that failed to recognize the build node because of a string formatting issue.

sundareswarpullela commented 2 months ago

KG2.9.0c build underway

saramsey commented 2 months ago

The following pytest tests are failing:

test_ARAX_filter_results.py::test_n_results
test_ARAX_filter_results.py::test_warning
test_ARAX_filter_results.py::test_sort_by_node_attribute
test_ARAX_filter_results.py::test_sort_by_score

Relatedly, the following Cypher query returns 12 edges in KG2.8.4c, but 0 edges in KG2.9.0c:

MATCH (n {id:'MONDO:0002967'})-[r]-(m {category: 'biolink:ChemicalEntity'}) return r.primary_knowledge_source, r.kg2_ids, m.id;

(Note: MONDO:0002967 is "tinea capitis".)
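To repeat this comparison across builds, the same query could be run from Python with the official `neo4j` driver. This is a sketch, not how the counts above were obtained: the function name `count_edges` and the connection arguments are hypothetical, and the node ID and category are passed as query parameters rather than inlined.

```python
EDGE_QUERY = (
    "MATCH (n {id: $node_id})-[r]-(m {category: $category}) "
    "RETURN r.primary_knowledge_source, r.kg2_ids, m.id"
)


def count_edges(bolt_uri: str, user: str, password: str,
                node_id: str = "MONDO:0002967",
                category: str = "biolink:ChemicalEntity") -> int:
    """Run EDGE_QUERY against one Neo4j instance and return the number of rows."""
    # Imported here so this module loads even where the driver is not installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(bolt_uri, auth=(user, password))
    try:
        with driver.session() as session:
            result = session.run(EDGE_QUERY, node_id=node_id, category=category)
            return sum(1 for _ in result)
    finally:
        driver.close()
```

Calling `count_edges` once per KG2c instance (with whatever Bolt URI and credentials each one uses) gives two numbers that can be compared directly.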

sundareswarpullela commented 2 months ago

The above tests were failing because of a change in sri-node-normalizer: the nodes for tinea capitis were no longer being synonymized correctly, which led to duplicate nodes. The tests have been updated to reflect this change and are now passing.

saramsey commented 3 weeks ago

@sundareswarpullela can this issue be closed out?

sundareswarpullela commented 3 weeks ago

Yes, closing this issue.