AlanSimmons closed this issue 4 months ago
I'm not sure what your environment is, but it's pretty easy to get Node in a Python environment by using the Python package nodeenv. Once Node 18+ is installed, you can run the exporter as a command-line program like this:
npx github:x-atlas-consortia/hra-ubkg-exporter --version v2.3.0 <output dir>
The program takes about 5 minutes to run (at some point I'll optimize it further).
After further review, we think that it makes more sense to download extracted edge/node files directly from GitHub, as we can be sure that these files are supported.
Yep, makes sense! It only changes once every six months, so then we'll compile and notify you. What is the best way to notify you? PR, GitHub Issue, or you just check when you rebuild anyway (if you do that frequently)?
I think that the best way is to notify me directly after you update the repo. That way, I only ingest the version that you're happy with. I will work with a downloaded copy of the files between updates.
Will do!
FYI @bherr2
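For reference, the approach settled on in the comments above (working from a downloaded copy of the edge/node files between updates) could be sketched roughly as follows. This is only an illustration: `fetch_files` is a hypothetical helper, and the file names and URLs are assumptions that the caller must supply, since the exact paths in the repo's HRA folder are not given in this thread.

```python
from __future__ import annotations

import shutil
import urllib.request
from pathlib import Path


def fetch_files(urls: dict[str, str], dest_dir: str) -> list[Path]:
    """Download each URL into dest_dir under the given local file name.

    `urls` maps a local file name to a source URL. Both are supplied by
    the caller; nothing here knows the real layout of the exporter repo.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, url in urls.items():
        target = dest / name
        # Write to a temporary ".part" file first so an interrupted
        # download never overwrites the last good local copy.
        tmp = target.with_suffix(target.suffix + ".part")
        with urllib.request.urlopen(url) as resp, open(tmp, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        tmp.replace(target)
        saved.append(target)
    return saved
```

The ETL would call this only after an update notification and otherwise keep reading the files already in `dest_dir`.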
Statement of issue
A subset of the HRA ontology is exported into UBKG edges/nodes format via code in the hra-ubkg-exporter repo. The current version of the edges and nodes files can be found in the repo in the HRA folder.
Although the UBKG ETL for HRA currently imports edges and nodes files that are downloaded from the repo, there is a risk that these files may not be current. It may be better to execute the hra-ubkg-exporter scripts directly to generate the latest UBKG files.
Proposed solution
Install Node on the development machine. Modify the ETL so that it calls the hra-ubkg-exporter script to generate edges and nodes files.
Potential challenge
The ETL is a Python script, so it would have to execute the Node script somehow. If this is not feasible, I can run the script manually and store the output files locally.
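One way a Python ETL could execute the Node script is through the standard `subprocess` module. The sketch below assumes that `npx` is on the PATH and reuses the command quoted in the comments on this issue. The helper names `build_exporter_command` and `run_exporter`, and the timeout value, are hypothetical and not part of the exporter or the ETL.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path

# Package spec and --version flag copied from the command quoted in this thread.
EXPORTER_SPEC = "github:x-atlas-consortia/hra-ubkg-exporter"


def build_exporter_command(version: str, output_dir: str, npx: str = "npx") -> list[str]:
    """Return the argument list for the exporter command (hypothetical helper)."""
    return [npx, EXPORTER_SPEC, "--version", version, str(output_dir)]


def run_exporter(version: str, output_dir: str, timeout_s: int = 1800) -> Path:
    """Run the Node exporter from Python and return the output directory."""
    # Resolve the full path to npx so the call also works where the
    # executable is a wrapper script (for example npx.cmd on Windows).
    npx = shutil.which("npx")
    if npx is None:
        raise RuntimeError("npx not found on PATH; install Node 18+ first")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the exporter exits non-zero.
    subprocess.run(
        build_exporter_command(version, str(out), npx=npx),
        check=True,
        timeout=timeout_s,
    )
    return out
```

For example, `run_exporter("v2.3.0", "hra_out")` would run the same command as the one shown in the comments, with `hra_out` as the output directory.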