Closed caufieldjh closed 1 month ago
The exact point this happens is here: https://github.com/Knowledge-Graph-Hub/knowledge-graph-hub.github.io/blob/e0f680dc16b5ee89816de63b2f208e2b7301bf7a/utils/make_kg_manifest.py#L243-L254
I haven't been able to pin down exactly where kgx retrieves the schema, but there is a new version (1.5.6, vs. the 1.5.5 used here), so I'll try that first.
Bumping kgx to 1.5.6 does not appear to solve this.
Running kgx validate from the command line in a fresh venv on a smaller graph (I tried https://kg-hub.berkeleybop.io/kg-obo/obcs/2018-02-22/obcs_kgx_tsv.tar.gz) does appear to work as expected.
(Also, running kgx validate locally on a larger KG seems to stall for a moment, but completes.)
The specific biolink-model version to use is defined in biolink-model-toolkit:
https://github.com/biolink/biolink-model-toolkit/blob/master/bmt/toolkit.py
This could also be an issue with KG-COVID-19.
When I do this locally:
$ wget https://kg-hub.berkeleybop.io/kg-covid-19/20200925/kg-covid-19.tar.gz
...
$ kgx validate -i 'tsv' -c 'tar.gz' -o temp-test-kgcovid19 kg-covid-19.tar.gz
kgx seems to hang (i.e., it still hasn't started node validation after >10 min).
This happens with the most recent version of KG-COVID-19, too.
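To tell a slow run from a genuine hang without watching the terminal, the command can be wrapped in a time limit. This is only a minimal sketch: `run_with_timeout` is a hypothetical helper (not part of kgx), the command list simply mirrors the invocation above, and whatever limit is passed in is an arbitrary choice.

```python
import subprocess
import sys
from typing import Optional, Sequence

# Same invocation as the kgx validate command above, split into arguments.
KGX_VALIDATE_CMD = [
    "kgx", "validate", "-i", "tsv", "-c", "tar.gz",
    "-o", "temp-test-kgcovid19", "kg-covid-19.tar.gz",
]


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it ran longer than timeout_s."""
    try:
        completed = subprocess.run(list(cmd), timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising on timeout.
        print(f"gave up after {timeout_s} s: {' '.join(cmd)}", file=sys.stderr)
        return None
    return completed.returncode
```

Calling `run_with_timeout(KGX_VALIDATE_CMD, 30 * 60)` would then return `None` if validation has not finished within 30 minutes, which makes the hang easy to flag in a script or CI step.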
On the last successful KG-IDG build, it looks like kgx required ~16 min between schema loading and beginning node validation:
[2022-03-02T16:04:10.369Z] Loading schema https://w3id.org/linkml/types from https://raw.githubusercontent.com/biolink/biolink-model/2.2.13/biolink-model.yaml
[2022-03-02T16:20:31.627Z] Validating nodes in graph
This is also the case when run locally as
kgx validate -i 'tsv' -c 'tar.gz' -o temp-test-kgidg KG-IDG.tar.gz
KG-IDG is smaller in size than KG-COVID-19 (205.43M vs 787.17M compressed) but not enough that I'd expect the former to take 16 min and the latter to take days.
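As a rough sanity check on that expectation, the gap can be computed from the two Jenkins timestamps and scaled by the size ratio. This is a sketch under a loud assumption: that the time before node validation grows linearly with compressed size, which may well not hold. The function names are illustrative only.

```python
from datetime import datetime

# Timestamps copied from the Jenkins log lines quoted above.
SCHEMA_LOADING = "2022-03-02T16:04:10.369Z"
NODE_VALIDATION = "2022-03-02T16:20:31.627Z"

# Compressed sizes as quoted above (same unit for both).
KG_IDG_SIZE = 205.43
KG_COVID_19_SIZE = 787.17


def parse_ts(ts: str) -> datetime:
    """Parse a Jenkins-style timestamp such as 2022-03-02T16:04:10.369Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def elapsed_minutes(start: str, end: str) -> float:
    """Minutes elapsed between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds() / 60


def linear_estimate(minutes: float, size: float, other_size: float) -> float:
    """Scale a duration by a size ratio (ASSUMPTION: linear scaling)."""
    return minutes * other_size / size


kg_idg_minutes = elapsed_minutes(SCHEMA_LOADING, NODE_VALIDATION)
kg_covid_19_minutes = linear_estimate(kg_idg_minutes, KG_IDG_SIZE, KG_COVID_19_SIZE)
print(f"KG-IDG gap: {kg_idg_minutes:.1f} min")
print(f"KG-COVID-19 linear estimate: {kg_covid_19_minutes:.0f} min")
```

Under that assumption the KG-COVID-19 gap would be on the order of an hour, so a run that has not reached node validation after days points to something other than graph size alone.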
All runs, Jenkins or otherwise, hang after this point:
Maybe a biolink-model update would help?
Originally posted by @caufieldjh in https://github.com/Knowledge-Graph-Hub/knowledge-graph-hub.github.io/issues/16#issuecomment-1066891226