Knowledge-Graph-Hub / kg-idg

A Knowledge Graph to Illuminate the Druggable Genome
https://knowledge-graph-hub.github.io/kg-idg/
BSD 3-Clause "New" or "Revised" License

Orphanet transform fails due to missing `ExternalReference` #131

Open caufieldjh opened 10 months ago

Describe the bug

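On the most recent KG build (2024-01-01), the transform stage fails in the Orphanet transform: `orphanet_gene.py` raises `KeyError: 'ExternalReference'` at line 58 when reading `gene["Gene"]["ExternalReferenceList"]["ExternalReference"]`.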

Stack trace:

[2024-01-01T18:36:11.453Z] [2024-01-01 10:36:11][INFO   ][root    ] Parsing OrphanetTransform
[2024-01-01T18:36:37.854Z] Parsing data/raw/orphanet_gene.xml to JSON...
[2024-01-01T18:36:37.854Z] Transforming using source in kg_idg/transform_utils/orphanet/orphanet_gene.yaml
[2024-01-01T18:36:37.854Z] [2024-01-01 10:36:35][INFO   ][koza.app] Transforming source: orphanet_gene
[2024-01-01T18:36:41.976Z] Traceback (most recent call last):
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/run.py", line 167, in <module>
[2024-01-01T18:36:41.976Z]     cli()
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
[2024-01-01T18:36:41.976Z]     return self.main(*args, **kwargs)
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/click/core.py", line 1078, in main
[2024-01-01T18:36:41.976Z]     rv = self.invoke(ctx)
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
[2024-01-01T18:36:41.976Z]     return _process_result(sub_ctx.command.invoke(sub_ctx))
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
[2024-01-01T18:36:41.976Z]     return ctx.invoke(self.callback, **ctx.params)
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/click/core.py", line 783, in invoke
[2024-01-01T18:36:41.976Z]     return __callback(*args, **kwargs)
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/run.py", line 66, in transform
[2024-01-01T18:36:41.976Z]     kg_transform(*args, **kwargs)
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/kg_idg/transform.py", line 60, in transform
[2024-01-01T18:36:41.976Z]     t.run()
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/kg_idg/transform_utils/orphanet/orphanet.py", line 49, in run
[2024-01-01T18:36:41.976Z]     self.parse(name, entry, name)
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/kg_idg/transform_utils/orphanet/orphanet.py", line 68, in parse
[2024-01-01T18:36:41.976Z]     transform_source(
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/koza/cli_runner.py", line 84, in transform_source
[2024-01-01T18:36:41.976Z]     source_koza.process_sources()
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/venv/lib/python3.9/site-packages/koza/app.py", line 105, in process_sources
[2024-01-01T18:36:41.976Z]     transform_module = importlib.import_module(transform_code)
[2024-01-01T18:36:41.976Z]   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
[2024-01-01T18:36:41.976Z]     return _bootstrap._gcd_import(name[level:], package, level)
[2024-01-01T18:36:41.976Z]   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
[2024-01-01T18:36:41.976Z]   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
[2024-01-01T18:36:41.976Z]   File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
[2024-01-01T18:36:41.976Z]   File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
[2024-01-01T18:36:41.976Z]   File "<frozen importlib._bootstrap_external>", line 855, in exec_module
[2024-01-01T18:36:41.976Z]   File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
[2024-01-01T18:36:41.976Z]   File "/var/lib/jenkins/workspace/nowledge-graph-hub_kg-idg_master/gitrepo/kg_idg/transform_utils/orphanet/orphanet_gene.py", line 58, in <module>
[2024-01-01T18:36:41.976Z]     all_ex_refs = gene["Gene"]["ExternalReferenceList"]["ExternalReference"]
[2024-01-01T18:36:41.976Z] KeyError: 'ExternalReference'

For reference, here is the log output for the same transform from the last working build (2023-12-01):

[2023-12-01T18:36:10.185Z] [2023-12-01 10:36:09][INFO   ][root    ] Parsing OrphanetTransform
[2023-12-01T18:36:36.587Z] Parsing data/raw/orphanet_gene.xml to JSON...
[2023-12-01T18:36:36.587Z] Transforming using source in kg_idg/transform_utils/orphanet/orphanet_gene.yaml
[2023-12-01T18:36:36.587Z] [2023-12-01 10:36:33][INFO   ][koza.app] Transforming source: orphanet_gene
[2023-12-01T18:36:39.815Z] [2023-12-01 10:36:39][INFO   ][koza.io.reader.json_reader] Finished processing 3945 rows for orphanet_gene from file data/raw/orphanet_gene.json
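
A possible workaround, sketched under assumptions: the `KeyError` suggests that at least one gene entry in the new Orphanet data has an `ExternalReferenceList` with no `ExternalReference` child. The helper below is hypothetical (the name `get_external_references` is not part of kg-idg or Koza) and assumes the XML-to-JSON step may produce a missing key, a `None` value, a single dict, or a list for that element. It normalizes all of those cases to a list so the transform can skip entries without references instead of crashing.

```python
from typing import Any, Dict, List


def get_external_references(gene_entry: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a gene entry's external references as a list ([] if none).

    Hypothetical helper: assumes the entry has the shape seen in the
    stack trace, i.e. entry["Gene"]["ExternalReferenceList"]["ExternalReference"].
    """
    # Treat a missing or null "Gene" / "ExternalReferenceList" as empty.
    ref_list = (gene_entry.get("Gene") or {}).get("ExternalReferenceList") or {}
    if not isinstance(ref_list, dict):
        return []

    refs = ref_list.get("ExternalReference")
    if refs is None:
        # This is the case that currently raises KeyError.
        return []
    if isinstance(refs, dict):
        # Assumption: a single child element may arrive as a dict, not a list.
        return [refs]
    return list(refs)
```

In `orphanet_gene.py`, the direct lookup on line 58 could then become `all_ex_refs = get_external_references(gene)`, assuming `gene` is the row object shown in the stack trace. Whether genes without external references should be skipped or still emitted as nodes is a separate decision for the transform.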