Knowledge-Graph-Hub / kg-idg

A Knowledge Graph to Illuminate the Druggable Genome
https://knowledge-graph-hub.github.io/kg-idg/
BSD 3-Clause "New" or "Revised" License

Ingest error breaks build #102

Closed by caufieldjh 1 year ago

caufieldjh commented 2 years ago

Describe the bug

An error occurs during the transform phase of the build:

13:32:36  + python3.8 run.py transform
13:36:57  WARNING:koza.model.config.source_config:Could not load dataset description from metadata file
13:37:05  ERROR:root:Encountered error: could not convert string to float: ''
13:37:05  Parsing data/raw/mondo_kgx_tsv.tar.gz
13:37:05  Parsing data/raw/chebi_kgx_tsv.tar.gz
13:37:05  Parsing data/raw/hp_kgx_tsv.tar.gz
13:37:05  Parsing data/raw/go_kgx_tsv.tar.gz
13:37:05  Parsing data/raw/ogms_kgx_tsv.tar.gz
13:37:05  Parsing data/raw/drug.target.interaction.tsv.gz
13:37:05  Transforming to data/transformed/drug_central using source in kg_idg/transform_utils/drug_central/drugcentral-dti.yaml
13:37:05  koza_apps entry created for: drugcentral-dti
13:37:05  koza_app: <koza.app.KozaApp object at 0x7f5d91d55a90>

This appears to be an error in the drugcentral-dti transform. I'm not certain why it's happening now, but it may be KGX-related, as the most recent change unpinned the KGX version.

To Reproduce

See Jenkins build 66. Happened in build 65 too.

Version

67eeb300be7f4471c226ed0fc8ff85b638dbe895

caufieldjh commented 1 year ago

Confirmed that this ingest is problematic:

~/kg-idg$ koza transform --source kg_idg/transform_utils/drug_central/drugcentral-dti.yaml 
WARNING:koza.model.config.source_config:Could not load dataset description from metadata file
INFO:koza.cli_runner:No global table used for transform
koza_apps entry created for: drugcentral-dti
koza_app: <koza.app.KozaApp object at 0x7fa8a2d01490>
INFO:koza.io.reader.csv_reader:headers for drugcentral-dti parsed as ['DRUG_NAME', 'STRUCT_ID', 'TARGET_NAME', 'TARGET_CLASS', 'ACCESSION', 'GENE', 'SWISSPROT', 'ACT_VALUE', 'ACT_UNIT', 'ACT_TYPE', 'ACT_COMMENT', 'ACT_SOURCE', 'RELATION', 'MOA', 'MOA_SOURCE', 'ACT_SOURCE_URL', 'MOA_SOURCE_URL', 'ACTION_TYPE', 'TDL', 'ORGANISM']
Traceback (most recent call last):
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/bin/koza", line 8, in <module>
    sys.exit(main.typer_app())
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/typer/main.py", line 532, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/koza/main.py", line 59, in transform
    transform_source(
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/koza/cli_runner.py", line 77, in transform_source
    source_koza.process_sources()
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/koza/app.py", line 107, in process_sources
    importlib.reload(transform_module)
  File "/usr/lib/python3.9/importlib/__init__.py", line 169, in reload
    _bootstrap._exec(spec, module)
  File "<frozen importlib._bootstrap>", line 613, in _exec
  File "<frozen importlib._bootstrap_external>", line 855, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/harry/kg-idg/kg_idg/transform_utils/drug_central/drugcentral-dti.py", line 89, in <module>
    quantity = QuantityValue(has_numeric_value=row["ACT_VALUE"], has_unit=row["ACT_TYPE"])
  File "<string>", line 5, in __init__
  File "/home/harry/.cache/pypoetry/virtualenvs/kg-idg-u8BbhwpW-py3.9/lib/python3.9/site-packages/biolink/model.py", line 1353, in __post_init__
    self.has_numeric_value = float(self.has_numeric_value)
ValueError: could not convert string to float: ''

caufieldjh commented 1 year ago

I think some of the ACT_VALUE values are simply empty, possibly even in rows where ACT_TYPE has a value. Wrapping the QuantityValue construction in a try/except is worth a try.
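
A minimal sketch of that idea, not the actual fix applied in the repository. It assumes the only failure mode is the float() coercion shown in the traceback. The QuantityValue class here is a stand-in that mimics just that coercion so the example runs without biolink installed, make_quantity is a hypothetical helper name, and the sample rows are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantityValue:
    """Stand-in for biolink.model.QuantityValue (assumption: only the
    float coercion seen in the traceback is reproduced here)."""
    has_numeric_value: Optional[float] = None
    has_unit: Optional[str] = None

    def __post_init__(self):
        if self.has_numeric_value is not None:
            # Raises ValueError for '' just like the real traceback.
            self.has_numeric_value = float(self.has_numeric_value)


def make_quantity(row: dict) -> Optional[QuantityValue]:
    """Hypothetical helper: build a QuantityValue, or return None when
    ACT_VALUE cannot be converted to a float (for example, when empty)."""
    try:
        return QuantityValue(
            has_numeric_value=row["ACT_VALUE"],
            has_unit=row["ACT_TYPE"],
        )
    except ValueError:
        return None


# Illustrative rows only; these values are not taken from DrugCentral.
rows = [
    {"ACT_VALUE": "7.52", "ACT_TYPE": "Ki"},
    {"ACT_VALUE": "", "ACT_TYPE": "IC50"},
]
quantities = [make_quantity(r) for r in rows]
print(quantities)
```

Returning None lets the transform decide whether to skip the quantity or emit the association without it. Whether rows lacking ACT_VALUE should be kept at all is a separate question this sketch does not answer.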