monarch-initiative / monarch-ingest

Data ingest application for Monarch Initiative knowledge graph using Koza
https://monarchinitiative.org

Jenkins build failing, might be space #421

Closed by kevinschaper 1 year ago

kevinschaper commented 1 year ago

It's not clear what failed from this output:

[Pipeline] stage
[Pipeline] { (denormalize)
[Pipeline] sh
+ poetry run ingest closure
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /tmp/workspace/monarch-ingest-all/monarch_ingest/main.py:88 in closure       │
│                                                                              │
│    85                                                                        │
│    86 @typer_app.command()                                                   │
│    87 def closure():                                                         │
│ ❱  88 │   apply_closure()                                                    │
│    89                                                                        │
│    90 @typer_app.command()                                                   │
│    91 def sqlite():                                                          │
│                                                                              │
│ /tmp/workspace/monarch-ingest-all/monarch_ingest/cli_utils.py:253 in         │
│ apply_closure                                                                │
│                                                                              │
│   250 │   │   output_dir: str = OUTPUT_DIR                                   │
│   251 ):                                                                     │
│   252 │   output_file = f"{output_dir}/{name}-denormalized-edges.tsv"        │
│ ❱ 253 │   add_closure(kg_archive=f"{output_dir}/{name}.tar.gz",              │
│   254 │   │   │   │   closure_file=closure_file,                             │
│   255 │   │   │   │   output_file=output_file)                               │
│   256 │   sh.gzip(output_file)                                               │
│                                                                              │
│ ╭────────────────────────── locals ──────────────────────────╮               │
│ │ closure_file = 'data/monarch/phenio-relation-filtered.tsv' │               │
│ │         name = 'monarch-kg'                                │               │
│ │   output_dir = 'output'                                    │               │
│ │  output_file = 'output/monarch-kg-denormalized-edges.tsv'  │               │
│ ╰────────────────────────────────────────────────────────────╯               │
│                                                                              │
│ /tmp/workspace/monarch-ingest-all/.cache/pypoetry/virtualenvs/monarch-ingest │
│ -_G3LtJ6g-py3.10/lib/python3.10/site-packages/closurizer/closurizer.py:83 in │
│ add_closure                                                                  │
│                                                                              │
│   80 │   │   │   │   │   │   │    rkey="id")                                 │
│   81 │                                                                       │
│   82 │   print("Denormalizing...")                                           │
│ ❱ 83 │   etl.totsv(edges, f"{output_file}")                                  │
│   84 │                                                                       │
│   85 │   # Clean up extracted node & edge files                              │
│   86 │   if os.path.exists(f"{node_file}"):                                  │
│                                                                              │
│ ╭───────────────────────────────── locals ─────────────────────────
[Pipeline] }
[Pipeline] // stage

Given that this is the first build with the full phenio file, I think the failure could be caused by running out of disk space while writing the denormalized edges file.

We should try doubling the large instance disk size to 200G and see whether the build gets past this step. The instance templates are defined here: https://github.com/monarch-initiative/jenkins-packer-schemas/blob/main/create_instance_templates.py
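
Since the log is cut off before the actual exception, a pre-flight check could at least confirm or rule out the disk space theory on the next run. Below is a minimal sketch using only the standard library. The helper names (free_gib, require_free_space) and any threshold passed to them are hypothetical and are not part of monarch-ingest or closurizer; the idea would be to call something like this just before the denormalize step.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def free_gib(path: str = ".") -> float:
    """Return free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def require_free_space(path: str, min_free_gib: float) -> None:
    """Raise a descriptive error if less than `min_free_gib` is available.

    The threshold is an assumption the caller has to pick; this sketch
    does not estimate how large the denormalized output will be.
    """
    available = free_gib(path)
    if available < min_free_gib:
        raise RuntimeError(
            f"Only {available:.1f} GiB free at {path!r}; "
            f"need at least {min_free_gib:.1f} GiB"
        )


if __name__ == "__main__":
    # Report free space for the current working directory.
    print(f"Free space: {free_gib('.'):.1f} GiB")
```

Failing early with a message like this would make a space problem obvious in the Jenkins console instead of leaving a truncated traceback.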

glass-ships commented 1 year ago

We should open a new ticket to figure out how we'll back up our Jenkins jobs and update Jenkins.

What would be a good repository for that? Do we have a generic "Monarch Operations" or tech team repo or something?