monarch-initiative / monarch-ingest

Data ingest application for Monarch Initiative knowledge graph using Koza
https://monarchinitiative.org

Create an ingests.yaml, single python command to run all ingests #205

Closed kevinschaper closed 2 years ago

kevinschaper commented 2 years ago

We need a single source of truth that lists all of the ingests. Rather than enumerating every ingest in the Jenkinsfile, we should wrap the process so that a single Python call executes them all.
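A minimal sketch of the idea, not the actual implementation: one mapping acts as the list of ingests, and one function call runs every enabled entry. The ingest names, the `enabled` key, and the `run_ingest` placeholder are all hypothetical. In practice the mapping would be parsed from ingests.yaml (for example with PyYAML's `yaml.safe_load`) and the placeholder would call the real transform.

```python
from typing import Callable, Dict, List

# Hypothetical contents of ingests.yaml after parsing; the names and the
# "enabled" key are illustrative assumptions, not the project's real schema.
INGESTS: Dict[str, dict] = {
    "example_source_gene_to_phenotype": {"enabled": True},
    "example_source_gene_to_disease": {"enabled": True},
    "example_source_retired": {"enabled": False},
}


def run_ingest(name: str) -> str:
    """Placeholder for the real per-ingest call (assumed, not the project's API)."""
    return f"ran {name}"


def run_all(
    ingests: Dict[str, dict],
    runner: Callable[[str], str] = run_ingest,
) -> List[str]:
    """Run every enabled ingest in manifest order and collect the results."""
    results = []
    for name, options in ingests.items():
        # Entries without an explicit "enabled" key are treated as enabled.
        if not options.get("enabled", True):
            continue
        results.append(runner(name))
    return results


if __name__ == "__main__":
    for line in run_all(INGESTS):
        print(line)
```

With this shape the Jenkinsfile only needs to invoke one entry point, and adding an ingest means adding one entry to the manifest.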

Beyond the bare minimum of executing them, it would be good if the ingests could run with some parallelism and, as a stretch goal, if we could separate out the logs.
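As a rough illustration of both goals, the sketch below runs ingests concurrently and writes one log file per ingest. It uses only the standard library (`concurrent.futures` and `logging`) as a stand-in, so it is not a Prefect flow. The function names, the log directory layout, and the body of `run_one` are assumptions made for the example.

```python
import logging
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Dict, Iterable


def ingest_logger(name: str, log_dir: Path) -> logging.Logger:
    """Return a logger that writes only to <log_dir>/<name>.log."""
    logger = logging.getLogger(f"ingest.{name}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages out of the shared root log
    if not logger.handlers:
        handler = logging.FileHandler(log_dir / f"{name}.log")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def run_one(name: str, log_dir: Path) -> str:
    """Hypothetical single-ingest task; the real transform would go here."""
    logger = ingest_logger(name, log_dir)
    logger.info("starting %s", name)
    logger.info("finished %s", name)
    return name


def run_parallel(
    names: Iterable[str], log_dir: Path, max_workers: int = 4
) -> Dict[str, bool]:
    """Run ingests concurrently; map each ingest name to success or failure."""
    log_dir.mkdir(parents=True, exist_ok=True)
    status: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_one, n, log_dir): n for n in names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()
                status[name] = True
            except Exception:
                # One failing ingest should not stop the others.
                status[name] = False
    return status


if __name__ == "__main__":
    out_dir = Path(tempfile.mkdtemp())
    print(run_parallel(["ingest_a", "ingest_b"], out_dir))
    print(sorted(p.name for p in out_dir.iterdir()))
```

Threads keep the example short. If the transforms are CPU-bound, a process pool or a workflow tool's own task runner may be a better fit.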

I'm going to take a shot at using Prefect for this, but I'll wrap it in a Typer CLI. I also plan to remove the Dagster pipeline code alongside this change.

kevinschaper commented 2 years ago

With respect to this issue, the output of the merge step should be:

We'll do stats in a separate issue after this goes in.

kevinschaper commented 2 years ago

Implemented as pipeline.py.