Closed kevinschaper closed 2 years ago
Update: we can use the Docker image biolink/kgx:latest
I did a one-off push of a monarch ingest Docker image to us-central1-docker.pkg.dev/monarch-initiative/monarch-initiative/monarch-ingest:latest, but we'll want this to happen automatically, either via Jenkins or, preferably, a GitHub Action.
I came back to this last week to work on https://ci.monarchinitiative.org/job/all-ingests/
It spent the weekend stuck: there weren't enough executors, and it collided with another job so that both were left waiting. I increased the executor limit from 2 to 8, so hopefully we won't end up in that stuck condition again.
Currently this job only runs the ingests, not any kind of graph summary (though, for performance reasons, we may not want to use kgx for graph summary anyway). It also does it in a somewhat ugly way, with all of the ingests enumerated in the Jenkinsfile. We'll want to hide that within our own Python launch code, ideally driven by a YAML configuration file that lists the ingests.
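The YAML-driven launcher idea could look roughly like the sketch below. Everything in it is an assumption for illustration: the `ingests:` key, the placeholder ingest names, and the `run_ingest` callable are hypothetical, not existing monarch-ingest code. The tiny parser only handles a flat list so the example stays dependency-free; in practice PyYAML's `yaml.safe_load` would be the natural replacement.

```python
from typing import Callable, Dict, List

# Hypothetical config content; the ingest names are placeholders.
EXAMPLE_CONFIG = """\
# ingests.yaml
ingests:
  - example_ingest_a
  - example_ingest_b
"""


def parse_ingest_list(text: str) -> List[str]:
    """Read the names under a top-level 'ingests:' key from a flat YAML list."""
    names: List[str] = []
    in_list = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing space
        if not line.strip():
            continue
        if not line.startswith((" ", "\t")):
            # A new top-level key starts or ends the list we care about.
            in_list = line.strip() == "ingests:"
            continue
        item = line.strip()
        if in_list and item.startswith("- "):
            names.append(item[2:].strip())
    return names


def run_all(names: List[str], run_ingest: Callable[[str], None]) -> Dict[str, bool]:
    """Run every ingest, recording success or failure instead of stopping early."""
    results: Dict[str, bool] = {}
    for name in names:
        try:
            run_ingest(name)
            results[name] = True
        except Exception as exc:  # keep going so one bad ingest doesn't block the rest
            print(f"{name} failed: {exc}")
            results[name] = False
    return results
```

With something like this, the Jenkinsfile would only need to call the launcher once, and adding an ingest would be a one-line config change.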
This also isn't yet making the transfer to latest, but I'm not sure this job actually should.
We should make a parameterized Jenkins job that we can run for each ingest.
It should download, run the transform, validate the output with kgx, and produce a kgx graph summary. It should then upload the kgx files, the validation output (I think that exists now?), and the graph summary to a date-labeled bucket, and probably also copy them to a latest directory.
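As a rough sketch of what one parameterized run might do: the step order comes from the description above, but the bucket layout, the function names, and the handler callables are all hypothetical. The actual kgx and upload commands are deliberately left to the handlers rather than guessed at here.

```python
from datetime import date
from typing import Callable, Dict, List

# Step order taken from the description above; the names are illustrative.
STEPS = ["download", "transform", "validate", "graph_summary", "upload"]


def destinations(bucket: str, ingest: str, run_date: date,
                 copy_to_latest: bool = True) -> List[str]:
    """Build upload targets: a date-labeled path and, optionally, 'latest'.

    The '<bucket>/<date>/<ingest>' layout is an assumption, not an agreed scheme.
    """
    dests = [f"{bucket}/{run_date.isoformat()}/{ingest}"]
    if copy_to_latest:
        dests.append(f"{bucket}/latest/{ingest}")
    return dests


def run_pipeline(ingest: str,
                 handlers: Dict[str, Callable[[str], None]]) -> List[str]:
    """Run each step in order for one ingest.

    A handler that raises stops the run, so nothing is uploaded after a
    failed transform or validation.
    """
    completed: List[str] = []
    for step in STEPS:
        handlers[step](ingest)
        completed.append(step)
    return completed
```

The Jenkins job parameter would map to `ingest`, and whether `copy_to_latest` is on could be a second parameter, given the open question about whether this job should update latest at all.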