populationgenomics / analysis-runner


Update hail in dataproc and support multiple versions #644

Closed daniaki closed 12 months ago

daniaki commented 1 year ago

Context: https://centrepopgen.slack.com/archives/C04M1G5HLM9/p1690947796216669

TLDR: Fixes a few issues in the dataproc image

  1. Python 3.8 is incompatible with cpg-utils (which uses new-style type annotations), so Python is updated to 3.10 by bumping Ubuntu to 22.04. cpg-utils is required to authenticate with GitHub when cloning private repos, and that step was failing.
  2. Adds a fix to the dataproc image that addresses a pip dependency resolution error in deploy.yaml when a dataproc cluster is being initialised.
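
For anyone wondering why item 1 breaks on 3.8, here is a minimal sketch. It assumes "new-style type annotations" means the PEP 604 union syntax (`int | None`) and PEP 585 built-in generics (`list[str]`); the exact annotations cpg-utils uses are not quoted in this PR, and the helper names below are hypothetical, for illustration only.

```python
import sys


def supports_new_style_annotations(version_info=sys.version_info):
    """Return True if the interpreter can evaluate `int | None` at runtime.

    PEP 604 unions need Python 3.10+, which is the stricter of the two
    requirements (PEP 585 generics such as `list[str]` need 3.9+).
    """
    return tuple(version_info[:2]) >= (3, 10)


def check_annotation(
    source="def f(x: int | None, ys: list[str]) -> None: ...",
):
    """Try to define a function that uses new-style annotations.

    Annotations are evaluated when the `def` statement runs (unless
    `from __future__ import annotations` is in effect), so on older
    interpreters the definition itself raises TypeError.
    """
    try:
        exec(source, {})
        return True
    except TypeError:
        return False


if __name__ == "__main__":
    print("interpreter:", sys.version.split()[0])
    print("new-style annotations work:", check_annotation())
```

Running this inside the image is a quick way to confirm whether the Python version there can import code written with these annotations.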

Also updates the documentation to explain these changes.


Michael's note: I'll follow up and update the authentication for this repo in another PR.