azavea / noaa-hydro-data

NOAA Phase 2 Hydrological Data Processing

Evaluate cost of Kubernetes Workflow Actions #103

Closed rajadain closed 1 year ago

rajadain commented 2 years ago

This issue is to figure out how to estimate the costs of running larger jobs on the Kubernetes cluster. There are two pathways by which we could proceed: Argo workflows and Jupyter notebooks.

jpolchlo commented 1 year ago

A data point for this issue: in #122, I ran what I would consider to be a mid-sized Dask job via Argo (48 workers with 8 GB RAM each, for about two hours). Based on the node usage (two r5.8xlarge and one r5.xlarge spot instances), the compute costs were about $2.60; I can't say what the S3 charges were for writing the output. Because we're using Karpenter, the most cost-effective set of nodes was chosen for us automatically. We should be able to estimate from the number and size of the workers (plus the demands of the scheduler) how many nodes of which type will be needed, and from that back out the per-hour cost of running the job. We can't necessarily predict a job's full runtime, though, without some small-scale tests to extrapolate from.
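To make the "back out the per-hour cost" step concrete, here's a minimal sketch of that estimation. The `estimate_job_cost` helper and the instance specs/spot prices in it are hypothetical illustrations (not current AWS pricing, and not code from this repo); it packs workers onto nodes by memory alone, which ignores CPU and Kubernetes overhead, so treat the output as a rough bound rather than a quote.

```python
from math import ceil

# (vCPUs, memory in GiB, assumed spot price in USD/hour) -- illustrative values only
INSTANCE_TYPES = {
    "r5.xlarge": (4, 32, 0.08),
    "r5.8xlarge": (32, 256, 0.60),
}

def estimate_job_cost(n_workers, worker_mem_gib, runtime_hours,
                      instance_type="r5.8xlarge", scheduler_mem_gib=8):
    """Rough compute-cost estimate for a Dask-on-Kubernetes job.

    Packs workers (plus the scheduler) onto nodes by memory only, then
    prices the resulting node-hours at the assumed spot rate.
    """
    _vcpus, node_mem_gib, spot_price = INSTANCE_TYPES[instance_type]
    total_mem_gib = n_workers * worker_mem_gib + scheduler_mem_gib
    n_nodes = ceil(total_mem_gib / node_mem_gib)
    hourly_cost = n_nodes * spot_price
    return n_nodes, hourly_cost, hourly_cost * runtime_hours

if __name__ == "__main__":
    # Roughly the #122 job: 48 workers x 8 GiB for ~2 hours
    nodes, per_hour, total = estimate_job_cost(48, 8, 2.0)
    print(f"{nodes} nodes, ~${per_hour:.2f}/hr, ~${total:.2f} total")
```

With the assumed prices above this lands in the same ballpark as the ~$2.60 observed for #122; the remaining unknown is the runtime factor, which is what the small-scale extrapolation tests would supply.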