PrefectHQ / prefect-helm

Helm charts for deploying Prefect Services
Apache License 2.0

Add support for baseJobTemplate to be passed into helmrelease as configmap #334

Closed marcincuber closed 1 month ago

marcincuber commented 1 month ago

I am using the Helm chart for Prefect workers and deploying it with a Flux v2 HelmRelease resource. I would like the chart to support passing in a ConfigMap that contains the base job template in JSON format.

jamiezieziula commented 1 month ago

Hi @marcincuber - to clarify, you'd like to be able to pass the name of an already existing configmap to the worker helm chart?

marcincuber commented 1 month ago

@jamiezieziula Yes. Instead of passing a JSON string to the baseJobTemplate argument, I would like to specify the name of a ConfigMap that contains the job template JSON.

The Flux HelmRelease resource has no convenient way to pass in a JSON file: https://fluxcd.io/flux/components/helm/helmreleases/

jamiezieziula commented 1 month ago

Hi @marcincuber -- I'll take a look!

FWIW, we use Flux internally. This is what our HelmRelease resource looks like:

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: prefect-worker
spec:
  chart:
    spec:
      chart: prefect-worker
      version: 2024.5.2224951
      sourceRef:
        kind: HelmRepository
        name: prefect
        namespace: flux-system
  install:
    remediation:
      retries: 3
  interval: 5m
  maxHistory: 2
  upgrade:
    remediation:
      retries: 3
  values:
    worker:
      config:
        workPool: my-work-pool
        baseJobTemplate: |
          {
            "variables": {
              "type": "object",
              "properties": {
                "env": {
                  "type": "object",
                  "title": "Environment Variables",
                  "description": "Environment variables to set when starting a flow run.",
                  "additionalProperties": {
                    "type": "string"
                  }
                },
                "name": {
                  "type": "string",
                  "title": "Name",
                  "description": "Name given to infrastructure created by a worker."
                },
                "image": {
                  "type": "string",
                  "title": "Image",
                  "example": "docker.io/prefecthq/prefect:2-latest",
                  "description": "The image reference of a container image to use for created jobs. If not set, the latest Prefect image will be used."
                },
                "labels": {
                  "type": "object",
                  "title": "Labels",
                  "description": "Labels applied to infrastructure created by a worker.",
                  "additionalProperties": {
                    "type": "string"
                  }
                },
                "command": {
                  "type": "string",
                  "title": "Command",
                  "description": "The command to use when starting a flow run. In most cases, this should be left blank and the command will be automatically generated by the worker."
                },
                "namespace": {
                  "type": "string",
                  "title": "Namespace",
                  "default": "my-namespace",
                  "description": "The Kubernetes namespace to create jobs within."
                },
                "stream_output": {
                  "type": "boolean",
                  "title": "Stream Output",
                  "default": true,
                  "description": "If set, output will be streamed from the job to local standard output."
                },
                "cluster_config": {
                  "allOf": [
                    {
                      "$ref": "#/definitions/KubernetesClusterConfig"
                    }
                  ],
                  "title": "Cluster Config",
                  "description": "The Kubernetes cluster config to use for job creation."
                },
                "finished_job_ttl": {
                  "type": "integer",
                  "title": "Finished Job TTL",
                  "default": 600,
                  "description": "The number of seconds to retain jobs after completion. If set, finished jobs will be cleaned up by Kubernetes after the given delay. If not set, jobs will be retained indefinitely."
                },
                "image_pull_policy": {
                  "enum": [
                    "IfNotPresent",
                    "Always",
                    "Never"
                  ],
                  "type": "string",
                  "title": "Image Pull Policy",
                  "default": "Always",
                  "description": "The Kubernetes image pull policy to use for job containers."
                },
                "service_account_name": {
                  "type": "string",
                  "title": "Service Account Name",
                  "description": "The Kubernetes service account to use for job creation."
                },
                "job_watch_timeout_seconds": {
                  "type": "integer",
                  "title": "Job Watch Timeout Seconds",
                  "description": "Number of seconds to wait for each event emitted by a job before timing out. If not set, the worker will wait for each event indefinitely."
                },
                "pod_watch_timeout_seconds": {
                  "type": "integer",
                  "title": "Pod Watch Timeout Seconds",
                  "default": 60,
                  "description": "Number of seconds to watch for pod creation before timing out."
                }
              },
              "definitions": {
                "KubernetesClusterConfig": {
                  "type": "object",
                  "title": "KubernetesClusterConfig",
                  "required": [
                    "config",
                    "context_name"
                  ],
                  "properties": {
                    "config": {
                      "type": "object",
                      "title": "Config",
                      "description": "The entire contents of a kubectl config file."
                    },
                    "context_name": {
                      "type": "string",
                      "title": "Context Name",
                      "description": "The name of the kubectl context to use."
                    }
                  },
                  "description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.",
                  "secret_fields": [],
                  "block_type_slug": "kubernetes-cluster-config",
                  "block_schema_references": {}
                }
              },
              "description": "Default variables for the Kubernetes worker.\n\nThe schema for this class is used to populate the `variables` section of the default\nbase job template."
            },
            "job_configuration": {
              "env": "{{ env }}",
              "name": "{{ name }}",
              "labels": "{{ labels }}",
              "command": "{{ command }}",
              "namespace": "{{ namespace }}",
              "job_manifest": {
                "kind": "Job",
                "spec": {
                  "template": {
                    "spec": {
                      "containers": [
                        {
                          "env": "{{ env }}",
                          "args": "{{ command }}",
                          "name": "prefect-job",
                          "image": "{{ image }}",
                          "imagePullPolicy": "{{ image_pull_policy }}"
                        }
                      ],
                      "completions": 1,
                      "parallelism": 1,
                      "restartPolicy": "Never",
                      "serviceAccountName": "{{ service_account_name }}"
                    }
                  },
                  "backoffLimit": 0,
                  "ttlSecondsAfterFinished": "{{ finished_job_ttl }}"
                },
                "metadata": {
                  "labels": "{{ labels }}",
                  "namespace": "{{ namespace }}",
                  "generateName": "{{ name }}-"
                },
                "apiVersion": "batch/v1"
              },
              "stream_output": "{{ stream_output }}",
              "cluster_config": "{{ cluster_config }}",
              "job_watch_timeout_seconds": "{{ job_watch_timeout_seconds }}",
              "pod_watch_timeout_seconds": "{{ pod_watch_timeout_seconds }}"
            }
          }
      cloudApiConfig:
        accountId: xxxxxxxxxxxx
        workspaceId: xxxxxxxxxxxx

marcincuber commented 1 month ago

@jamiezieziula Thanks for providing your example, but I am trying to avoid having the lengthy JSON inside the HelmRelease resource. I believe it would be much cleaner to isolate it in a separate resource, ideally a ConfigMap, as I suggested.
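
For illustration, the separation requested here could look roughly like the sketch below. The ConfigMap name, its data key, and the `baseJobTemplateConfigMap` values key are all hypothetical. Nothing in this thread confirms that the chart supports such an option, and the JSON is trimmed to a skeleton of the template shown earlier.

```yaml
# Hypothetical ConfigMap holding the base job template on its own.
# The name, namespace, and data key are illustrative, not taken from the chart.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prefect-base-job-template
  namespace: my-namespace
data:
  baseJobTemplate.json: |
    {
      "variables": {
        "type": "object",
        "properties": {
          "namespace": {
            "type": "string",
            "title": "Namespace",
            "default": "my-namespace"
          }
        }
      },
      "job_configuration": {
        "namespace": "{{ namespace }}"
      }
    }
---
# HelmRelease trimmed to the relevant fields.
# ASSUMPTION: `baseJobTemplateConfigMap` is a made-up values key that sketches
# the requested behaviour; it is not a documented chart option.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: prefect-worker
spec:
  interval: 5m
  chart:
    spec:
      chart: prefect-worker
      sourceRef:
        kind: HelmRepository
        name: prefect
        namespace: flux-system
  values:
    worker:
      config:
        workPool: my-work-pool
        baseJobTemplateConfigMap: prefect-base-job-template
```

With a layout like this, the HelmRelease stays short and the job template can be reviewed and versioned as its own resource.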