devspace-sh / devspace

DevSpace - The Fastest Developer Tool for Kubernetes ⚡ Automate your deployment workflow with DevSpace and develop software directly inside Kubernetes.
https://devspace.sh
Apache License 2.0

double latest:latest tag in do #2808

Closed szymi- closed 8 months ago

szymi- commented 8 months ago

What happened?

I am trying to develop a custom Airflow image (including some custom Python packages, which is why I need DevSpace) using the Airflow Helm chart in a local Kubernetes cluster on my laptop (Ubuntu in WSL, with kind as the cluster). With the default Airflow image everything works, including file synchronization into the Airflow containers. However, I would like to replace the image from the Helm chart with my custom image, which has Python dependencies installed that I need for development and that are not present in the default image. I am unable to replace the image: no matter what I try, I get Init:InvalidImageName from Kubernetes:

NAME                                   READY   STATUS                  RESTARTS   AGE
airflow-postgresql-0                   1/1     Running                 0          35s
airflow-redis-0                        1/1     Running                 0          35s
airflow-run-airflow-migrations-zk5qr   0/1     InvalidImageName        0          35s
airflow-scheduler-869cb647c9-zgbq2     0/2     Init:InvalidImageName   0          35s
airflow-statsd-5667dd85ff-zqvpk        1/1     Running                 0          35s
airflow-triggerer-0                    0/2     Init:InvalidImageName   0          35s
airflow-webserver-74c6d9b64c-lp7rw     0/1     Init:InvalidImageName   0          35s
airflow-worker-0                       0/2     Init:InvalidImageName   0          35s

When I inspect the pods to see which images are being used, I see images with a doubled tag such as latest:latest (or other combinations, like latest:2.7.1), no matter what I try:

$ kubectl get pods -n airflow -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | sort

airflow-postgresql-0:   docker.io/bitnami/postgresql:16.1.0-debian-11-r15,
airflow-redis-0:        redis:7-bookworm,
airflow-run-airflow-migrations-zk5qr:   custom/apache-airflow:latest:latest,
airflow-scheduler-869cb647c9-zgbq2:     custom/apache-airflow:latest:latest, custom/apache-airflow:latest:latest,
airflow-statsd-5667dd85ff-zqvpk:        quay.io/prometheus/statsd-exporter:v0.26.0,
airflow-triggerer-0:    custom/apache-airflow:latest:latest, custom/apache-airflow:latest:latest,
airflow-webserver-74c6d9b64c-lp7rw:     custom/apache-airflow:latest:latest,
airflow-worker-0:       custom/apache-airflow:latest:latest, custom/apache-airflow:latest:latest,

I tried adding a registry URL to the image definition, such as localhost:5001 or kind-registry:5000, but the tags were always doubled.

What did you expect to happen instead?

The images used should come from kind's built-in registry, with a single tag, like custom/apache-airflow:latest. The Airflow cluster should start up successfully using my custom Airflow Docker image. I should be able to synchronize DAGs (and more files in the future) into the cluster.

How can we reproduce the bug? (as minimally and precisely as possible)

Use my devspace.yaml and my Dockerfile (just copy the constraints from https://raw.githubusercontent.com/apache/airflow/constraints-2.7.3/constraints-3.11.txt and add a requirements.txt with any Python dependency). Run devspace dev and observe the problems I described above.

My devspace.yaml:

version: v2beta1
name: custom-airflow

vars:
  IMAGE: custom/apache-airflow:latest

# Configuration to build a DevImage
images:
  custom-airflow:
    image: ${IMAGE}
    dockerfile: ./Dockerfile
    rebuildStrategy: ignoreContextChanges
    buildArgs:
      VERSION: 2.7.1-python3.11

# Configuration to deploy the application
deployments:
  airflow:
    helm:
      chart:
        repo: https://airflow.apache.org
        name: airflow
      values:
        images:
          airflow:
            repository: custom/apache-airflow
            tag: latest

dev:
  webserver:
    labelSelector:
      component: webserver
    sync:
      - path: ./dags:/opt/airflow/dags
    ports:
      - port: "8080:8080"
  worker:
    labelSelector:
      component: worker
    containers:
      worker:
        sync:
          - path: ./dags:/opt/airflow/dags
  triggerer:
    labelSelector:
      component: triggerer
    containers:
      triggerer:
        sync:
          - path: ./dags:/opt/airflow/dags
  scheduler:
    labelSelector:
      component: scheduler
    containers:
      scheduler:
        sync:
          - path: ./dags:/opt/airflow/dags

My Dockerfile:

ARG VERSION=latest
FROM apache/airflow:${VERSION}

ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
USER root

RUN apt-get -y update \
  && apt-get install -y --no-install-recommends gcc libc6-dev git \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*

COPY ./docker/requirements.txt /
COPY ./docker/constraints.txt /

USER airflow
RUN pip install \
  --upgrade \
  --no-cache-dir pip \
  && pip install \
  --no-cache-dir \
  "apache-airflow==${AIRFLOW_VERSION}" \
  -r /requirements.txt \
  -c /constraints.txt

Local Environment:

Anything else we need to know?

lizardruss commented 8 months ago

This looks like it may have the same solution as #2809
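
For readers landing here, a minimal sketch of one configuration to try. It is not a confirmed fix and is not taken from #2809. It assumes the doubled tag arises because the IMAGE variable already carries :latest while a tag is also appended elsewhere. The sketch moves the tag into the image's tags list and fills the chart values from DevSpace's runtime image variables; the chart value keys images.airflow.repository and images.airflow.tag are copied from the devspace.yaml in the report.

```yaml
version: v2beta1
name: custom-airflow

vars:
  # Assumption: keep only the repository here, without a tag
  IMAGE: custom/apache-airflow

images:
  custom-airflow:
    image: ${IMAGE}
    # Assumption: the tag is declared separately instead of inside IMAGE
    tags:
      - latest
    dockerfile: ./Dockerfile
    rebuildStrategy: ignoreContextChanges
    buildArgs:
      VERSION: 2.7.1-python3.11

deployments:
  airflow:
    helm:
      chart:
        repo: https://airflow.apache.org
        name: airflow
      values:
        images:
          airflow:
            # Assumption: these runtime variables resolve to the built
            # image name (without tag) and the built tag, respectively
            repository: ${runtime.images.custom-airflow.image}
            tag: ${runtime.images.custom-airflow.tag}
```

The dev section from the original devspace.yaml would stay unchanged. After redeploying, the kubectl jsonpath command from the report can be rerun to check whether each image reference now ends in a single tag.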