devspace-sh / devspace

DevSpace - The Fastest Developer Tool for Kubernetes ⚡ Automate your deployment workflow with DevSpace and develop software directly inside Kubernetes.
https://devspace.sh
Apache License 2.0

double latest:latest tag in kind registry #2809

Closed szymi- closed 7 months ago

szymi- commented 7 months ago

What happened?

I am trying to develop a custom Airflow image (including some custom Python packages, which is why I need DevSpace) using the Airflow Helm chart on a local Kubernetes cluster on my laptop (Ubuntu in WSL, with kind as the cluster). I can use the default Airflow image successfully and trigger file synchronization into the Airflow containers, but I would like to replace the chart's image with my custom image, which has some Python dependencies installed that are needed for development and are not present in the default image. I am unable to replace the image in the Helm chart: no matter what I try, I get Init:InvalidImageName from Kubernetes:

NAME                                   READY   STATUS                  RESTARTS   AGE
airflow-postgresql-0                   1/1     Running                 0          35s
airflow-redis-0                        1/1     Running                 0          35s
airflow-run-airflow-migrations-zk5qr   0/1     InvalidImageName        0          35s
airflow-scheduler-869cb647c9-zgbq2     0/2     Init:InvalidImageName   0          35s
airflow-statsd-5667dd85ff-zqvpk        1/1     Running                 0          35s
airflow-triggerer-0                    0/2     Init:InvalidImageName   0          35s
airflow-webserver-74c6d9b64c-lp7rw     0/1     Init:InvalidImageName   0          35s
airflow-worker-0                       0/2     Init:InvalidImageName   0          35s

When I inspect the pods to see which images are being used, I always get images with a doubled tag such as latest:latest (or other combinations, like latest:2.7.1), no matter what I try:

$ kubectl get pods -n airflow -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | sort

airflow-postgresql-0:   docker.io/bitnami/postgresql:16.1.0-debian-11-r15,
airflow-redis-0:        redis:7-bookworm,
airflow-run-airflow-migrations-zk5qr:   custom/apache-airflow:latest:latest,
airflow-scheduler-869cb647c9-zgbq2:     custom/apache-airflow:latest:latest, custom/apache-airflow:latest:latest,
airflow-statsd-5667dd85ff-zqvpk:        quay.io/prometheus/statsd-exporter:v0.26.0,
airflow-triggerer-0:    custom/apache-airflow:latest:latest, custom/apache-airflow:latest:latest,
airflow-webserver-74c6d9b64c-lp7rw:     custom/apache-airflow:latest:latest,
airflow-worker-0:       custom/apache-airflow:latest:latest, custom/apache-airflow:latest:latest,

I tried adding a registry URL to the image definition, such as localhost:5001 or kind-registry:5000, but the tags were always doubled.
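As a quick check for this symptom, the following minimal Python sketch flags image references that carry more than one tag. The helper names tag_count and has_double_tag are hypothetical, and the rule is a simplification of the real image reference grammar: it only counts colons in the last path component, so that a registry port such as localhost:5001 is not mistaken for a tag.

```python
def tag_count(image: str) -> int:
    """Count ':' separators in the final path component of an image reference."""
    # Drop any digest suffix ("@sha256:..."), then keep only the part after
    # the last '/', so a registry host port is never counted as a tag.
    name = image.split("@", 1)[0]
    last_component = name.rsplit("/", 1)[-1]
    return last_component.count(":")


def has_double_tag(image: str) -> bool:
    """Return True when the reference appears to carry more than one tag."""
    return tag_count(image) > 1


# Image references taken from the pod listing in this issue, plus one
# registry-prefixed variant for comparison.
images = [
    "docker.io/bitnami/postgresql:16.1.0-debian-11-r15",
    "redis:7-bookworm",
    "custom/apache-airflow:latest:latest",
    "localhost:5001/custom/apache-airflow:latest",
]

for image in images:
    status = "INVALID (double tag)" if has_double_tag(image) else "ok"
    print(f"{image}: {status}")
```

Only custom/apache-airflow:latest:latest is flagged by this check; the other three references have a single tag.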

What did you expect to happen instead?

The images used should come from kind's built-in registry, with a single tag, like custom/apache-airflow:latest. The Airflow cluster should start up successfully using my custom Airflow Docker image. I should be able to synchronize DAGs (and more files in the future) into the cluster.

How can we reproduce the bug? (as minimally and precisely as possible)

Use my devspace.yaml and my Dockerfile (just copy the constraints from https://raw.githubusercontent.com/apache/airflow/constraints-2.7.3/constraints-3.11.txt and add a requirements.txt with any Python dependency). Run devspace dev and observe the problems I described above.

My devspace.yaml:

version: v2beta1
name: custom-airflow

vars:
  IMAGE: custom/apache-airflow:latest

# Configuration to build a DevImage
images:
  custom-airflow:
    image: ${IMAGE}
    dockerfile: ./Dockerfile
    rebuildStrategy: ignoreContextChanges
    buildArgs:
      VERSION: 2.7.1-python3.11

# Configuration to deploy the application
deployments:
  airflow:
    helm:
      chart:
        repo: https://airflow.apache.org
        name: airflow
      values:
        images:
          airflow:
            repository: custom/apache-airflow
            tag: latest

dev:
  webserver:
    labelSelector:
      component: webserver
    sync:
      - path: ./dags:/opt/airflow/dags
    ports:
      - port: "8080:8080"
  worker:
    labelSelector:
      component: worker
    containers:
      worker:
        sync:
          - path: ./dags:/opt/airflow/dags
  triggerer:
    labelSelector:
      component: triggerer
    containers:
      triggerer:
        sync:
          - path: ./dags:/opt/airflow/dags
  scheduler:
    labelSelector:
      component: scheduler
    containers:
      scheduler:
        sync:
          - path: ./dags:/opt/airflow/dags

My Dockerfile:

ARG VERSION=latest
FROM apache/airflow:${VERSION}

ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
USER root

RUN apt-get -y update \
  && apt-get install -y --no-install-recommends gcc libc6-dev git \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*

COPY ./docker/requirements.txt /
COPY ./docker/constraints.txt /

USER airflow
RUN pip install \
  --upgrade \
  --no-cache-dir pip \
  && pip install \
  --no-cache-dir \
  "apache-airflow==${AIRFLOW_VERSION}" \
  -r /requirements.txt \
  -c /constraints.txt

Local Environment:

Anything else we need to know?

lizardruss commented 7 months ago

Hello, there are a couple of options to avoid this issue. The first is to set updateImageTags: false on the deployment. This prevents DevSpace from attempting to rewrite image references that don't have tags. Here's a sample that worked in my testing:

deployments:
  airflow:
    updateImageTags: false
    helm:
      chart:
        repo: https://airflow.apache.org
        name: airflow
      values:
        images:
          airflow:
            repository: custom/apache-airflow
            tag: latest
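To illustrate why the tag ends up doubled, here is a minimal Python sketch of the interaction. It is not DevSpace's actual implementation: rewrite_untagged and chart_image are hypothetical names, and the sketch assumes that the tool appends a tag to a bare repository value matching a built image, while the chart separately joins repository and tag with a colon.

```python
def rewrite_untagged(reference: str, built_image: str, built_tag: str) -> str:
    """Hypothetical stand-in for tag rewriting: tag a bare, matching repository."""
    if reference == built_image:
        return f"{built_image}:{built_tag}"
    return reference


def chart_image(repository: str, tag: str) -> str:
    """Assumed chart behavior: join the repository and tag values with ':'."""
    return f"{repository}:{tag}"


# The Helm values from the devspace.yaml in this issue.
values = {"repository": "custom/apache-airflow", "tag": "latest"}

# Rewriting enabled: the repository value gains a tag before the chart adds its own.
rewritten = rewrite_untagged(values["repository"], "custom/apache-airflow", "latest")
with_rewrite = chart_image(rewritten, values["tag"])

# Rewriting disabled (the effect of updateImageTags: false in this sketch).
without_rewrite = chart_image(values["repository"], values["tag"])

print(with_rewrite)     # custom/apache-airflow:latest:latest
print(without_rewrite)  # custom/apache-airflow:latest
```

Under these assumptions, turning the rewrite off leaves the repository value untouched, so the chart produces a single, valid tag.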

The other option is to use DevSpace's ${runtime.variables} to be a little more explicit about what DevSpace should substitute (if you want anything substituted at all). These are more likely to be useful if you're using DevSpace's dynamic tags. Here's an example that also worked for me:

deployments:
  airflow:
    updateImageTags: false
    helm:
      chart:
        repo: https://airflow.apache.org
        name: airflow
      values:
        images:
          airflow:
            repository: ${runtime.images.custom-airflow.image}
            tag: ${runtime.images.custom-airflow.tag}

The first option is likely the better fit for your case, but the second could be helpful if using the latest tag causes issues with images not re-pulling in your environment.

Hope this helps!

szymi- commented 7 months ago

Thanks for your time. I went with the second option, as I have a feeling it will save me (or a teammate) some unnecessary trouble in the future. It works :)

alexandradragodan commented 7 months ago

Hey @szymi- I'm glad Russ's suggestions unblocked your use case.

Closing this issue for now then, please feel free to re-open it if you're still facing similar issues!