kestra-io / plugin-docker

Apache License 2.0

Add Docker Pull and Docker Prune tasks #52

Closed anna-geller closed 1 month ago

anna-geller commented 1 month ago

Feature description

Add Docker Pull and Docker Prune tasks to manage the container lifecycle on a given worker.

mgabelle commented 1 month ago

I was also wondering if it would be relevant to add the containerId to the outputs of the Run task. I don't know how to easily stop the container once it is running, since the Stop task requires a containerId. What do you think?

mgabelle commented 1 month ago

Can you give me more information about these two new tasks?

anna-geller commented 1 month ago

For the first one: yes, it's already an output: {{ outputs.start.taskRunner.containerId }}

id: dockerRedis
namespace: company.team

variables:
  host: host.docker.internal

tasks:
  - id: start
    type: io.kestra.plugin.docker.Run
    containerImage: redis
    wait: false
    portBindings:
      - "6379:6379"

  - id: sleep
    type: io.kestra.plugin.core.flow.Sleep
    duration: PT1S
    description: Wait for the Redis container to start

  - id: set
    type: io.kestra.plugin.redis.string.Set
    url: "redis://:redis@{{vars.host}}:6379/0"
    key: "key_string_{{execution.id}}"
    value: "{{flow.id}}"
    serdeType: STRING

  - id: get
    type: io.kestra.plugin.redis.string.Get
    url: "redis://:redis@{{vars.host}}:6379/0"
    key: "key_string_{{execution.id}}"
    serdeType: STRING

  - id: assert
    type: io.kestra.plugin.core.execution.Assert
    errorMessage: "Invalid get data {{outputs.get}}"
    conditions:
      - "{{outputs.get.data == flow.id}}"

  - id: delete
    type: io.kestra.plugin.redis.string.Delete
    url: "redis://:redis@{{vars.host}}:6379/0"
    keys:
      - "key_string_{{execution.id}}"

  - id: getAfterDelete
    type: io.kestra.plugin.redis.string.Get
    url: "redis://:redis@{{vars.host}}:6379/0"
    key: "key_string_{{execution.id}}"
    serdeType: STRING

  - id: assertAfterDelete
    type: io.kestra.plugin.core.execution.Assert
    errorMessage: "Invalid get data {{outputs.getAfterDelete}}"
    conditions:
      - "{{(outputs.getAfterDelete contains 'data') == false}}"

finally:
  - id: stop
    type: io.kestra.plugin.docker.Stop
    containerId: "{{outputs.start.taskRunner.containerId}}"

anna-geller commented 1 month ago

Docker Pull

Main use case: when a user has large container images and prefers to have a separate task to pull them onto a given worker, e.g. on a schedule.

id: dockerPull
namespace: system

tasks:
  - id: ee
    type: io.kestra.plugin.docker.Pull
    containerImage: "redis:latest"
    workerGroup:
      key: myworker

Matt, TBD which extra properties might be useful, based on what is available in the Java SDK. Example options we see in the CLI:

➜  ~ docker pull --help

Usage:  docker pull [OPTIONS] NAME[:TAG|@DIGEST]

Download an image from a registry

Aliases:
  docker image pull, docker pull

Options:
  -a, --all-tags                Download all tagged images in the repository
      --disable-content-trust   Skip image verification (default true)
      --platform string         Set platform if server is multi-platform
                                capable
  -q, --quiet                   Suppress verbose output
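
As a rough sketch of how the "on schedule" use case and one of the CLI options above might look in a flow: the `platform` property name below is an assumption that simply mirrors the `--platform` flag, not a confirmed property of the plugin, and the flow id, trigger id, and cron expression are illustrative only.

```yaml
id: dockerPullScheduled
namespace: system

tasks:
  - id: pull
    type: io.kestra.plugin.docker.Pull   # task proposed in this issue
    containerImage: "redis:latest"
    # Hypothetical property mirroring `docker pull --platform`;
    # the real name depends on what the Java SDK exposes.
    platform: linux/amd64
    workerGroup:
      key: myworker

triggers:
  # Pre-pull the image every night so later runs on this worker find it locally.
  - id: nightly
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 2 * * *"
```

Whether `--all-tags` or `--quiet` deserve equivalents is left open here; they would only be worth adding if the SDK supports them.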

anna-geller commented 1 month ago

Docker Prune task

docker system prune is the main command that removes unused data from the Docker system. It supports several flags that allow you to control what gets pruned.

Available Options:

Docker Rmi task

Remove one or more images

Aliases: docker image rm, docker image remove, docker rmi

Options:
  -f, --force      Force removal of the image
      --no-prune   Do not delete untagged parents
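
A minimal sketch of what an Rmi task could look like if it exposed the two CLI options above. Every property name here (`imageIds`, `force`, `noPrune`) is hypothetical and chosen only to mirror the CLI; the task itself is a proposal from this thread, not an existing part of the plugin.

```yaml
id: dockerRmi
namespace: system

tasks:
  - id: removeImage
    type: io.kestra.plugin.docker.Rmi   # proposed task, name assumed
    imageIds:                           # hypothetical: one or more images to remove
      - "redis:latest"
    force: false                        # hypothetical: mirrors -f, --force
    noPrune: false                      # hypothetical: mirrors --no-prune
    workerGroup:
      key: myworker
```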

loicmathieu commented 1 month ago

prune seems to be a blunt tool!

I think we need to go back to what's needed: we want to be able to remove containers and images created or pulled from the flow, so we'd better have docker.Rm and docker.Rmi.

Pull is not needed as we pull from the Run task.

anna-geller commented 1 month ago

Adding extra info for posterity: we synced via huddle and we'll have both Rmi and Prune.