kreuzwerker / terraform-provider-docker

Terraform Docker provider
Mozilla Public License 2.0

Flaky `Error response from daemon: Conflict, cannot remove the default link name of the container` on `terraform destroy` #611

Open daniel-weisse opened 2 months ago

daniel-weisse commented 2 months ago


Terraform (and docker Provider) Version

Tested with:

Terraform: v1.8.0, v1.7.0, v1.7.5, v1.6.6, v1.5.7
Docker provider: v3.0.2
Docker server: v26.0.0, v24.0.5, v25.0.3
containerd: v1.7.13, v1.7.14, v1.7.15

These are just the versions I tried. I was unable to find a version combination that reliably makes the error disappear.

Affected Resource(s)

Terraform Configuration Files

terraform {
  required_providers {
    docker = {
      source  = "kreuzwerker/docker"
      version = "3.0.2"
    }
  }
}

provider "docker" {
  host = "unix:///var/run/docker.sock"
}

resource "docker_image" "test_image" {
  name         = "alpine:latest"
  keep_locally = true
}

resource "docker_container" "test_container" {
  name         = "test-container"
  image        = docker_image.test_image.image_id
  rm           = true
  command = [
    "/bin/sh",
    "-c",
    "while sleep 3600; do :; done",
  ]
}

Debug Output

The debug output from a failing run using the above Terraform config is available here: https://gist.github.com/daniel-weisse/ba8e9e1757c5de14808a7e3a550ed556

The output was generated using the following commands:

TF_LOG=DEBUG terraform apply -auto-approve
sleep 10
TF_LOG=DEBUG terraform destroy -auto-approve

Expected Behaviour

The Docker Terraform provider correctly terminates the container without errors every time it is called.

Actual Behaviour

Occasionally, the Docker Terraform provider returns an error when running terraform destroy:

Error deleting container <container-id>: Error response from daemon: Conflict, cannot remove the default link name of the container

Steps to Reproduce

Assuming the Terraform config above is saved locally to main.tf, run the following script:

#!/bin/bash

set -e

terraform init

for i in {1..300}
do
  terraform apply -auto-approve
  sleep 10
  terraform destroy -auto-approve
done

Since this bug seems very flaky, you may not see any failures, or it might fail at the very first iteration.
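Because the script above uses set -e, it stops at the first failing destroy. To estimate how often the error occurs instead, something like the following sketch could be used. It is only an illustration: the names CONFLICT_PATTERN, is_conflict_error, count_flaky_failures, and cycle are hypothetical helpers, not part of Terraform or the provider, and the match is a plain substring search for the error text quoted above.

```bash
#!/bin/bash

# Substring of the error message reported in this issue.
CONFLICT_PATTERN="cannot remove the default link name of the container"

# Succeeds if the given text contains the conflict error.
is_conflict_error() {
  grep -qF -- "$CONFLICT_PATTERN" <<<"$1"
}

# Usage: count_flaky_failures RUNS COMMAND [ARGS...]
# Runs COMMAND the given number of times, capturing stdout and stderr,
# and prints how many runs produced the conflict error.
count_flaky_failures() {
  local runs=$1
  shift
  local failures=0 output i
  for ((i = 1; i <= runs; i++)); do
    # "|| true" keeps the loop going when the command exits non-zero.
    output=$("$@" 2>&1) || true
    if is_conflict_error "$output"; then
      failures=$((failures + 1))
    fi
  done
  echo "$failures"
}

# Example (assumes main.tf from above and a completed "terraform init"):
#
#   cycle() {
#     terraform apply -auto-approve && sleep 10 && terraform destroy -auto-approve
#   }
#   count_flaky_failures 300 cycle
```

Note that a run that fails for a different reason is not counted, and a container left behind by a failed destroy may cause later iterations to fail differently, so the count is only a rough lower bound.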

A minimal Terraform configuration that creates an Azure VM running Ubuntu 22.04 is available here: https://gist.github.com/daniel-weisse/b44388adbb7f22e79e2964804d12b333 I used this VM to reproduce the error.

Important Factoids

I most commonly experienced this issue when running on Ubuntu 22.04 in an Azure VM. I also reproduced it locally on Arch Linux. I was unable to reproduce it locally on Fedora 39 in over 300 runs.

I would like to repeat that this issue is very flaky. Sometimes I could reproduce it in 1 out of 5 runs; other times 300 runs completed without any errors.