docker / compose

Define and run multi-container applications with Docker
https://docs.docker.com/compose/
Apache License 2.0

[BUG] incorrect label com.docker.compose.network #10797

Closed daniejstriata closed 10 months ago

daniejstriata commented 12 months ago

Description

I suspect that the change "better diagnostic message on network label mismatch" by @ndeloof in https://github.com/docker/compose/pull/10639 is causing the following errors when I update zabbix or gitlab with version 2.19.1. The same updates work with docker-compose version 2.18.1.

network zabbix was found but has incorrect label com.docker.compose.network set to "zabbix"

        "Labels": {
            "com.docker.compose.network": "zabbix",
            "com.docker.compose.project": "zabbix-server",
            "com.docker.compose.version": "1.25.5"
        }

network gitlab was found but has incorrect label com.docker.compose.network set to "gitlab"

        "Labels": {
            "com.docker.compose.network": "gitlab",
            "com.docker.compose.project": "gitlab",
            "com.docker.compose.version": "1.25.5"
        }

I'm not sure how to fix this. Could the error be a bit more descriptive? Can I delete the com.docker.compose.network label or must it be another value?
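
For anyone comparing their own setup: the label blocks above appear to come from docker network inspect. A minimal sketch for printing only the labels of one network, so the stored com.docker.compose.network value can be checked against the key used in the compose file. The helper name network_labels is made up for this example.

```shell
# Print the labels of one Docker network as JSON.
# network_labels is a hypothetical helper name, not part of Docker.
network_labels() {
  docker network inspect --format '{{json .Labels}}' "$1"
}

# Usage: network_labels zabbix
```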

Steps To Reproduce

docker-compose --env-file=/opt/docker/.env.prod.aws -f /opt/devops/zabbix-server/docker-compose.yml up -d --force-recreate

Compose Version

2.19.1

Docker Environment

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
  scan: Docker Scan (Docker Inc.)

Server:
 Containers: 5
  Running: 5
  Paused: 0
  Stopped: 0
 Images: 17
 Server Version: 20.10.6
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.17-2136.320.7.1.el7uek.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.12GiB
 Name: gaewp-dkl
 ID: NDCY:5WEL:3U23:VFFD:QE7A:ZFZW:3RW3:Q5O3:7NLE:5LQ4:JXDF:5YR4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Anything else?

No response

hugoghx commented 11 months ago

My error: network net1 was found but has incorrect label com.docker.compose.network set to "net1".

This problem is present on v2.20.0 and v2.19.1. It is not present on v2.18.1.

The networks section of my docker-compose file looks like this:

networks:
  default:
    name: "net1"

  net1:
    name: "net1"
    ipam:
      config:
        - subnet: "172.21.0.0/16"

  internalnet:
    name: "internalnet"
    driver: "bridge"
    internal: true

  vpn:
    name: "vpn"
    driver: "bridge"
    ipam:
      config:
        - subnet: "172.10.0.0/16"

These networks are then used by the services based on connectivity requirements.

I used

networks:
  default:
    name: "net1"

to set net1 as the default network for all containers. With the introduction of v2.19.1, this stopped working.

ndeloof commented 11 months ago

@hugoamvieira you should not declare a default network with the same name as the net1 network. From the compose model's point of view, those entries define two networks. Declare just one and use it as the default:

networks:
  default:
    name: "net1"
    ipam:
      config:
        - subnet: "172.21.0.0/16"

ndeloof commented 11 months ago

@daniejstriata please attach your compose.yaml file so we can diagnose this.

daniejstriata commented 11 months ago

@ndeloof Here it is:

version: "3.8"
services:
  gitlab:
    image: gitlab/gitlab-ce:15.11.11-ce.0 
    container_name: gitlab
    hostname: ${DOCKERHOSTNAME}
    networks:
      nonet:
        ipv4_address: ${IP}
    restart: always
    ports:
      - ${LISTENIP}:443:443
      - ${LISTENIP}:80:80
      - ${LISTENIPSSH}:22:22
    volumes:
      - /opt/devops/dockers/projects/gitlab/sshd_config.d/:/etc/ssh/sshd_config.d:ro
      - /opt/devops/dockers/projects/gitlab/conf/:/etc/gitlab:rw
      - logs:/var/log/gitlab:rw
      - data:/var/opt/gitlab:rw

networks:
  nonet:
    name: gitlab
    ipam:
      driver: default
      config:
        - subnet: ${SUBNET}

volumes:
  data:
    name: gitlab-data
    external: true
  logs:
    name: gitlab-logs
    external: true

hugoghx commented 11 months ago

@hugoamvieira you should not declare a default network with the same name as the net1 network. From the compose model's point of view, those entries define two networks. Declare just one and use it as the default:

networks:
  default:
    name: "net1"
    ipam:
      config:
        - subnet: "172.21.0.0/16"

Hey @ndeloof, thanks for the tip! I did try this, but I ran into another problem where I couldn't reference net1 in the containers' networks block. I saw this error: service "<service_name>" refers to undefined network net1: invalid compose project

As far as my understanding goes, if I define a networks block for a container, it will only be associated with the networks within that block, even if a default is specified.

So, if I want a container to be on a network in addition to the default one, I have to specify the default one. This is what I tried:

container:
  image: "bla"
  ...
  networks:
    - "net1"
    - "other-net"

But this leads to the error above.

Perhaps the correct way here is to refer to the default network not by its name but by its object key default?

container:
  image: "bla"
  ...
  networks:
    - "default"
    - "other-net"

This seemed to work, and inspecting the network at runtime does show the containers are associated with net1.

rtaft commented 11 months ago

I'll throw my error out there too, started after an update to 2.19.1.

WARN[0000] a network with name traefik exists but was not created by compose.
Set `external: true` to use an existing network
network traefik was found but has incorrect label com.docker.compose.network set to ""

networks:
  db:
  traefik:
    external: true

Nothing has changed in our compose file.

ndeloof commented 11 months ago

Perhaps the correct way here is to refer to the default network not by its name but by its object key default?

Yes indeed, the networks attribute in a service definition refers to the keys under the top-level networks element, not to network names. By the way, you can use net1 as the key if you prefer it over default; just don't use two entries to define a single network.
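
Putting the two comments together, here is a rough sketch of a compose file with a single entry for net1 that services reference by key. It is wrapped in a small shell helper only so it can be written to disk and checked; the service name app, the image "bla" and the helper name write_example_compose are placeholders, not anything from the reporters' real files.

```shell
# Write an example compose file to the path given as $1.
# Service name, image and helper name are placeholders for illustration.
write_example_compose() {
  cat > "$1" <<'EOF'
services:
  app:
    image: "bla"
    networks:
      - default      # key under top-level networks, not the Docker name "net1"
      - internalnet

networks:
  default:
    name: "net1"     # actual Docker network name
    ipam:
      config:
        - subnet: "172.21.0.0/16"
  internalnet:
    name: "internalnet"
    driver: "bridge"
    internal: true
EOF
}

# Usage (needs Docker):
#   write_example_compose compose.yaml && docker compose -f compose.yaml config
```

The point of the layout is that net1 appears exactly once, as the name of the default entry, so compose sees one network rather than two.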

mrsarm commented 11 months ago

Same issue. Downgrading to 2.18.1 solved it.

If you downgrade as a workaround, don't expect (as I did 😅) that downgrading Docker will also downgrade the compose plugin. At least on Ubuntu with the docker-ce distribution, you have to downgrade the compose plugin separately, like this:

sudo apt install docker-compose-plugin=2.18.1-1~ubuntu.20.04~focal
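
A possible variant of the command above that also keeps apt from upgrading the plugin again on the next update. This is only a sketch: pin_compose_plugin is a made-up helper name, and the exact version string depends on your Ubuntu release.

```shell
# Install a specific docker-compose-plugin version and hold it there.
# pin_compose_plugin is a hypothetical helper; pass the exact package version.
pin_compose_plugin() {
  sudo apt-get install -y --allow-downgrades "docker-compose-plugin=$1" &&
    sudo apt-mark hold docker-compose-plugin
}

# List the versions your repository offers:
#   apt-cache madison docker-compose-plugin
# Usage:
#   pin_compose_plugin '2.18.1-1~ubuntu.20.04~focal'
```

Remember to run sudo apt-mark unhold docker-compose-plugin once a fixed release is available.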

quiin commented 11 months ago

I had the same "incorrect label" issue and after a docker system prune, everything went back to normal.

ilude commented 11 months ago

Run the following command to downgrade your docker-compose CLI plugin on Ubuntu 22.04 (and probably later releases):

curl -SL https://github.com/docker/compose/releases/download/v2.18.1/docker-compose-linux-x86_64 -o /usr/local/lib/docker/cli-plugins/docker-compose

rtaft commented 11 months ago

@quiin That was the first thing I tried with no luck.

Our build structure is a bit more complex: we run multiple YML files together. If I only run one service at a time, it works fine on 2.20, but when I run multiple services at once, it fails. Each YML has the same networks entry. With the comment above about "don't use 2 entries to define a single network", I wonder if that would break with having the same network in more than one YML file.

ndeloof commented 11 months ago

I wonder if that would break with having the same network in more than one YML file.

Network names are not unique, so running compose concurrently with the same config is risky: the network might be created multiple times, which then confuses the other runs. A possible workaround is to first run a dummy compose file that just creates the required network and runs some "hello world" container. After that, you should be able to run compose concurrently with the same network configuration, as every execution will see an existing network matching the expected config.
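
One way this workaround could be scripted, as a sketch only. The helper name up_with_bootstrap and the file names are hypothetical, and bootstrap.yml is assumed to declare the shared network plus a trivial service.

```shell
# Run one bootstrap compose file serially, then start the rest concurrently.
# $1 is the bootstrap file; the remaining arguments are the other compose files.
up_with_bootstrap() {
  bootstrap_file=$1
  shift
  # Step 1: a single serial run creates the shared network.
  docker compose -f "$bootstrap_file" up -d || return 1
  # Step 2: the remaining runs start in parallel and find the network in place.
  for file in "$@"; do
    docker compose -f "$file" up -d &
  done
  # Block until every background run has finished.
  wait
}

# Usage: up_with_bootstrap bootstrap.yml service-a.yml service-b.yml
```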

AnjaLiebermann commented 11 months ago

How can I downgrade docker compose on macOS? And yes, I also have this bug with Docker Compose version v2.19.1.

rtaft commented 11 months ago

@AnjaLiebermann How did you install it? I had to manually remove all docker components that were installed using brew and install an older version of Docker Desktop from https://docs.docker.com/desktop/release-notes/

adRn-s commented 11 months ago

I'm having the same issue with an app that can be deployed with either the nginx or the caddy web server. Anyone on the affected versions of docker can reproduce the issue by running git clone on this repo and then make dev-easy. At the step that uses the caddy.yml compose file, it fails with the same error message every user has reported here.

I wonder if we should wait for a fix or try another strategy to combine multiple YAML files.

edit: just to make it clear, by another strategy I'm referring to the second method described in the docs. So far no one has reported this issue with the extends: directive. Of course, there's no guarantee that this would avoid the current bug.

tflavin commented 11 months ago

TL;DR: Not sure this will solve every issue regarding this, but I was able to remove my existing network and recreate it with the default label to get past this issue.

docker network rm networkname
docker network create networkname --label "com.docker.compose.network=default"

I ran into this as well. It looks like the change causing this was actually https://github.com/docker/compose/pull/10612, also by @ndeloof. The PR mentioned in the OP concerns a similar but different warning, whereas this change is the one that adds the error itself.

The error I was seeing is the same as @rtaft's comment.

I don't have any labels set, so if inspect.Labels[api.NetworkLabel] is "" then I thought my expectedNetworkLabel would also be "". The error message is not showing the expectedNetworkLabel, so I wasn't sure what it was.

hopefully-relevant excerpt from compose file:

services:
  ...

  proxy:
    ...
    networks:
      - a_default
      - b_default

networks:
  a_default:
    external: true
  b_default:
    external: true

If I docker network rm both networks and try again, I get network a_default declared as external, but could not be found, as I would expect. I then tried to recreate them with docker network create and guessed at a label name, ex: --label "com.docker.compose.network=a_default", but I got network a_default was found but has incorrect label com.docker.compose.network set to "a_default"

Then, I tried with default as the network value: docker network create a_default --label "com.docker.compose.network=default" and it seemed to work. So, that's my guess as to what expectedNetworkLabel is.

I did rm this network and create a new one to see if I could consistently reproduce it, and it did again get past this error, but then I got an error: my app was still referencing that first removed network instead of the new one. So, I did a docker system prune and was able to reproduce by creating a new network with the default label. My b_default network was able to still be created without a label.

I looked at what I believe to be the networking documentation (1/2/3/4) but I can't find where it says that at least one default label is now required when creating networks. (Admittedly, I could be looking in the wrong place or for the wrong thing.)

Is this now necessary, or could the code handle this case without having to set a default label when creating the network? If it is necessary, I think it could be added (or added more clearly) to the documentation.

I also think I am correctly setting the networks as external: true, though I'm also seeing that warning, but since it's just a warning, it's not blocking me.

I tried to be thorough in looking into this, since I wasn't very familiar with this topic -- apologies if I got anything wrong and thanks in advance for any help!

AlexisGoodfellow commented 11 months ago

Just going to throw out there that I'm also running into this same error on 2.19.1 with a host machine running macOS Ventura (13.4.1), so the issue seems to not be strictly limited to particular linux distros (like Ubuntu, as previously reported).

I will try downgrading (somehow) to 2.18.1 in the meantime and see if that resolves this issue.

AlexisGoodfellow commented 11 months ago

@AnjaLiebermann If you still need it, the accepted answer here outlines the process of how I downgraded on macOS: https://stackoverflow.com/questions/62217678/can-i-roll-back-to-a-previous-version-of-docker-desktop

I downloaded and installed version 4.20.0, and that seems to be doing exactly what I need to get my container fleet running again. Yay!

rtaft commented 11 months ago

@tflavin default did not work for me but using the network name did.
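
Since the label value that works seems to differ between setups (default for @tflavin, the network name here), a small sketch that takes the value as a parameter may save some typing while experimenting. The helper name recreate_network_with_label is made up, and docker network rm will refuse to remove a network that still has containers attached.

```shell
# Remove a network and recreate it with a com.docker.compose.network label.
# $1 = network name, $2 = label value to try (e.g. "default" or the network name).
recreate_network_with_label() {
  docker network rm "$1" &&
    docker network create --label "com.docker.compose.network=$2" "$1"
}

# Usage: recreate_network_with_label a_default default
#    or: recreate_network_with_label traefik traefik
```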

adRn-s commented 11 months ago

Before downgrading Docker, you may try this solution. It works for me.

At my docker-compose YAML files, I'm using:

networks:
  myname:
    external: true

I create the network manually, with docker network create myname beforehand. Then, I can use all the YAML files as modules (I start containers that share this network with docker compose -f something.yaml up -d ).

Note: If I ever shut all containers down, the network stays. You need to remove it manually if you want it gone.
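
The manual creation step can be made safe to repeat, for example in a deploy script. A minimal sketch, with ensure_network as a made-up helper name:

```shell
# Create the network only if it does not exist yet.
ensure_network() {
  if docker network inspect "$1" >/dev/null 2>&1; then
    echo "network $1 already exists"
  else
    docker network create "$1"
  fi
}

# Usage: ensure_network myname && docker compose -f something.yaml up -d
```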

Scherebart commented 11 months ago

I can confirm that the problem does not occur on older versions of docker-compose. I tried Docker Desktop 4.19 with docker-compose 2.17 on a Mac.

ndeloof commented 11 months ago

@Scherebart earlier versions didn't check network labels and as a result could select a network "by name" that is not the expected one (name uniqueness is not enforced by the docker engine), resulting in weird bugs.

NachoGomezC commented 11 months ago

I had this same issue after updating to the latest docker/compose versions on Ubuntu 22.04. I ended up pruning all networks with all my services down and after bringing them up again, voila! Networks were created successfully and services started without issues

ndeloof commented 10 months ago

@sd-evo docker network prune

bertrandgorge commented 10 months ago

For those in need for a solution on this one, here are my findings:

First, I had to do a docker network ls --format=json in order to understand the problem:

{"CreatedAt":"2023-09-06 14:29:15.821708941 +0200 CEST",
"Driver":"bridge",
"ID":"0a470b25babd",
"IPv6":"false",
"Internal":"false",
"Labels":"com.docker.compose.network=tripleperformance_traefik,com.docker.compose.project=tripleperformance_prod,com.docker.compose.version=1.29.2",
"Name":"tripleperformance_traefik",
"Scope":"local"}

Here you can see that the docker compose version was 1.29.2 and not 2.20.2 as it should have been. The reason was that an old "docker-compose" script was somehow still present in /usr/bin and had been used to recreate the stack.
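
To pull a single label out of that Labels string without reading the whole line, something like the following sketch could be used. label_value is a made-up helper, and it assumes label values do not themselves contain commas.

```shell
# Print the value of one label from a "k1=v1,k2=v2" string, the form used in
# the Labels field of `docker network ls --format=json`.
# $1 = labels string, $2 = label key to look up.
label_value() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS='=' read -r key value; do
    if [ "$key" = "$2" ]; then
      printf '%s\n' "$value"
    fi
  done
}

# Usage:
#   label_value "$labels" com.docker.compose.version
```

A com.docker.compose.version value starting with 1. would then hint that the network was created by the old docker-compose script.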

So here are the steps required to repair the stack:

  1. replace the content of /usr/bin/docker-compose with docker compose --compatibility "$@"
  2. run docker compose down to stop your stack completely
  3. run docker system prune to remove the network
  4. run docker compose up -d to start your stack again
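
The steps above can be sketched as a script. The shebang and exec in the wrapper are my assumption of what a complete /usr/bin/docker-compose file would look like, and repair_stack is a made-up helper name. Note that docker system prune asks for confirmation and removes more than networks (stopped containers, dangling images, build cache); docker network prune is the narrower option.

```shell
# Step 1: assumed full contents for /usr/bin/docker-compose, forwarding every
# call to the v2 plugin:
#   #!/bin/sh
#   exec docker compose --compatibility "$@"

# Steps 2 to 4, run from the directory that holds the compose file.
repair_stack() {
  docker compose down &&
    docker system prune &&
    docker compose up -d
}

# Usage: cd /path/to/stack && repair_stack
```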

Hope it helps

michiel-nwa commented 5 months ago

Got the same error. In my case, I had networks defined like this:

networks:
  my_network:
    driver: bridge

After a docker upgrade I got the same error. I was able to fix it by adding an explicit name:

networks:
  my_network:
    name: "my_network"
    driver: bridge