docker / cli

The Docker CLI

docker node ps with multiple nodes duplicates entries for the first node specified #4155

Open danielboydston opened 1 year ago

danielboydston commented 1 year ago

Description

When executing 'docker node ps node1 node2', containers running on node1 will appear in the list twice, while containers for node2 will appear only once.

Switching the order of the nodes 'docker node ps node2 node1' results in the containers on node2 being listed twice.

Running the command when specifying only one node correctly displays each container only once.

Reproduce

  1. Create a docker swarm with two nodes
  2. Start containers on each node
  3. Execute the following command: docker node ps node1 node2

Expected behavior

Output should be a list of all containers running in the swarm with a single entry for each container.

docker version

Client:
 Version:           20.10.5+dfsg1
 API version:       1.41
 Go version:        go1.15.15
 Git commit:        55c4c88
 Built:             Mon May 30 18:34:49 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.5+dfsg1
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.15
  Git commit:       363e9a8
  Built:            Mon May 30 18:34:49 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.13~ds1
  GitCommit:        1.4.13~ds1-1~deb11u3
 runc:
  Version:          1.0.0~rc93+ds1
  GitCommit:        1.0.0~rc93+ds1-5+deb11u2
 docker-init:
  Version:          0.19.0
  GitCommit:

docker info

Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 72
  Running: 19
  Paused: 0
  Stopped: 53
 Images: 95
 Server Version: 20.10.5+dfsg1
 Storage Driver: btrfs
  Build Version: Btrfs v5.10.1
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: 9ogylkrii6cpz7knjlc344gx1
  Is Manager: true
  ClusterID: lduvkp1hggliuxt8ma1eubdn6
  Managers: 1
  Nodes: 2
  Default Address Pool: 10.0.0.0/8
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 3
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 199.58.96.21
  Manager Addresses:
   199.58.96.21:2377
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1.4.13~ds1-1~deb11u3
 runc version: 1.0.0~rc93+ds1-5+deb11u2
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.0-20-amd64
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 62.81GiB
 Name: docker-3
 ID: H3TV:JYMX:QUUS:VALV:3SCD:5TMI:QH7W:NCWV:SWLC:FRGY:AQR3:XVOK
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: Support for cgroup v2 is experimental

Additional Info

No response

thaJeztah commented 1 year ago

Thanks for reporting! I gave this a quick try and was able to reproduce it. I had a (brief) glance at the CLI code, and nothing immediately stood out as incorrect on the CLI side 🤔

The command defaults to using self for filtering tasks (i.e., it shows tasks running on the manager node you're connected to), but that default is replaced if one or more nodes are passed as arguments: https://github.com/docker/cli/blob/a0756c3c2cacebf5e5dc6454cc280c3ddf675176/cli/command/node/ps.go#L36-L40

After that, it collects the tasks for each node in turn, requesting the task list with a filter on that node's ID: https://github.com/docker/cli/blob/a0756c3c2cacebf5e5dc6454cc280c3ddf675176/cli/command/node/ps.go#L65-L88
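As a rough illustration of the flow described above (default to the local node, replace the default when arguments are given, then gather tasks node by node), here is a minimal self-contained sketch. It is not the code from ps.go: the task type and the resolveNodes, listTasks, and collect helpers are hypothetical stand-ins, and the sketch substitutes the local node's ID directly instead of resolving the literal self reference.

```go
package main

import "fmt"

// task is a hypothetical, stripped-down stand-in for a swarm task.
type task struct {
	ID     string
	NodeID string
}

// resolveNodes mirrors the described default: with no arguments the command
// targets the local node; any explicit arguments replace that default.
// (Assumption: localID stands in for whatever "self" resolves to.)
func resolveNodes(args []string, localID string) []string {
	if len(args) == 0 {
		return []string{localID}
	}
	return args
}

// listTasks is a hypothetical stand-in for the API call that returns
// only the tasks scheduled on the given node.
func listTasks(all []task, nodeID string) []task {
	var out []task
	for _, t := range all {
		if t.NodeID == nodeID {
			out = append(out, t)
		}
	}
	return out
}

// collect gathers tasks one node at a time and concatenates the results.
func collect(all []task, nodes []string) []task {
	var tasks []task
	for _, n := range nodes {
		tasks = append(tasks, listTasks(all, n)...)
	}
	return tasks
}

func main() {
	all := []task{{"a", "node1"}, {"b", "node1"}, {"c", "node2"}}
	nodes := resolveNodes([]string{"node1", "node2"}, "node1")
	for _, t := range collect(all, nodes) {
		fmt.Println(t.ID, t.NodeID)
	}
}
```

With a per-node filter that really is independent on each iteration, every task is returned exactly once, which is the behavior the report expects.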

Some possibilities;

I reproduced this on Docker 23.0.2:

docker node ls
ID                            HOSTNAME        STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
yg550ettvsjn6g6t840iaiwgb *   swarm-test-01   Ready     Active         Reachable        23.0.2
2lm9w9kbepgvkzkkeyku40e65     swarm-test-02   Ready     Active         Leader           23.0.2
hc0pu7ntc7s4uvj4pv7z7pz15     swarm-test-03   Ready     Active         Reachable        23.0.2
n41b2cijmhifxxvz56vwrs12q     swarm-test-04   Ready     Active                          23.0.2
docker node ps
ID             NAME            IMAGE                        NODE            DESIRED STATE   CURRENT STATE           ERROR                              PORTS
hyrvvroi2ve0   service.1       thajeztah/redacted:latest    swarm-test-01   Running         Running 5 days ago
n0k48214ocjx    \_ service.1   thajeztah/redacted:latest    swarm-test-01   Shutdown        Failed 5 days ago       "task: non-zero exit (2)"
t4aoc5bom8lj   jvs.2           thajeztah/redacted2          swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
v5d1wdx5hguk    \_ jvs.2       thajeztah/redacted2          swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
wawc9957rrc2    \_ jvs.2       thajeztah/redacted2          swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
root@swarm-test-01:~# docker node ps swarm-test-01 swarm-test-02
ID             NAME          IMAGE                        NODE            DESIRED STATE   CURRENT STATE           ERROR                              PORTS
hyrvvroi2ve0    \_ service.1   thajeztah/redacted:latest    swarm-test-01   Running         Running 5 days ago
hyrvvroi2ve0    \_ service.1   thajeztah/redacted:latest    swarm-test-01   Running         Running 5 days ago
n0k48214ocjx    \_ service.1   thajeztah/redacted:latest    swarm-test-01   Shutdown        Failed 5 days ago       "task: non-zero exit (2)"
n0k48214ocjx    \_ service.1   thajeztah/redacted:latest    swarm-test-01   Shutdown        Failed 5 days ago       "task: non-zero exit (2)"
t4aoc5bom8lj    \_ jvs.2       thajeztah/redacted2:latest   swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
t4aoc5bom8lj    \_ jvs.2       thajeztah/redacted2:latest   swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
v5d1wdx5hguk    \_ jvs.2       thajeztah/redacted2:latest   swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
v5d1wdx5hguk    \_ jvs.2       thajeztah/redacted2:latest   swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
wawc9957rrc2    \_ jvs.2       thajeztah/redacted2:latest   swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
wawc9957rrc2    \_ jvs.2       thajeztah/redacted2:latest   swarm-test-01   Shutdown        Rejected 2 months ago   "No such image: thajeztah/reda…"
pd8ofvb3qy4q   jvs.4           thajeztah/redacted2:latest   swarm-test-02   Running         Running 3 weeks ago
jbszl4zfhoih    \_ jvs.4       thajeztah/redacted2:latest   swarm-test-02   Shutdown        Complete 3 weeks ago
ocj5hoan9wbz    \_ jvs.4       thajeztah/redacted2:latest   swarm-test-02   Shutdown        Failed 6 months ago     "No such container: jvs.4.ocj5…"
xpg5cyy30g8i    \_ jvs.4       thajeztah/redacted2:latest   swarm-test-02   Shutdown        Complete 6 months ago

erwindon commented 1 year ago

24.0.6 still has this problem. [Where can I vote for this issue? :-)]

ehsan-salamati commented 3 months ago

Can I pick up this issue? I have started contributing to open-source projects, and this issue seems to be a good one to start with.

I also have a hunch that the problem is in how the filter is built inside the for loop: https://github.com/docker/cli/blob/ddd4c399305bb4fc9a290a2cd321b55df11280a6/cli/command/node/ps.go#L77-L78

When docker node ps is run with multiple nodes, the filter appears to retain the node values added in previous iterations, so the request for the second node also matches the first node's tasks, and those tasks end up listed twice. This would happen because the filter object is backed by a map (a reference type in Go), so it is reused and modified across iterations and accumulates the earlier node filters.
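To make the hunch concrete, here is a small self-contained sketch of how a shared, map-backed filter could produce exactly this duplication. It does not use the real Docker types: args, task, listTasks, collectShared, and collectFresh are hypothetical stand-ins, and the sketch assumes that several values under the same filter key are matched with OR semantics.

```go
package main

import "fmt"

// args is a hypothetical stand-in for a filter set: key -> accepted values.
// Like any Go map it is a reference type, so assigning it to a new variable
// does not copy the underlying data.
type args map[string]map[string]bool

func (a args) add(key, value string) {
	if a[key] == nil {
		a[key] = map[string]bool{}
	}
	a[key][value] = true
}

// clone returns a deep copy so callers can modify it independently.
func (a args) clone() args {
	out := args{}
	for k, vals := range a {
		for v := range vals {
			out.add(k, v)
		}
	}
	return out
}

type task struct{ ID, NodeID string }

// listTasks returns tasks whose node matches ANY "node" value in the filter
// (assumed OR semantics).
func listTasks(all []task, f args) []task {
	var out []task
	for _, t := range all {
		if f["node"][t.NodeID] {
			out = append(out, t)
		}
	}
	return out
}

// collectShared reuses one filter, so node values pile up across iterations.
func collectShared(all []task, base args, nodes []string) []task {
	var tasks []task
	for _, n := range nodes {
		f := base // same underlying map on every iteration
		f.add("node", n)
		tasks = append(tasks, listTasks(all, f)...)
	}
	return tasks
}

// collectFresh copies the base filter first, so each request sees one node.
func collectFresh(all []task, base args, nodes []string) []task {
	var tasks []task
	for _, n := range nodes {
		f := base.clone()
		f.add("node", n)
		tasks = append(tasks, listTasks(all, f)...)
	}
	return tasks
}

func main() {
	all := []task{{"a", "node1"}, {"b", "node2"}}
	nodes := []string{"node1", "node2"}
	fmt.Println(len(collectShared(all, args{}, nodes))) // prints 3: task "a" is listed twice
	fmt.Println(len(collectFresh(all, args{}, nodes)))  // prints 2: one entry per task
}
```

In the shared variant the second request carries both node1 and node2, so the first node's tasks come back again, matching the symptom in the report. Copying the filter per iteration is only one possible remedy; whether it fits the actual code is something to confirm against ps.go.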

I will set up the project, and see if this is the problem!

thaJeztah commented 3 months ago

> Can I pick up this issue?

Sure, go for it! I don't think anyone has had time to work on this, so go ahead 🤘

ehsan-salamati commented 3 months ago

@thaJeztah I created this PR