Closed CpuID closed 6 years ago
Seem to be able to repro this now, but only using docker-compose ps and not docker ps:
$ docker-compose ps
Name Command State Ports
----------------------------------------------------------------------------
api_api_1 api ... Up 8080/tcp
api_api_1 api ... Up 8080/tcp
api_nginx_2 dockerize -template /etc/n ... Up 0.0.0.0:33020->80/tcp
api_nginx_2 dockerize -template /etc/n ... Up 0.0.0.0:33020->80/tcp
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
03c6e86de214 api_nginx "dockerize -template…" 17 seconds ago Up 15 seconds 0.0.0.0:33020->80/tcp api_nginx_2
ae3cb0ddf362 api_api "api …" 20 seconds ago Up 18 seconds 8080/tcp api_api_1
This then causes flow-on effects like:
+ docker-compose up --no-color --exit-code-from api_tester
using --exit-code-from implies --abort-on-container-exit
Found orphan containers (api_nginx_2, api_api_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Hmmm, no I haven't. That said, we aren't running sockguard in production presently. @chrissnell might have some insight.
I've got some sockguard logs, but they have a bunch of stuff in them that I don't want to share :) I'll get a set of logs from a throwaway test instead and drop them here. docker-compose ps does a bunch of API calls, so it's probably some perfect storm situation, which would explain why docker ps is fine.
Once I have a reliable repro, I'll work on a fix :) Would be great to get more test coverage in though (when either of us have time).
Reproduced locally:
$ cd examples/cgroup_parent
$ docker-compose build && docker-compose up
$ docker exec -it cgroup_parent_ci_agent_1 bash
root@29ce68e9d6de:/blah# pwd
/blah
root@29ce68e9d6de:/blah# cat Dockerfile
FROM alpine:3.8
CMD [ "sleep", "300" ]
root@29ce68e9d6de:/blah# cat docker-compose.yml
version: '2'
services:
blah:
build:
context: .
blah2:
build:
context: .
root@29ce68e9d6de:/blah# docker-compose up
Creating network "blah_default" with the default driver
Creating blah_blah_1 ... done
Creating blah_blah2_1 ... done
Attaching to blah_blah_1, blah_blah2_1
root@29ce68e9d6de:/blah# docker-compose ps
Name Command State Ports
----------------------------------------
blah_blah2_1 sleep 300 Up
blah_blah2_1 sleep 300 Up
blah_blah_1 sleep 300 Up
blah_blah_1 sleep 300 Up
sockguard logs for the docker-compose ps operation:
#56 03:52:09.954554 GET - /v1.22/networks/blah_default - 0b
#56 03:52:09.954781 Looking up identifier "blah_default"
#56 03:52:09.956043 Labels for /networks/blah_default: map[]
#56 03:52:09.956093 Allow, /networks/blah_default has no owner
#56 03:52:09.957190 Copied 1165 bytes from socket
#56 03:52:09.958463 Copied 0 bytes from connection
#56 03:52:09.958560 Done, closing
#57 03:52:09.960072 GET - /v1.22/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dblah%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D - 0b
2018/08/22 03:52:09 filters="{\"label\": [\"com.docker.compose.project=blah\", \"com.docker.compose.oneoff=False\"]}"
2018/08/22 03:52:09 [label] Got type []interface {}: [com.docker.compose.project=blah com.docker.compose.oneoff=False]
2018/08/22 03:52:09 map[string][]interface {}{}
#57 03:52:09.960208 Adding label com.buildkite.sockguard.owner=sockguard-pid-1 to label filters []
#57 03:52:09.961682 Copied 2529 bytes from socket
#57 03:52:09.962765 Copied 0 bytes from connection
#57 03:52:09.962780 Done, closing
#58 03:52:09.963968 GET - /v1.22/containers/b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad/json - 0b
#58 03:52:09.964122 Looking up identifier "b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad"
#58 03:52:09.965027 Labels for /containers/b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad/json: map[com.docker.compose.container-number:1 com.docker.compose.oneoff:False com.docker.compose.project:blah com.docker.compose.service:blah2 com.docker.compose.version:1.22.0 com.buildkite.sockguard.owner:sockguard-pid-1 com.docker.compose.config-hash:54503df7af4e2ab265e662be2d53096fc2a288dc62619dbb85b31263305c91b9]
#58 03:52:09.965195 Allow, /containers/b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad/json matches owner "sockguard-pid-1"
#58 03:52:09.965747 Copied 5029 bytes from socket
#58 03:52:09.966470 Copied 0 bytes from connection
#58 03:52:09.966556 Done, closing
#59 03:52:09.968232 GET - /v1.22/containers/659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed/json - 0b
#59 03:52:09.968535 Looking up identifier "659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed"
#59 03:52:09.969337 Labels for /containers/659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed/json: map[com.docker.compose.service:blah com.docker.compose.version:1.22.0 com.buildkite.sockguard.owner:sockguard-pid-1 com.docker.compose.config-hash:54503df7af4e2ab265e662be2d53096fc2a288dc62619dbb85b31263305c91b9 com.docker.compose.container-number:1 com.docker.compose.oneoff:False com.docker.compose.project:blah]
#59 03:52:09.969440 Allow, /containers/659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed/json matches owner "sockguard-pid-1"
#59 03:52:09.970747 Copied 5025 bytes from socket
#59 03:52:09.971499 Copied 0 bytes from connection
#59 03:52:09.971787 Done, closing
#60 03:52:09.973036 GET - /v1.22/containers/json?limit=-1&all=0&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dblah%22%2C+%22com.docker.compose.oneoff%3DTrue%22%5D%7D - 0b
2018/08/22 03:52:09 filters="{\"label\": [\"com.docker.compose.project=blah\", \"com.docker.compose.oneoff=True\"]}"
2018/08/22 03:52:09 [label] Got type []interface {}: [com.docker.compose.project=blah com.docker.compose.oneoff=True]
2018/08/22 03:52:09 map[string][]interface {}{}
#60 03:52:09.973755 Adding label com.buildkite.sockguard.owner=sockguard-pid-1 to label filters []
#60 03:52:09.975056 Copied 2529 bytes from socket
#60 03:52:09.976667 Copied 0 bytes from connection
#60 03:52:09.977218 Done, closing
#61 03:52:09.977847 GET - /v1.22/containers/b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad/json - 0b
#61 03:52:09.978098 Looking up identifier "b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad"
#61 03:52:09.979752 Labels for /containers/b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad/json: map[com.buildkite.sockguard.owner:sockguard-pid-1 com.docker.compose.config-hash:54503df7af4e2ab265e662be2d53096fc2a288dc62619dbb85b31263305c91b9 com.docker.compose.container-number:1 com.docker.compose.oneoff:False com.docker.compose.project:blah com.docker.compose.service:blah2 com.docker.compose.version:1.22.0]
#61 03:52:09.979810 Allow, /containers/b7fc03ab647777c4f4a5e2fdadabb2b2a76780cec74a8cc250a5e19ec42f63ad/json matches owner "sockguard-pid-1"
#61 03:52:09.980373 Copied 5029 bytes from socket
#61 03:52:09.983789 Copied 0 bytes from connection
#61 03:52:09.983888 Done, closing
#62 03:52:09.984121 GET - /v1.22/containers/659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed/json - 0b
#62 03:52:09.984288 Looking up identifier "659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed"
#62 03:52:10.029703 Labels for /containers/659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed/json: map[com.docker.compose.container-number:1 com.docker.compose.oneoff:False com.docker.compose.project:blah com.docker.compose.service:blah com.docker.compose.version:1.22.0 com.buildkite.sockguard.owner:sockguard-pid-1 com.docker.compose.config-hash:54503df7af4e2ab265e662be2d53096fc2a288dc62619dbb85b31263305c91b9]
#62 03:52:10.029824 Allow, /containers/659e069d7e794fa87f15305582e944e79293c454ae0b32072aa57675df7023ed/json matches owner "sockguard-pid-1"
#62 03:52:10.037895 Copied 5025 bytes from socket
#62 03:52:10.039853 Copied 0 bytes from connection
#62 03:52:10.039869 Done, closing
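The URL-encoded filters parameter in requests #57 and #60 is hard to read in the raw log. As a minimal sketch (Python here purely for illustration; decode_filters is a hypothetical helper name, not part of sockguard or docker-compose), the logged path for request #57 can be decoded like this:

```python
import json
from urllib.parse import parse_qs, urlsplit

# Request path copied from the sockguard log line for request #57.
LOGGED_PATH = (
    "/v1.22/containers/json?limit=-1&all=1&size=0&trunc_cmd=0"
    "&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dblah%22"
    "%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D"
)


def decode_filters(path):
    """Return the decoded `filters` query parameter as a Python object."""
    query = urlsplit(path).query
    params = parse_qs(query)  # percent-decodes and turns '+' into spaces
    raw = params.get("filters", ["{}"])[0]
    return json.loads(raw)


filters = decode_filters(LOGGED_PATH)
print(filters["label"])
# ['com.docker.compose.project=blah', 'com.docker.compose.oneoff=False']
```

So the client is sending the list form of the label filter, with two entries. Request #60 decodes the same way, with oneoff=True in place of oneoff=False.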
docker-compose:
[nathan@ns-desktop-ub cgroup_parent (master)]$ docker network inspect blah_default | jq .[0].Labels
{}
Interesting, here's the culprit: docker-compose does 2 API calls to list containers, with 2 different values for the com.docker.compose.oneoff label filter:
com.docker.compose.oneoff=False
com.docker.compose.oneoff=True
Since sockguard returns the same containers for both calls, docker-compose ends up with a duplicate container list.
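To make the mechanism concrete, here is a small in-memory simulation in Python. It is only a sketch under assumptions: it assumes docker-compose ps concatenates the results of its two list calls, and it models the proxy bug as "the client's label filters are discarded and only the owner filter is applied". The helper names (list_containers, list_via_lossy_proxy, compose_ps) are hypothetical and do not come from either project.

```python
# Two containers labelled like the ones in the sockguard logs.
COMMON_LABELS = {
    "com.docker.compose.project": "blah",
    "com.docker.compose.oneoff": "False",
    "com.buildkite.sockguard.owner": "sockguard-pid-1",
}
CONTAINERS = [
    {"name": "blah_blah_1", "labels": dict(COMMON_LABELS)},
    {"name": "blah_blah2_1", "labels": dict(COMMON_LABELS)},
]
OWNER = "com.buildkite.sockguard.owner=sockguard-pid-1"


def matches(container, label_filters):
    """True if the container carries every key=value label in the filter."""
    for item in label_filters:
        key, _, value = item.partition("=")
        if container["labels"].get(key) != value:
            return False
    return True


def list_containers(label_filters):
    """Stand-in for the daemon's list endpoint: honours all label filters."""
    return [c["name"] for c in CONTAINERS if matches(c, label_filters)]


def list_via_lossy_proxy(label_filters):
    """Assumed model of the bug: client filters dropped, owner filter kept."""
    return list_containers([OWNER])


def compose_ps(list_fn):
    """Assumed client behaviour: list regular and one-off containers, then join."""
    project = "com.docker.compose.project=blah"
    regular = list_fn([project, "com.docker.compose.oneoff=False"])
    one_off = list_fn([project, "com.docker.compose.oneoff=True"])
    return sorted(regular + one_off)


print(compose_ps(list_containers))
# ['blah_blah2_1', 'blah_blah_1']
print(compose_ps(list_via_lossy_proxy))
# ['blah_blah2_1', 'blah_blah2_1', 'blah_blah_1', 'blah_blah_1']
```

With the filters honoured, the one-off call returns nothing and each container is listed once. With the filters dropped, both calls return both containers, which matches the doubled rows in the docker-compose ps output.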
The com.docker.compose.oneoff=True call should really not return any of these containers. The filter is being mangled by sockguard (searching on com.docker.compose.oneoff=True returns False results):
root@29ce68e9d6de:/blah# curl -s --unix-socket /var/run/docker.sock "http:/v1.22/containers/json?limit=-1&all=0&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dblah%22%2C+%22com.docker.compose.oneoff%3DTrue%22%5D%7D" | jq .
[
{
"Id": "a2f81f9b7cbc693d238bbf84c398fe2c7be7b056e948b8a608c325ee50c6f19d",
"Names": [
"/blah_blah_1"
],
"Image": "blah_blah",
"ImageID": "sha256:9b48aedb1fa786e0dceadfbf2dd5ac9c8f5119eb4afeab8de7f16b6d4178f4f8",
"Command": "sleep 1200",
"Created": 1534910550,
"Ports": [],
"Labels": {
"com.buildkite.sockguard.owner": "sockguard-pid-1",
"com.docker.compose.config-hash": "9864342fe990b836ed15b33259e44c8ad1d82fd968497d022ebe8e02dfe26f41",
"com.docker.compose.container-number": "1",
"com.docker.compose.oneoff": "False",
"com.docker.compose.project": "blah",
"com.docker.compose.service": "blah",
"com.docker.compose.version": "1.22.0"
},
"State": "running",
"Status": "Up 2 minutes",
"HostConfig": {
"NetworkMode": "blah_default"
},
"NetworkSettings": {
"Networks": {
"blah_default": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "e1ce451ca042fc00e700308fb2ff503da9efc3cbae0c6a6617fa4801efd7b2e5",
"EndpointID": "d01f25e904fbb8acedc7fb639ec52b626fc9b87e3451b91f4d255fd0cc9f661c",
"Gateway": "172.23.0.1",
"IPAddress": "172.23.0.3",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:17:00:03",
"DriverOpts": null
}
}
},
"Mounts": []
},
{
"Id": "a83df3c8a52eba40c57ba3cf6b9d56a842633f562f5b45ab87dd650fb2a27744",
"Names": [
"/blah_blah2_1"
],
"Image": "blah_blah2",
"ImageID": "sha256:9b48aedb1fa786e0dceadfbf2dd5ac9c8f5119eb4afeab8de7f16b6d4178f4f8",
"Command": "sleep 1200",
"Created": 1534910550,
"Ports": [],
"Labels": {
"com.buildkite.sockguard.owner": "sockguard-pid-1",
"com.docker.compose.config-hash": "9864342fe990b836ed15b33259e44c8ad1d82fd968497d022ebe8e02dfe26f41",
"com.docker.compose.container-number": "1",
"com.docker.compose.oneoff": "False",
"com.docker.compose.project": "blah",
"com.docker.compose.service": "blah2",
"com.docker.compose.version": "1.22.0"
},
"State": "running",
"Status": "Up 2 minutes",
"HostConfig": {
"NetworkMode": "blah_default"
},
"NetworkSettings": {
"Networks": {
"blah_default": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "e1ce451ca042fc00e700308fb2ff503da9efc3cbae0c6a6617fa4801efd7b2e5",
"EndpointID": "db3bc70d9c4fac21787d9be14611e01f099437a2618cced3e6a01008bdbb8728",
"Gateway": "172.23.0.1",
"IPAddress": "172.23.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:17:00:02",
"DriverOpts": null
}
}
},
"Mounts": []
}
]
The same API call hitting the unguarded socket:
[nathan@ns-desktop-ub cgroup_parent (master)]$ curl -s --unix-socket /var/run/docker.sock "http:/v1.22/containers/json?limit=-1&all=0&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Dblah%22%2C+%22com.docker.compose.oneoff%3DTrue%22%5D%7D"
[]
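The difference between the two responses is consistent with the log line "Adding label com.buildkite.sockguard.owner=sockguard-pid-1 to label filters []": by the time the owner label was appended, the client's own label filters appear to have been lost. The Docker API accepts the label filter both as a list and as a map of "key=value" to true, so a merge step has to handle both. The following is a rough Python sketch of the intended behaviour only. It is not sockguard's code (sockguard is written in Go), and add_owner_filter is a hypothetical name.

```python
import json

OWNER_LABEL = "com.buildkite.sockguard.owner=sockguard-pid-1"


def add_owner_filter(raw_filters, owner_label=OWNER_LABEL):
    """Append the owner label to a filters JSON string, keeping existing filters.

    Accepts both shapes of the label filter:
      list form: {"label": ["a=b", "c=d"]}
      map form:  {"label": {"a=b": true}}
    """
    filters = json.loads(raw_filters) if raw_filters else {}
    labels = filters.get("label", [])
    if isinstance(labels, dict):
        # Map form: keep the keys whose value is truthy.
        labels = [key for key, enabled in labels.items() if enabled]
    else:
        labels = list(labels)
    if owner_label not in labels:
        labels.append(owner_label)
    filters["label"] = labels
    return json.dumps(filters)


merged = add_owner_filter(
    '{"label": ["com.docker.compose.project=blah", '
    '"com.docker.compose.oneoff=True"]}'
)
print(json.loads(merged)["label"])
```

The point of the sketch is that the oneoff=True filter survives the rewrite, so the guarded socket would return the same empty list as the unguarded one for this request.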
Issue lies somewhere in this block: https://github.com/buildkite/sockguard/blob/master/director.go#L334-L341
I might add test coverage around that function to help diagnose the exact issue :)
Yeah, I've been trying to decide how best to add tests for this stuff. It definitely needs it.
It's actually in here: https://github.com/buildkite/sockguard/blob/master/director.go#L450
Trying to remember how it all works now 😅
Verified locally (after merging, haha) that this all looks good with https://github.com/buildkite/sockguard/pull/27:
root@598b0ae3006a:/blah# docker-compose ps
Name Command State Ports
----------------------------------------
blah_blah2_1 sleep 300 Up
blah_blah_1 sleep 300 Up
Hey @lox have you ever experienced seeing 2 of each container in any "list container" / "ps" operations? Have seen it a few times, but haven't debugged deep yet to see if it's a bug or something else weird.