kurtosis-tech / kurtosis

A platform for packaging and launching ephemeral backend stacks with a focus on approachability for the average developer.
https://docs.kurtosistech.com/
Apache License 2.0

Hanging docker network can't be deleted by kurtosis #1570

Closed galenmarchetti closed 8 months ago

galenmarchetti commented 9 months ago

What's your CLI version?

0.84.7

Description & steps to reproduce

I was running a bunch of Kurtosis enclaves and switching between Docker and Kubernetes, and I ended up with an empty enclave that Kurtosis can't delete.

kurtosis enclave rm -f full-geyser gives me:

➜  ~ kurtosis enclave rm full-geyser -f
INFO[2023-10-16T15:15:29-06:00] Destroying enclaves...
Error:  An error occurred running command 'rm'
  Caused by: An error occurred calling the run function for command 'rm'
  Caused by: One or more errors occurred destroying the enclaves:
  >>>>>>>>>>>>>>>>> full-geyser <<<<<<<<<<<<<<<<<
  An error occurred destroying enclave 'full-geyser'
  Caused by: An error occurred destroying enclave with identifier 'full-geyser'
  Caused by: rpc error: code = Unknown desc = An error occurred destroying enclave with identifier 'full-geyser':
  Caused by: An error occurred removing enclave network with ID '0bd35fbf2c1bca64456c6f2fcd536827f7bd6f7cf27bdb871d2f88714a39d479'
  Caused by: An error occurred removing the Docker network with ID 0bd35fbf2c1bca64456c6f2fcd536827f7bd6f7cf27bdb871d2f88714a39d479
  Caused by: Error response from daemon: error while removing network: network kt-full-geyser id 0bd35fbf2c1bca64456c6f2fcd536827f7bd6f7cf27bdb871d2f88714a39d479 has active endpoints
➜  ~

docker network inspect for the docker network looks like this:

[
    {
        "Name": "kt-full-geyser",
        "Id": "0bd35fbf2c1bca64456c6f2fcd536827f7bd6f7cf27bdb871d2f88714a39d479",
        "Created": "2023-10-16T20:36:43.25540393Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.16.4.0/22",
                    "Gateway": "172.16.4.4"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "com.docker.network.driver.mtu": "1440"
        },
        "Labels": {
            "com.kurtosistech.app-id": "kurtosis",
            "com.kurtosistech.enclave-creation-time": "2023-10-16T20:36:43Z",
            "com.kurtosistech.enclave-id": "d09200057a994517b8ceaf47f24fd0ef",
            "com.kurtosistech.enclave-name": "full-geyser",
            "com.kurtosistech.guid": "d09200057a994517b8ceaf47f24fd0ef",
            "com.kurtosistech.id": "d09200057a994517b8ceaf47f24fd0ef",
            "enclave_uuid": "d09200057a994517b8ceaf47f24fd0ef",
            "service_name": "d09200057a994517b8ceaf47f24fd0ef",
            "service_short_uuid": "d09200057a99",
            "service_uuid": "d09200057a994517b8ceaf47f24fd0ef"
        }
    }
]

Desired behavior

I should always be able to rm an enclave.

What is the severity of this bug?

Papercut; this bug is frustrating, but I have a workaround.

What area of the product does this pertain to?

CLI: the Command Line Interface

mieubrisse commented 9 months ago

This is very strange to me at first glance; I have no idea why this would occur. Normally when Docker rejects network deletion, it's because there are containers still on the network, and the network inspect shows those containers (endpoints) still attached. I'm not seeing that here, though.
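To make that check concrete, here is a minimal sketch that reads `docker network inspect` output and lists whatever is recorded under `"Containers"` for each network. The JSON is a trimmed copy of the output posted above; the helper name `attached_endpoints` is made up for illustration and is not part of Kurtosis or Docker.

```python
import json

# Trimmed copy of the `docker network inspect` output posted above;
# only the fields this check reads are kept.
NETWORK_INSPECT_JSON = """
[
    {
        "Name": "kt-full-geyser",
        "Id": "0bd35fbf2c1bca64456c6f2fcd536827f7bd6f7cf27bdb871d2f88714a39d479",
        "Containers": {}
    }
]
"""


def attached_endpoints(inspect_output):
    """Map each network name to the container names listed under "Containers".

    Hypothetical helper for illustration only.
    """
    result = {}
    for network in json.loads(inspect_output):
        # "Containers" can be an empty object (or null), so fall back to {}.
        containers = network.get("Containers") or {}
        result[network["Name"]] = sorted(
            info.get("Name", container_id)
            for container_id, info in containers.items()
        )
    return result


endpoints = attached_endpoints(NETWORK_INSPECT_JSON)
for name, containers in endpoints.items():
    print(f"{name}: {len(containers)} attached container(s) {containers}")
```

For the output in this issue the list comes back empty, which is what makes the "has active endpoints" error surprising: the daemon refuses the removal while the inspect output reports nothing attached.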

mieubrisse commented 9 months ago

@galenmarchetti can you grab the following:

  1. docker container ls -a output
  2. docker container inspect output for the logs-aggregator container (it's my only guess for what might be holding the network)

galenmarchetti commented 9 months ago

➜  ~ docker container ls -a
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS          PORTS                                                      NAMES
1f4c9c69f3d0   kurtosistech/engine:0.84.7      "/bin/sh -c ./kurtos…"   22 minutes ago   Up 22 minutes   0.0.0.0:8081->8081/tcp, 0.0.0.0:9710-9711->9710-9711/tcp   kurtosis-engine--628356f6c7c74993962cbcafbc615665
70afeb4f55bc   timberio/vector:0.31.0-debian   "/bin/sh -c 'printf …"   22 minutes ago   Up 22 minutes                                                              kurtosis-logs-aggregator
dc674c27a0dc   moby/buildkit:buildx-stable-1   "buildkitd"              7 days ago       Up 7 days                                                                  buildx_buildkit_kurtosis-docker-builder0
➜  ~

galenmarchetti commented 9 months ago

➜  ~ docker container inspect 70afeb4f55bc
[
    {
        "Id": "70afeb4f55bc5eb83fc63a6d0b4ec856ecc8f23ff747c8c70138a37426232bb0",
        "Created": "2023-10-16T21:05:59.280441131Z",
        "Path": "/bin/sh",
        "Args": [
            "-c",
            "printf '\n[sources.\"fluent_bit\"]\ntype = \"fluent\"\naddress = \"0.0.0.0:9714\"\n\n[sinks.uuid_file]\ntype = \"file\"\ninputs = [\"fluent_bit\"]\npath = \"/var/log/kurtosis/%%Y/%%V/{{ enclave_uuid }}/{{ service_uuid }}.json\"\t\nencoding.codec = \"json\"\nbuffer.when_full = \"block\"\n\n[sinks.name_file]\ntype = \"file\"\ninputs = [\"fluent_bit\"]\npath = \"/var/log/kurtosis/%%Y/%%V/{{ enclave_uuid }}/{{ service_name }}.json\"\t\nencoding.codec = \"json\"\nbuffer.when_full = \"block\"\n\n[sinks.short_uuid_file]\ntype = \"file\"\ninputs = [\"fluent_bit\"]\npath = \"/var/log/kurtosis/%%Y/%%V/{{ enclave_uuid }}/{{ service_short_uuid }}.json\"\t\nencoding.codec = \"json\"\nbuffer.when_full = \"block\"\n' \u003e /etc/vector/vector.toml \u0026\u0026 /usr/bin/vector -c=/etc/vector/vector.toml"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 82900,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2023-10-16T21:05:59.417488673Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:ec83552fc89eacc0af3a4d1576523a74f8b31b5931aa560b90423dcd3c6e1355",
        "ResolvConfPath": "/var/lib/docker/containers/70afeb4f55bc5eb83fc63a6d0b4ec856ecc8f23ff747c8c70138a37426232bb0/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/70afeb4f55bc5eb83fc63a6d0b4ec856ecc8f23ff747c8c70138a37426232bb0/hostname",
        "HostsPath": "/var/lib/docker/containers/70afeb4f55bc5eb83fc63a6d0b4ec856ecc8f23ff747c8c70138a37426232bb0/hosts",
        "LogPath": "/var/lib/docker/containers/70afeb4f55bc5eb83fc63a6d0b4ec856ecc8f23ff747c8c70138a37426232bb0/70afeb4f55bc5eb83fc63a6d0b4ec856ecc8f23ff747c8c70138a37426232bb0-json.log",
        "Name": "/kurtosis-logs-aggregator",
        "RestartCount": 0,
        "Driver": "overlay2",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "kurtosis-logs-storage:/var/log/kurtosis/"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "on-failure",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "ConsoleSize": [
                0,
                0
            ],
            "CapAdd": [],
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": null,
            "ExtraHosts": [],
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ],
            "Init": false
        },
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/e8e1f402cf72f1416240a72735a3105519c85f7e85680b82335985f69cbee8a1-init/diff:/var/lib/docker/overlay2/51359c54ca59290e61be869989ab047a4c417637c3b0ff60e34b950554f83e3b/diff:/var/lib/docker/overlay2/a2f72f522487e0d4f514ef2b6622188eb02d8c5261618cf226e47799620a2868/diff:/var/lib/docker/overlay2/648919e60bc001d2a3889e17eaddd21356d50bfe4867af9d42ac53ff0d9fb748/diff:/var/lib/docker/overlay2/a57f8490382ac780fff4bbb996798ea64ce7e7d2d99bc2937b241e238eeed2d9/diff:/var/lib/docker/overlay2/e98f187408dc6d691ff6fec9a713bf0b02d8fbeb22aa104adb131464fb1160a3/diff:/var/lib/docker/overlay2/44dc2a2fd0bd5d95676d7964bf8f19f12e024b3d9ff244cf878af875008b8f17/diff:/var/lib/docker/overlay2/621ed5350c95a4cd9f37af4646063b6c2351358560e6add5495a1be786c08469/diff",
                "MergedDir": "/var/lib/docker/overlay2/e8e1f402cf72f1416240a72735a3105519c85f7e85680b82335985f69cbee8a1/merged",
                "UpperDir": "/var/lib/docker/overlay2/e8e1f402cf72f1416240a72735a3105519c85f7e85680b82335985f69cbee8a1/diff",
                "WorkDir": "/var/lib/docker/overlay2/e8e1f402cf72f1416240a72735a3105519c85f7e85680b82335985f69cbee8a1/work"
            },
            "Name": "overlay2"
        },
        "Mounts": [
            {
                "Type": "volume",
                "Name": "kurtosis-logs-storage",
                "Source": "/var/lib/docker/volumes/kurtosis-logs-storage/_data",
                "Destination": "/var/log/kurtosis",
                "Driver": "local",
                "Mode": "z",
                "RW": true,
                "Propagation": ""
            }
        ],
        "Config": {
            "Hostname": "70afeb4f55bc",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": true,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "-c",
                "printf '\n[sources.\"fluent_bit\"]\ntype = \"fluent\"\naddress = \"0.0.0.0:9714\"\n\n[sinks.uuid_file]\ntype = \"file\"\ninputs = [\"fluent_bit\"]\npath = \"/var/log/kurtosis/%%Y/%%V/{{ enclave_uuid }}/{{ service_uuid }}.json\"\t\nencoding.codec = \"json\"\nbuffer.when_full = \"block\"\n\n[sinks.name_file]\ntype = \"file\"\ninputs = [\"fluent_bit\"]\npath = \"/var/log/kurtosis/%%Y/%%V/{{ enclave_uuid }}/{{ service_name }}.json\"\t\nencoding.codec = \"json\"\nbuffer.when_full = \"block\"\n\n[sinks.short_uuid_file]\ntype = \"file\"\ninputs = [\"fluent_bit\"]\npath = \"/var/log/kurtosis/%%Y/%%V/{{ enclave_uuid }}/{{ service_short_uuid }}.json\"\t\nencoding.codec = \"json\"\nbuffer.when_full = \"block\"\n' \u003e /etc/vector/vector.toml \u0026\u0026 /usr/bin/vector -c=/etc/vector/vector.toml"
            ],
            "Image": "timberio/vector:0.31.0-debian",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": [
                "/bin/sh"
            ],
            "OnBuild": null,
            "Labels": {
                "com.kurtosistech.app-id": "kurtosis",
                "com.kurtosistech.container-type": "kurtosis-logs-aggregator"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "8fcc65f596f1c6db663f5902a0d7d7dc92d8a03fc39be4643cfcd18b5242e216",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/8fcc65f596f1",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "05bc6d3d33495e883bee1f454506cec677564a93eeb0a91a6b5f9a40b6d4c2f8",
            "Gateway": "172.17.0.1",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "172.17.0.3",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
            "MacAddress": "02:42:ac:11:00:03",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "7593c931428d2550f0cabfd14f4d26089477a01cb648f9ff9d182887a7e472ce",
                    "EndpointID": "05bc6d3d33495e883bee1f454506cec677564a93eeb0a91a6b5f9a40b6d4c2f8",
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.3",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:03",
                    "DriverOpts": null
                }
            }
        }
    }
]
➜  ~
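
One way to read the inspect output above against the "logs-aggregator is holding the network" guess is to compare each `NetworkID` under `NetworkSettings.Networks` with the ID of the stuck network from the error message. The sketch below does that on a trimmed copy of the output; `containers_on_network` is a hypothetical helper name, not an existing Kurtosis or Docker function.

```python
import json

# Trimmed copy of the `docker container inspect 70afeb4f55bc` output above;
# only the fields this check reads are kept.
CONTAINER_INSPECT_JSON = """
[
    {
        "Name": "/kurtosis-logs-aggregator",
        "NetworkSettings": {
            "Networks": {
                "bridge": {
                    "NetworkID": "7593c931428d2550f0cabfd14f4d26089477a01cb648f9ff9d182887a7e472ce",
                    "EndpointID": "05bc6d3d33495e883bee1f454506cec677564a93eeb0a91a6b5f9a40b6d4c2f8"
                }
            }
        }
    }
]
"""

# ID of the kt-full-geyser network, taken from the `enclave rm` error message.
STUCK_NETWORK_ID = "0bd35fbf2c1bca64456c6f2fcd536827f7bd6f7cf27bdb871d2f88714a39d479"


def containers_on_network(inspect_output, network_id):
    """Return names of inspected containers with an endpoint on network_id.

    Hypothetical helper for illustration only.
    """
    names = []
    for container in json.loads(inspect_output):
        networks = (container.get("NetworkSettings") or {}).get("Networks") or {}
        if any(net.get("NetworkID") == network_id for net in networks.values()):
            # Docker reports container names with a leading slash.
            names.append(container["Name"].lstrip("/"))
    return names


holders = containers_on_network(CONTAINER_INSPECT_JSON, STUCK_NETWORK_ID)
print(holders)
```

On this output the result is empty: the logs aggregator is attached only to the default `bridge` network (ID starting `7593c931`), not to `kt-full-geyser`, so judging by this output it does not appear to be the container holding the network.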
mieubrisse commented 9 months ago

What's your Docker version, @galenmarchetti?

laurentluce commented 9 months ago

@mieubrisse Should we close this ticket based on this discussion?

leeederek commented 8 months ago

Closing as per the discussion above, and referencing this issue as well: https://github.com/moby/moby/issues/42119