devcontainers / features

A collection of Dev Container Features managed by Dev Container spec maintainers. See https://github.com/devcontainers/feature-starter to publish your own
https://containers.dev/features
MIT License

docker-in-docker fails locally on linux (works in codespaces) #285

Closed Chuxel closed 1 year ago

Chuxel commented 1 year ago

Migrated from https://github.com/microsoft/vscode-dev-containers/issues/1687

From @dsyer


I have a fairly vanilla Ubuntu host which runs docker happily. When I start a devcontainer locally with docker-in-docker it fails. JSON:

{
    "name": "hello",
    "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
    "remoteUser": "vscode",
    "features": {
        "ghcr.io/devcontainers/features/docker-in-docker:1": {}
    }
}

Error:

$ docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

The same JSON works in codespaces:

$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

Chuxel commented 1 year ago

@dsyer Are you by chance running rootless? If so, I believe the needed --privileged flag will not work.

Otherwise, what are the contents of /tmp/dockerd.log?
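
A minimal sketch of how the rootless question could be checked on the host, assuming the Docker CLI is installed there. The helper name `is_rootless` is hypothetical; it only inspects the text that `docker info --format '{{.SecurityOptions}}'` prints, which is expected to mention `rootless` when the daemon runs in rootless mode.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a SecurityOptions string (as printed by
# `docker info --format '{{.SecurityOptions}}'`) mentions rootless mode.
is_rootless() {
    case "$1" in
        *rootless*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

# On the host: query the daemon only if the docker CLI is available.
# If the daemon cannot be reached, the string is empty and "no" is printed.
if command -v docker >/dev/null 2>&1; then
    opts=$(docker info --format '{{.SecurityOptions}}' 2>/dev/null || true)
    echo "rootless: $(is_rootless "$opts")"
fi
```

A result of `rootless: no` does not rule out other causes; it only answers the rootless question.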

dsyer commented 1 year ago

What is "rootless"? I tried with and without --privileged and with and without "overrideCommand": false and neither seemed to help. It's a really simple JSON:

{
    "name": "test",
    "image": "mcr.microsoft.com/vscode/devcontainers/base:focal",
    "runArgs": ["--init", "--privileged"],
    "overrideCommand": false,
    "extensions": [
        "mhutchie.git-graph"
    ],
    "features": {
        "ghcr.io/devcontainers/features/docker-in-docker:1": {}
    }
}

> what are the contents of /tmp/dockerd.log?

A load of timeouts, and complaints that containerd is not running.

$ cat /tmp/dockerd.log 
time="2022-11-14T17:32:30.856643944Z" level=info msg="Starting up"
time="2022-11-14T17:32:30.857641265Z" level=info msg="libcontainerd: started new containerd process" pid=27
time="2022-11-14T17:32:30.857665822Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2022-11-14T17:32:30.857671524Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2022-11-14T17:32:30.857684476Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
time="2022-11-14T17:32:30.857691635Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2022-11-14T17:32:30Z" level=warning msg="containerd config version `1` has been deprecated and will be removed in containerd v2.0, please switch to version `2`, see https://github.com/containerd/containerd/blob/main/docs/PLUGINS.md#version-header"
time="2022-11-14T17:32:30.871946436Z" level=info msg="starting containerd" revision=b84d0b151c2395a5917996d602b192ce1e0fa461 version=1.5.14+azure-1
time="2022-11-14T17:32:30.895776327Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
time="2022-11-14T17:32:30.896128132Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896301691Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896502215Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896522056Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896547950Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
time="2022-11-14T17:32:30.896560302Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896590758Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896687713Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896853854Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
time="2022-11-14T17:32:30.896871952Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
time="2022-11-14T17:32:30.896890707Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
time="2022-11-14T17:32:30.896901849Z" level=info msg="metadata content store policy set" policy=shared
time="2022-11-14T17:32:31.858908292Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
time="2022-11-14T17:32:34.393595405Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
time="2022-11-14T17:32:38.294180651Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
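
For a long log like the one above, a quick way to confirm the symptom is to count the lines where dockerd timed out dialing the containerd socket. This is only a sketch: the helper name `count_containerd_timeouts` is hypothetical, and the search string is taken from the warnings in this log, so other Docker versions may word it differently.

```shell
#!/bin/sh
# Hypothetical helper: counts log lines in which dockerd timed out while
# dialing the containerd socket. Prints 0 if the file cannot be read.
count_containerd_timeouts() {
    if [ ! -r "$1" ]; then
        echo 0
        return 0
    fi
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so the status is ignored here.
    grep -c 'containerd.sock: timeout' "$1" || true
}

# Inside the dev container, the log path mentioned in this thread:
count_containerd_timeouts /tmp/dockerd.log
```

A non-zero count matches the failure described here: dockerd started, but never managed to reach containerd.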

Chuxel commented 1 year ago

How was Docker installed? Is this Docker Desktop for Linux, or the official Docker Engine install via their apt repo? Something else?

Re: rootless, this is what I am referring to https://docs.docker.com/engine/security/rootless

Haven't been able to repro, but it could be something about how Docker was installed. Trying to narrow down possibilities.

dsyer commented 1 year ago

I remember rootless now. I did try it out once and it couldn’t support some of the things I do so I ditched it. I’m on a pretty vanilla Ubuntu (20.04) with docker installed from the Ubuntu package manager. It usually works, and I have definitely used docker in devcontainers before this. So something may have changed, or this new “feature” doesn’t install docker the same way I did before.

Chuxel commented 1 year ago

🤔 The same script was used as a base. Hmmm. You can try using ghcr.io/devcontainers/features/docker-in-docker:1.0.4, which predates some recent updates to support different DNS options, in case those caused the problem.

edouard-lopez commented 1 year ago

I have the same issue after forking https://github.com/devcontainers/feature-starter/tree/c23b8feef8bcb54a8d783043b9269d34f5172b1e

Context

OS

❯ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.10
Release:        22.10
Codename:       kinetic

Docker

❯ docker version 
Client: Docker Engine - Community
 Version:           20.10.21
 API version:       1.41
 Go version:        go1.18.7
 Git commit:        baeda1f
 Built:             Tue Oct 25 18:02:14 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.21
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.7
  Git commit:       3056208
  Built:            Tue Oct 25 18:00:01 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.9
  GitCommit:        1c90a442489720eec95342e1789ee8a5e1b9536f
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

VSCode

Version: 1.73.1
Commit: 6261075646f055b99068d3688932416f2346dd3b
Date: 2022-11-09T03:54:53.913Z
Electron: 19.0.17
Chromium: 102.0.5005.167
Node.js: 16.14.2
V8: 10.2.154.15-electron.0
OS: Linux x64 5.19.0-23-generic
Sandboxed: No

dsyer commented 1 year ago

I changed the host and container to Ubuntu jammy (22.04) and it seems to work that way (when the timeout in https://github.com/devcontainers/cli/issues/281 doesn't happen, which makes it hard to test).

edouard-lopez commented 1 year ago

Thanks, @dsyer, for the update. For me, it happens on my desktop machine, so downgrading the OS is not really an option.

samruddhikhandale commented 1 year ago

Chatted with @edouard-lopez; the issue he is facing looks like a Remote Containers one to me. I created an issue with details: https://github.com/microsoft/vscode-remote-release/issues/7605

Chuxel commented 1 year ago

Yeah, as discussed, this could be caused by two dev containers attempting to use Docker-in-Docker, since https://github.com/devcontainers/features/issues/248 is not done yet.

If you want to see if this resolves the issue in the meantime, I patched and published an update elsewhere with the fix.

"ghcr.io/chuxel/feature-library/dind-patched:2": {}

dsyer commented 1 year ago

That works, thanks. So it probably had nothing to do with the jammy upgrade (jammy is still broken for me with the official docker-in-docker feature).

samruddhikhandale commented 1 year ago

https://github.com/devcontainers/features/issues/248 is closed as fixed with https://github.com/devcontainers/features/pull/314

An update to the docker-in-docker Feature has been released, which fixes the failures on a local machine. 🚀

ℹ️ Bumped the major version to 2 as it could be a breaking change for existing volumes

"ghcr.io/devcontainers/features/docker-in-docker:2": {},

Feel free to re-open if the bug resurfaces, thanks! 😄

kamal2311 commented 1 year ago

@samruddhikhandale can this please be fixed in docker-outside-of-docker also? This issue still exists there.

samruddhikhandale commented 1 year ago

@kamal2311 We don't volume mount in the case of docker-outside-of-docker, see https://github.com/devcontainers/features/blob/main/src/docker-outside-of-docker/devcontainer-feature.json#L47-L53. Hence, I don't expect the above to cause any issues there.

Docker-outside-of-docker is a different model, where the host’s docker socket is bind mounted into the container, so docker commands executed from inside the container actually execute on the host. I wonder if you are facing trouble with Docker on your host machine. Feel free to open a new issue if required.
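
Since that model depends on the host's socket being visible inside the container, one quick sanity check is whether a Unix socket actually exists at the expected path. The sketch below assumes the socket is mounted at /var/run/docker.sock (the path from the error message earlier in this thread; the Feature may be configured differently), and the helper name `describe_docker_socket` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints "socket" if the given path is a Unix socket,
# "missing" if nothing exists there, and "not-a-socket" otherwise.
describe_docker_socket() {
    if [ -S "$1" ]; then
        echo "socket"
    elif [ -e "$1" ]; then
        echo "not-a-socket"
    else
        echo "missing"
    fi
}

# Inside the container; the path is an assumption, adjust if yours differs.
describe_docker_socket /var/run/docker.sock
```

If this prints `socket` but `docker ps` still fails, the problem is more likely with the Docker daemon on the host than with the mount itself.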