mal / docker-for-mac-host-bridge

Host-accessible containers with Docker for Mac
125 stars 23 forks

interface tap1 does not exist #2

Closed digitalstaub closed 2 years ago

digitalstaub commented 7 years ago

Hello, after rebooting macOS (10.12.4) the error 'interface tap1 does not exist' appears. The kext is loaded and the current user has the required privileges on the device. Manually running "ifconfig tap1 up" results in the same error. I don't know how to regain a working tap1 device (short of resetting Docker, running install.sh, and rebuilding all images).

mal commented 7 years ago

First thing I'd suggest is trying a reinstall of tuntaposx:

  1. Uninstall tuntaposx

    sudo kextunload /Library/Extensions/tap.kext
    sudo kextunload /Library/Extensions/tun.kext
    
    sudo rm -rv /Library/Extensions/tap.kext \
                /Library/Extensions/tun.kext \
                /Library/StartupItems/tap \
                /Library/StartupItems/tun
  2. Reboot macOS
  3. Follow the install instructions in the README again

This is definitely more of a "turn it off and on again" approach than I'd like, but it should hopefully let you recover a working tap1 device without having to reset Docker or re-pull/rebuild your images.
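For convenience, the uninstall steps above can be sketched as a small dry-run script (paths assume the default tuntaposx install locations; set RUN=1 to actually execute, which requires sudo):

```shell
# Dry-run sketch of the uninstall steps; review before setting RUN=1.
maybe() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

maybe sudo kextunload /Library/Extensions/tap.kext
maybe sudo kextunload /Library/Extensions/tun.kext
maybe sudo rm -rv /Library/Extensions/tap.kext \
                  /Library/Extensions/tun.kext \
                  /Library/StartupItems/tap \
                  /Library/StartupItems/tun
# Then reboot and follow the install instructions in the README again.
```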

inancgumus commented 7 years ago

Hi @mal. Using the above recipe didn't work for me. I'm seeing the same error message as @digitalstaub :-(

Should I add the tap1 interface myself, and if so, how? What do you think?

mal commented 7 years ago

The way tap interfaces (in our case tap1) work is that they only "exist" and are available to ifconfig once an application (in our case Docker) opens the corresponding character device (in our case /dev/tap1), and they stop "existing" when the application closes their device.

If you can see /dev/tap1 in your filesystem and the permissions are correct (it's owned by the same user you run Docker with), then the next thing to check is that Docker is actually using it.

ps -ef | grep 'virtio-tap' # this looks for the modified CLI options set by the shim

If Docker is running and you see no results, it suggests either that the shim has not been installed or that Docker was not restarted when the script prompted for that action. If you do see results, we start to get into murky grey areas, so I'll leave it there and wait to see the outcome of that check.
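The permission side can be checked with a small helper along these lines (check_tap is a hypothetical function, not part of this project):

```shell
# Verify that a tap device node exists, is a character device,
# and is owned by the user currently running Docker.
check_tap() {
  dev="$1"
  [ -c "$dev" ] || { echo "$dev: not a character device"; return 1; }
  owner=$(ls -l "$dev" | awk '{print $3}')   # third column of ls -l is the owner
  [ "$owner" = "$(id -un)" ] || { echo "$dev: owned by $owner, not $(id -un)"; return 1; }
  echo "$dev: looks OK"
}

check_tap /dev/tap1 || echo "(tap1 not usable; see the reinstall steps earlier in the thread)"
```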

inancgumus commented 7 years ago

@mal Thx mal.

It seems like the shim modified it (though I don't know how to roll it back yet):

$ ps -ef | grep --color 'virtio-tap'
  501  8768  8763   0  8:42PM ??         0:06.04 /Applications/Docker.app/Contents/MacOS/com.docker.hyperkit.real -A -m 4096M -c 4 -u -s 0:0,hostbridge -s 31,lpc -s 2:0,virtio-vpnkit,uuid=251e7dab-a8d1-4a8d-8c08-c47a39c88491,path=/Users/inanc/Library/Containers/com.docker.docker/Data/s50,macfile=/Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/mac.0 -s 2:1,virtio-tap,tap1 -s 3,virtio-blk,file:///Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2?sync=drive&buffered=1,format=qcow,qcow-config=discard=false;compact_after_unmaps=0 -s 4,virtio-9p,path=/Users/inanc/Library/Containers/com.docker.docker/Data/s40,tag=db -s 5,virtio-rnd -s 6,virtio-9p,path=/Users/inanc/Library/Containers/com.docker.docker/Data/s51,tag=port -s 7,virtio-sock,guest_cid=3,path=/Users/inanc/Library/Containers/com.docker.docker/Data,guest_forwards=2376;1525 -l com1,autopty=/Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty,log=/Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/console-ring -f kexec,/Applications/Docker.app/Contents/Resources/moby/vmlinuz64,/Applications/Docker.app/Contents/Resources/moby/initrd.img,earlyprintk=serial console=ttyS0 com.docker.driver="com.docker.driver.amd64-linux", com.docker.database="com.docker.driver.amd64-linux" ntp=gateway mobyplatform=mac vsyscall=emulate page_poison=1 panic=1 -F /Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/hypervisor.pid

And, taps are there too:

$ ls /dev/tap*
/dev/tap0  /dev/tap10 /dev/tap12 /dev/tap14 /dev/tap2  /dev/tap4  /dev/tap6  /dev/tap8
/dev/tap1  /dev/tap11 /dev/tap13 /dev/tap15 /dev/tap3  /dev/tap5  /dev/tap7  /dev/tap9

However, they belong to user root (and group wheel). That's not my username.

# this belongs to root
$ ps -u root | grep -i --color docker
    0    99 ??         0:00.01 /Library/PrivilegedHelperTools/com.docker.vmnetd

# this belongs to me
$ ps -u inanc | grep -i --color docker
  501  4896 ??         0:05.30 /Applications/Docker.app/Contents/MacOS/Docker
  501  4918 ??         0:00.08 /Applications/Docker.app/Contents/MacOS/com.docker.osx.hyperkit.linux -watchdog fd:0 -max-restarts 5 -restart-seconds 30
  501  4919 ??         0:00.06 /Applications/Docker.app/Contents/MacOS/com.docker.osx.hyperkit.linux -watchdog fd:0 -max-restarts 5 -restart-seconds 30
  501  4923 ??         0:01.36 com.docker.db --url fd://3 --git /Users/inanc/Library/Containers/com.docker.docker/Data/database
  501  4926 ??         0:00.06 com.docker.osx.hyperkit.linux
  501  4928 ??         0:00.01 /Applications/Docker.app/Contents/MacOS/com.docker.osx.hyperkit.linux
  501  8761 ??         0:00.04 com.docker.osxfs --address fd:3 --connect /Users/inanc/Library/Containers/com.docker.docker/Data/@connect --control fd:4 --volume-control fd:5 --database /Users/inanc/Library/Containers/com.docker.docker/Data/s40
  501  8762 ??         0:00.04 com.docker.slirp --db /Users/inanc/Library/Containers/com.docker.docker/Data/s40 --ethernet fd:3 --port fd:4 --introspection fd:5 --diagnostics fd:6 --vsock-path /Users/inanc/Library/Containers/com.docker.docker/Data/@connect
  501  8763 ??         0:00.10 com.docker.driver.amd64-linux -db /Users/inanc/Library/Containers/com.docker.docker/Data/s40 -osxfs-volume /Users/inanc/Library/Containers/com.docker.docker/Data/s30 -slirp /Users/inanc/Library/Containers/com.docker.docker/Data/s50 -vmnet /var/tmp/com.docker.vmnetd.socket -port /Users/inanc/Library/Containers/com.docker.docker/Data/s51 -vsock /Users/inanc/Library/Containers/com.docker.docker/Data -docker /Users/inanc/Library/Containers/com.docker.docker/Data/s60 -addr fd:3 -debug
  501  8764 ??         0:00.02 /Applications/Docker.app/Contents/MacOS/com.docker.driver.amd64-linux -db /Users/inanc/Library/Containers/com.docker.docker/Data/s40 -osxfs-volume /Users/inanc/Library/Containers/com.docker.docker/Data/s30 -slirp /Users/inanc/Library/Containers/com.docker.docker/Data/s50 -vmnet /var/tmp/com.docker.vmnetd.socket -port /Users/inanc/Library/Containers/com.docker.docker/Data/s51 -vsock /Users/inanc/Library/Containers/com.docker.docker/Data -docker /Users/inanc/Library/Containers/com.docker.docker/Data/s60 -addr fd:3 -debug
  501  8768 ??         0:08.88 /Applications/Docker.app/Contents/MacOS/com.docker.hyperkit.real -A -m 4096M -c 4 -u -s 0:0,hostbridge -s 31,lpc -s 2:0,virtio-vpnkit,uuid=251e7dab-a8d1-4a8d-8c08-c47a39c88491,path=/Users/inanc/Library/Containers/com.docker.docker/Data/s50,macfile=/Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/mac.0 -s 2:1,virtio-tap,tap1 -s 3,virtio-blk,file:///Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2?sync=drive&buffered=1,format=qcow,qcow-config=discard=false;compact_after_unmaps=0 -s 4,virtio-9p,path=/Users/inanc/Library/Containers/com.docker.docker/Data/s40,tag=db -s 5,virtio-rnd -s 6,virtio-9p,path=/Users/inanc/Library/Containers/com.docker.docker/Data/s51,tag=port -s 7,virtio-sock,guest_cid=3,path=/Users/inanc/Library/Containers/com.docker.docker/Data,guest_forwards=2376;1525 -l com1,autopty=/Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty,log=/Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/console-ring -f kexec,/Applications/Docker.app/Contents/Resources/moby/vmlinuz64,/Applications/Docker.app/Contents/Resources/moby/initrd.img,earlyprintk=serial console=ttyS0 com.docker.driver="com.docker.driver.amd64-linux", com.docker.database="com.docker.driver.amd64-linux" ntp=gateway mobyplatform=mac vsyscall=emulate page_poison=1 panic=1 -F /Users/inanc/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/hypervisor.pid
  501  8992 ttys004    0:00.00 grep -i --color docker

Other than these, the script never prompted me to restart Docker; I never reached that stage.

The output of the script is as follows:

# That's why they belong to root: I ran it with sudo. However,
# I also ran it without sudo and the output was the same
$ sudo ./install.sh tuntap_20150118.pkg
Install tuntap kernel extension
installer: Package name is TunTap Installer package
installer: Upgrading at base path /
installer: The upgrade was successful.
Ensure tap extension is loaded
Create host-accessible network
f42d60cec7f4b19c1c8d71c81d26c22834bcb94cc5a0de6cba77b8d3fbe3a334
Bridge tap into docker network
Assign the network gateway IP to the tap interface
ifconfig: interface tap1 does not exist

And, sometimes like this:

$ ./install.sh tuntap_20150118.pkg
Install tuntap kernel extension
installer: Package name is TunTap Installer package
installer: Upgrading at base path /
installer: The upgrade was successful.
Ensure tap extension is loaded
Permit non-root usage of tap1 device
Create host-accessible network
Error response from daemon: cannot create network 73cf805bd08ba0a4706a31bac93385f5edcb22e99630e233bc74179b35ec6efb (br-tap1): conflicts with network 8f74c89425275e6e274b0615a5a4a0f48f1e64805ce1d9c51d570fab13b21608 (br-tap1): networks have same bridge name

inancgumus commented 7 years ago

I stopped Docker, removed everything, restarted the computer, and tried again; it failed again :(

$ ./install.sh tuntap_20150118.pkg
Install tuntap kernel extension
Password:
installer: Package name is TunTap Installer package
installer: Upgrading at base path /
installer: The upgrade was successful.
Ensure tap extension is loaded
Permit non-root usage of tap1 device
Create host-accessible network
4e1ea4b02c155af6766f03ea39c92151483a85abed407a1b15a559ef6574fb32
Bridge tap into docker network
Assign the network gateway IP to the tap interface
ifconfig: interface tap1 does not exist

# However, taps are there as I posted before:
$ ls /dev/tap*
/dev/tap0  /dev/tap10 /dev/tap12 /dev/tap14 /dev/tap2  /dev/tap4  /dev/tap6  /dev/tap8
/dev/tap1  /dev/tap11 /dev/tap13 /dev/tap15 /dev/tap3  /dev/tap5  /dev/tap7  /dev/tap9

mal commented 7 years ago

It's interesting that install.sh tries to install the driver each time; it has a check that should skip that step if it's already done.

Could you do ls -l /dev/tap*?

inancgumus commented 7 years ago

Hmm, interesting: only tap1 belongs to me; the others belong to root.

$ ls -l /dev/tap*
crw-rw----  1 root   wheel   37,   0 May 10 21:23 /dev/tap0
crw-rw----  1 inanc  wheel   37,   1 May 10 21:23 /dev/tap1
crw-rw----  1 root   wheel   37,  10 May 10 21:23 /dev/tap10
crw-rw----  1 root   wheel   37,  11 May 10 21:23 /dev/tap11
crw-rw----  1 root   wheel   37,  12 May 10 21:23 /dev/tap12
crw-rw----  1 root   wheel   37,  13 May 10 21:23 /dev/tap13
crw-rw----  1 root   wheel   37,  14 May 10 21:23 /dev/tap14
crw-rw----  1 root   wheel   37,  15 May 10 21:23 /dev/tap15
crw-rw----  1 root   wheel   37,   2 May 10 21:23 /dev/tap2
crw-rw----  1 root   wheel   37,   3 May 10 21:23 /dev/tap3
crw-rw----  1 root   wheel   37,   4 May 10 21:23 /dev/tap4
crw-rw----  1 root   wheel   37,   5 May 10 21:23 /dev/tap5
crw-rw----  1 root   wheel   37,   6 May 10 21:23 /dev/tap6
crw-rw----  1 root   wheel   37,   7 May 10 21:23 /dev/tap7
crw-rw----  1 root   wheel   37,   8 May 10 21:23 /dev/tap8
crw-rw----  1 root   wheel   37,   9 May 10 21:23 /dev/tap9

mal commented 7 years ago

Looks like the ownership is being set correctly by install.sh (so that Docker can read and write to it). What is interesting, however, is that the major device number is 37; on my test system it's 38. More curious still is that despite /dev/tap1 being a character device (as denoted by the first character of the line being c), the check install.sh does (test -c /dev/tap1) appears to be returning false.
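To confirm, that check can be reproduced by hand (test -c /dev/tap1 is my assumption of what install.sh runs):

```shell
# Reproduce the install.sh character-device check manually.
if test -c /dev/tap1; then
  echo "/dev/tap1: character device present"
else
  echo "/dev/tap1: check failed"
fi
```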

If we proceed (optimistically) under the assumption that the tap devices are installed correctly and it's the test that's bad, then the next thing to try is:

  1. Remove this line from install.sh
  2. Run install.sh again (failure expected)
  3. Restart Docker via the tray
  4. Run install.sh again (hopefully success)
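Step 1 can be done non-interactively with sed. Shown here on a stand-in file rather than the real install.sh; the call name install_tuntap_driver is an assumption about the script's contents, and a backup is kept with a .bak suffix:

```shell
# Illustrate commenting out the driver-install call (run the same sed
# against the real install.sh in your checkout).
cat > /tmp/install-demo.sh <<'EOF'
install_tuntap_driver $1
create_network
EOF
sed -i.bak 's/^install_tuntap_driver/#&/' /tmp/install-demo.sh
cat /tmp/install-demo.sh
```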

It's also worth checking which version of Docker you're using, in case we're over-complicating this, but my suspicion at this point is that everything is more or less correct, and that the repeated attempts to install the tuntaposx driver when it's already installed cause Docker to become disconnected from it. The steps above are designed to mitigate that; if they succeed, a more general solution can be sought. Fingers crossed!

inancgumus commented 7 years ago

It looks weird to me that the device numbers are changing; I don't know why.

Result for the first two steps: Remove the line & Run again

# commented the 16th line: # install_tuntap_driver $1
$ ./install.sh tuntap_20150118.pkg
Bridge tap into docker network
Assign the network gateway IP to the tap interface
Password:
ifconfig: interface tap1 does not exist
# seen the expected failure :)

Result for the remaining steps:

# after Docker restart from tray
$ ./install.sh tuntap_20150118.pkg
Bridge tap into docker network
Assign the network gateway IP to the tap interface
Password:
# no more output
$

Docker version:

Client:
 Version:      17.03.1-ce
 API version:  1.27
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Tue Mar 28 00:40:02 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      17.03.1-ce
 API version:  1.27 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Fri Mar 24 00:00:50 2017
 OS/Arch:      linux/amd64
 Experimental: true

What do you think?

mal commented 7 years ago

The password prompt followed by no error message suggests to me that it may have completed successfully. The best way to test this would be to try to ping the IP of one of your containers that has been started using the linked-up network.

Update: You should also be able to see the tap1 interface in the output when running ifconfig.
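For example (the ip fallback is for anyone following along on Linux):

```shell
# Show the tap1 interface if it exists; fall back to a clear message.
ifconfig tap1 2>/dev/null || ip addr show tap1 2>/dev/null || echo "tap1 interface not found"
```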

inancgumus commented 7 years ago

tap1 is bound to 172.18.0.1, so I pinged it and it responded. However, as I described in more detail in this issue, I still can't reach the kafka service inside its container.

What could the problem be? Am I doing something wrong?

$ telnet 172.18.0.1 29092
Trying 172.18.0.1...
telnet: connect to address 172.18.0.1: Connection refused
telnet: Unable to connect to remote host

# However, I can reach it inside the kafka container:
$ docker exec -it kafkasinglenode_kafka_1 /bin/bash
root@moby:/# curl localhost:29092
curl: (56) Recv failure: Connection reset by peer
$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    options=1203<RXCSUM,TXCSUM,TXSTATUS,SW_TIMESTAMP>
    inet 127.0.0.1 netmask 0xff000000 
    inet6 ::1 prefixlen 128 
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
    nd6 options=201<PERFORMNUD,DAD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether b8:09:8a:d7:11:fd 
    inet 192.168.1.26 netmask 0xffffff00 broadcast 192.168.1.255
    media: autoselect
    status: active
en2: flags=963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX> mtu 1500
    options=60<TSO4,TSO6>
    ether 0a:00:00:72:67:d0 
    media: autoselect <full-duplex>
    status: inactive
en3: flags=963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX> mtu 1500
    options=60<TSO4,TSO6>
    ether 0a:00:00:72:67:d1 
    media: autoselect <full-duplex>
    status: inactive
p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304
    ether 0a:09:8a:d7:11:fd 
    media: autoselect
    status: inactive
awdl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1484
    ether 8a:ed:7f:09:54:a4 
    inet6 fe80::88ed:7fff:fe09:54a4%awdl0 prefixlen 64 scopeid 0x8 
    nd6 options=201<PERFORMNUD,DAD>
    media: autoselect
    status: active
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=10b<RXCSUM,TXCSUM,VLAN_HWTAGGING,AV>
    ether ac:87:a3:29:6b:9c 
    media: autoselect (none)
    status: inactive
bridge0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=63<RXCSUM,TXCSUM,TSO4,TSO6>
    ether 0a:00:00:72:67:d0 
    Configuration:
        id 0:0:0:0:0:0 priority 0 hellotime 0 fwddelay 0
        maxage 0 holdcnt 0 proto stp maxaddr 100 timeout 1200
        root id 0:0:0:0:0:0 priority 0 ifcost 0 port 0
        ipfilter disabled flags 0x2
    member: en2 flags=3<LEARNING,DISCOVER>
            ifmaxaddr 0 port 5 priority 0 path cost 0
    member: en3 flags=3<LEARNING,DISCOVER>
            ifmaxaddr 0 port 6 priority 0 path cost 0
    media: <unknown type>
    status: inactive
utun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 2000
    inet6 fe80::f478:159d:812b:f24b%utun0 prefixlen 64 scopeid 0xb 
    nd6 options=201<PERFORMNUD,DAD>
utun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1380
    inet6 fe80::a1f0:95ed:f9ba:f43e%utun1 prefixlen 64 scopeid 0xc 
    nd6 options=201<PERFORMNUD,DAD>
utun2: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1380
    inet6 fe80::d88b:6131:7d71:af6e%utun2 prefixlen 64 scopeid 0xd 
    inet6 fdfd:386e:4f72:1c73:d88b:6131:7d71:af6e prefixlen 64 
    nd6 options=201<PERFORMNUD,DAD>
tap1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether 6a:37:03:8f:34:20 
    inet 172.18.0.1 netmask 0xffff0000 broadcast 172.18.255.255
    media: autoselect
    status: active
    open (pid 13749)
en7: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
    ether ca:e3:50:b7:90:16 
    media: autoselect
    status: active
bridge100: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=3<RXCSUM,TXCSUM>
    ether ba:09:8a:7d:ec:64 
    inet 192.168.64.1 netmask 0xffffff00 broadcast 192.168.64.255
    Configuration:
        id 0:0:0:0:0:0 priority 0 hellotime 0 fwddelay 0
        maxage 0 holdcnt 0 proto stp maxaddr 100 timeout 1200
        root id 0:0:0:0:0:0 priority 0 ifcost 0 port 0
        ipfilter disabled flags 0x2
    member: en7 flags=3<LEARNING,DISCOVER>
            ifmaxaddr 0 port 14 priority 0 path cost 0
    nd6 options=201<PERFORMNUD,DAD>
    media: autoselect
    status: active

mal commented 7 years ago

So the 172.18.0.1 address you see in ifconfig is actually the host's IP on the Docker network. In order to connect to your kafka container you must find its IP and try connecting to that. You can find its IP on the network using:

docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' kafkasinglenode_kafka_1

inancgumus commented 7 years ago

Ah, OK. However, the weird part is that the command returns nothing:

$ docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' kafkasinglenode_kafka_1
# nothing...

# this is the full output
~/dev/loca...es/kafka-single-node (v3.2.1) inanc@inanc-imac
$ docker container inspect kafkasinglenode_kafka_1
[
    {
        "Id": "03dc537da655361de20e3da437b92d337834cca81234cb1290e69f3bbaecc054",
        "Created": "2017-05-10T19:29:56.387374076Z",
        "Path": "/etc/confluent/docker/run",
        "Args": [],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 2991,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2017-05-10T19:29:56.76121565Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:b8dcd0e0782a1b2b07dd9a50ae5b1b7d9f7be2e40f46eafc90c192f09d29bba1",
        "ResolvConfPath": "/var/lib/docker/containers/03dc537da655361de20e3da437b92d337834cca81234cb1290e69f3bbaecc054/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/03dc537da655361de20e3da437b92d337834cca81234cb1290e69f3bbaecc054/hostname",
        "HostsPath": "/var/lib/docker/containers/03dc537da655361de20e3da437b92d337834cca81234cb1290e69f3bbaecc054/hosts",
        "LogPath": "/var/lib/docker/containers/03dc537da655361de20e3da437b92d337834cca81234cb1290e69f3bbaecc054/03dc537da655361de20e3da437b92d337834cca81234cb1290e69f3bbaecc054-json.log",
        "Name": "/kafkasinglenode_kafka_1",
        "RestartCount": 0,
        "Driver": "overlay2",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "74ad828f4d0d2b178a3f34bc3bc2c20f10a8a9829277474fa1e5faa80053164b:/etc/kafka/secrets:rw",
                "ce76d52c58ae33485dac1065fc4c7c5a18ad9d8bab05923f43eae0c02c81ec6d:/var/lib/kafka/data:rw"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "host",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": [],
            "CapAdd": null,
            "CapDrop": null,
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": null,
            "ExtraHosts": [
                "moby:127.0.0.1"
            ],
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "overlay2",
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/f8c333c0bae0fe9de47cd0a725ee80f85a9fc472e766a6b336eeef089c93813a-init/diff:/var/lib/docker/overlay2/af5c97c492443498aae9cd5063db4e86d2fdbe94f183dc371be7fc90afdc791c/diff:/var/lib/docker/overlay2/f6015622de2a050c1b80d520521f6c9fc9b95296efc45c6582dac4ce14e6826e/diff:/var/lib/docker/overlay2/a92854dc0831428b94d18dc685bdfa3339d40b1c2961a7e3d4bb59c96609b2f7/diff:/var/lib/docker/overlay2/773443c22662a2b8ced42e626ed48035d08c8979a6bfc105e66e1aa478a1748e/diff:/var/lib/docker/overlay2/8de4875aa64ede306517210b8baa8b71a812a1970c628b8b7ed6be64207af2b5/diff:/var/lib/docker/overlay2/3c04b4c45cf81a61010881724151f39be73c963293f3a3ca4d52d12b371d4c7d/diff:/var/lib/docker/overlay2/172504d2c06a6cde4005ff4ad41ec557be0631a72be23f5d117d85d26399b55d/diff",
                "MergedDir": "/var/lib/docker/overlay2/f8c333c0bae0fe9de47cd0a725ee80f85a9fc472e766a6b336eeef089c93813a/merged",
                "UpperDir": "/var/lib/docker/overlay2/f8c333c0bae0fe9de47cd0a725ee80f85a9fc472e766a6b336eeef089c93813a/diff",
                "WorkDir": "/var/lib/docker/overlay2/f8c333c0bae0fe9de47cd0a725ee80f85a9fc472e766a6b336eeef089c93813a/work"
            }
        },
        "Mounts": [
            {
                "Type": "volume",
                "Name": "74ad828f4d0d2b178a3f34bc3bc2c20f10a8a9829277474fa1e5faa80053164b",
                "Source": "/var/lib/docker/volumes/74ad828f4d0d2b178a3f34bc3bc2c20f10a8a9829277474fa1e5faa80053164b/_data",
                "Destination": "/etc/kafka/secrets",
                "Driver": "local",
                "Mode": "rw",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "volume",
                "Name": "ce76d52c58ae33485dac1065fc4c7c5a18ad9d8bab05923f43eae0c02c81ec6d",
                "Source": "/var/lib/docker/volumes/ce76d52c58ae33485dac1065fc4c7c5a18ad9d8bab05923f43eae0c02c81ec6d/_data",
                "Destination": "/var/lib/kafka/data",
                "Driver": "local",
                "Mode": "rw",
                "RW": true,
                "Propagation": ""
            }
        ],
        "Config": {
            "Hostname": "moby",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "9092/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:29092",
                "KAFKA_ZOOKEEPER_CONNECT=localhost:32181",
                "KAFKA_BROKER_ID=1",
                "affinity:container==7db9156e5246c6605b4032cbea36e5560f28f19d63a45a01e9203397c29adb7d",
                "no_proxy=*.local, 169.254/16",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "APT_ALLOW_UNAUTHENTICATED=false",
                "PYTHON_VERSION=2.7.9-1",
                "PYTHON_PIP_VERSION=8.1.2",
                "SCALA_VERSION=2.11",
                "CONFLUENT_MAJOR_VERSION=3",
                "CONFLUENT_MINOR_VERSION=2",
                "CONFLUENT_PATCH_VERSION=1",
                "CONFLUENT_VERSION=3.2.1",
                "CONFLUENT_DEB_VERSION=1",
                "KAFKA_VERSION=0.10.2.1",
                "ZULU_OPENJDK_VERSION=8=8.17.0.3",
                "LANG=C.UTF-8",
                "COMPONENT=kafka"
            ],
            "Cmd": [
                "/etc/confluent/docker/run"
            ],
            "ArgsEscaped": true,
            "Image": "confluentinc/cp-kafka:latest",
            "Volumes": {
                "/etc/kafka/secrets": {},
                "/var/lib/kafka/data": {}
            },
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "com.docker.compose.config-hash": "ba6f9f7f8828a237cbe32119068e85e2684f191cb5be3a34d4a69a1dc011aeb7",
                "com.docker.compose.container-number": "1",
                "com.docker.compose.oneoff": "False",
                "com.docker.compose.project": "kafkasinglenode",
                "com.docker.compose.service": "kafka",
                "com.docker.compose.version": "1.11.2",
                "io.confluent.docker": "true",
                "io.confluent.docker.build.number": "5",
                "io.confluent.docker.git.id": "7316a75"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "8de5acf08628aa3f3d9981ebd00d0797991514d6e59bbfbbb09db1b14107fee9",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/default",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "host": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "08a81de94b22916b8250be8dcb6a5045793a0843d13a171d144e674d16cc1dcc",
                    "EndpointID": "13701cee041b0e29ef18b52586104fba88dd5777b0de67a47980e938aa3d1aec",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": ""
                }
            }
        }
    }
]

mal commented 7 years ago

Ah, looking at your compose file (from the other issue), you're starting the kafka container with host networking rather than the bridge network that this project creates. To use the network set up and linked by this project, you need to add something similar to the following to your compose file:

networks:
  default:
    external:
      name: tap # tap being the default name given to the network created with this project

I'm afraid my exposure to docker-compose is almost non-existent, so this is based purely on the documentation available here (at the very bottom).

Update: You would also need to remove network_mode: host
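Putting the two changes together, a minimal sketch of the relevant pieces (the kafka service line is illustrative; what matters is the networks stanza and the absence of network_mode: host):

```yaml
# Hypothetical minimal excerpt of docker-compose.yml.
version: "2"
services:
  kafka:
    image: confluentinc/cp-kafka:latest
networks:
  default:
    external:
      name: tap   # the network created by this project's install.sh
```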

inancgumus commented 7 years ago

Yeah, they use host networking so that the containers can talk to each other over the host machine's network; host makes the host machine's network fully available to the containers. This all started with that issue; then, after a lot of digging through forums, GitHub, etc., I found your project. I'm trying to reach, from OS X, the ports of containers started with Docker's network_mode set to host.

I added the networks config to docker-compose.yml and ran docker-compose up again to start the containers. However, I still can't connect :(

$ docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' kafkasinglenode_kafka_1
# no output

$ telnet 172.18.0.1 9092
Trying 172.18.0.1...
telnet: connect to address 172.18.0.1: Connection refused
telnet: Unable to connect to remote host

$ telnet 172.18.0.1 29092
Trying 172.18.0.1...
telnet: connect to address 172.18.0.1: Connection refused
telnet: Unable to connect to remote host

I also tried tap1 as the network name in docker-compose.yml. It didn't work either.


In bridge mode the IP appears:

$ docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' kafkasinglenode_kafka_1
172.18.0.3

inancgumus commented 7 years ago

Btw, when I ssh into tap1's IP, I see my own computer, because 172.18.0.1 is assigned as my host machine's IP on that network.

$ ifconfig | grep -i tap1 -n2
61- inet6 fdfd:386e:4f72:1c73:d88b:6131:7d71:af6e prefixlen 64
62- nd6 options=201<PERFORMNUD,DAD>
63:tap1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
64- ether 6a:37:03:8f:34:20
65- inet 172.18.0.1 netmask 0xffff0000 broadcast 172.18.255.255

$ ssh 172.18.0.1
# my own computer bash opens up
mal commented 7 years ago

That's expected; as I said 172.18.0.1 is the host machine's IP on the Docker network.

The following compose will get you where you want to go: https://gist.github.com/mal/860ed631324a77a85cd20eb8a316a7b0

Using that, I'm able to find the IP of the kafka container and then open a telnet session to it:

$ docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' maltmp_kafka_1
172.18.0.5
$ telnet 172.18.0.5 29092                                            
Trying 172.18.0.5...
Connected to 172.18.0.5.
Escape character is '^]'.

Host networking and normal bridge networking not working on OSX is the reason this project exists. Right now it's not possible to use host networking and have it behave as it would on Linux.

inancgumus commented 7 years ago

Thx @mal. The new docker-compose.yml really works, without setting network_mode to host or the extra_hosts config. That's cool.

$ docker container inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' kafkasinglenode_kafka_1
172.18.0.3
$ telnet 172.18.0.3 29092
Trying 172.18.0.3...
# it hangs here indefinitely
# timeouts afterwards

What could cause this? Maybe the tap network is not configured properly?

mal commented 7 years ago

I experienced that too initially, but then it started working after about 5-7 minutes. I'm not at all familiar with kafka, so I'm unsure; perhaps it just has a long start-up time?

inancgumus commented 7 years ago

Yeah? Hmm, interesting. Actually, before, when I logged in to the container, it responded immediately from the container's bash. Now it doesn't respond even from its own bash; it hangs there too.

$ docker exec -ti kafkasinglenode_kafka_1 /bin/bash

root@08d61f5644bb:/# cat < /dev/tcp/127.0.0.1/29092
# hangs

root@08d61f5644bb:/# cat < /dev/tcp/kafka/29092
# hangs too

# responds to ping
root@08d61f5644bb:/# ping kafka
PING kafka (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: icmp_seq=0 ttl=64 time=0.027 ms
64 bytes from 172.18.0.3: icmp_seq=1 ttl=64 time=0.049 ms

Could it be because it binds to tcp6?

root@08d61f5644bb:/# netstat -tl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.11:46757        *:*                     LISTEN
tcp6       0      0 [::]:36187              [::]:*                  LISTEN
tcp6       0      0 [::]:29092              [::]:*                  LISTEN
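(As far as I know, on Linux a tcp6 listener on [::] normally accepts IPv4 connections too, unless dual-stack is disabled; a quick check inside the container:)

```shell
# 0 here means tcp6 sockets bound to [::] also accept IPv4 (dual-stack);
# the file may be absent if the kernel lacks IPv6 support.
v=$(cat /proc/sys/net/ipv6/bindv6only 2>/dev/null || echo unknown)
echo "bindv6only=$v"
```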

inancgumus commented 7 years ago

It worked with netcat from the container's bash; kafka responded (I saw it in its log):

$ nc -v kafka 29092
DNS fwd/rev mismatch: kafka != 08d61f5644bb
kafka [172.18.0.3] 29092 (?) open

hey
$

However, still no luck on my development machine:

$ nc -v 172.18.0.3 29092
hey
nc: connectx to 172.18.0.3 port 29092 (tcp) failed: Operation timed out

I don't know, but I'm suspicious of some routing issue.

mal commented 7 years ago

Very odd; the zookeeper container responds to pings from the host almost immediately after it is started, but the kafka container doesn't start responding until it's been running for around 5 minutes (sometimes a bit sooner). You're right that execing into the container allows for a much quicker connection. I'm unsure what the issue might be, and unfortunately it's time for me to call it a night. Best of luck!

inancgumus commented 7 years ago

I tried zookeeper from my machine too; it also hangs :(

$ nc -v 172.18.0.2 32181
# no response

Thanks for all your help. G'night. I believe I'm 99% of the way there. If I can solve this, I think it will be a great Docker-on-OS-X use case.

murtyjones commented 6 years ago

@inancgumus were you ever able to resolve this issue? I'm seeing it as well.

mal commented 6 years ago

It may be worth starting a fresh issue for this, as the existing discussion diverged pretty heavily from the initial problem report. Would you be willing to open a new issue with more information on what steps you've taken and where it fell down, please?