weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

sigproxy exits, but is not removed. #2349

Closed hesco closed 8 years ago

hesco commented 8 years ago
root@dessalines011:~# docker ps -a | grep Exited | grep weave | wc -l 
528

This is the one which died most recently:

6ac56dbbb744        weaveworks/weaveexec:1.5.1         "/home/weave/sigproxy"   24 minutes ago      Exited (0) 24 minutes ago                          fervent_euler

They have accumulated over the 11 days since I last piped that output through | xargs docker rm.
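For reference, the cleanup described above could be scripted roughly as follows. This is a minimal sketch, not something posted in the thread: the function name `stale_sigproxy_ids` is hypothetical, and it assumes the default column layout of `docker ps -a` seen in the sample line above (container ID first, image second).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `docker ps -a` output on
# stdin and print the IDs of exited containers created from a
# weaveworks/weaveexec image.
# Assumes the default `docker ps` layout: ID in column 1, image in column 2.
stale_sigproxy_ids() {
  awk '$2 ~ /^weaveworks\/weaveexec/ && /Exited \(/ { print $1 }'
}

# Usage (requires a docker client; `xargs -r` is a GNU extension that skips
# the command when there is no input):
#   docker ps -a | stale_sigproxy_ids | xargs -r docker rm
```

Matching on the `Exited (` status keeps running containers started from the same image out of the removal list.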

rade commented 8 years ago

I think this is unavoidable. With docker run --rm ..., the removal is done by the docker client, so if the client dies or loses connectivity for whatever reason before the container has terminated, the container is left behind.
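To illustrate the point, here is a minimal sketch of what client-side removal amounts to when spelled out as separate steps. It is an approximation for illustration only, not the docker client's actual implementation, and the function name `run_then_remove` is hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of client-side `--rm`: create, run attached,
# then remove. If this process is killed or loses its connection to the
# daemon during step 2, step 3 never runs and the exited container remains.
run_then_remove() {
  id=$(docker create "$@") || return 1   # step 1: create the container
  docker start -a "$id"                  # step 2: start it and wait for exit
  status=$?                              # keep whatever status step 2 reported
  docker rm "$id" >/dev/null             # step 3: removal, issued by the client
  return "$status"
}

# Usage (requires a docker client):
#   run_then_remove weaveworks/weaveexec:1.5.1 --help
```

The window between steps 2 and 3 is where a stale container can be left around.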

rade commented 8 years ago

If my theory is correct then it would be worth investigating why the docker client is dying. Is this at all reproducible by hand, i.e. by running some weave command or some docker command via the proxy?

hesco commented 8 years ago

Reproduced after an upgrade to 1.5.2:

10cefe98f712        weaveworks/weaveexec:1.5.2   "/home/weave/sigproxy"   2 hours ago         Exited (0) 2 hours ago                             big_williams

As to how the network is initiated, this is the core of the code that does it:

function main() {
  NETWORK=$(get_network)
  PEERS=$(get_peers)
  WEAVE_IP=$(get_weave_ip)

  /bin/echo "remove weave interfaces and network"
  remove_vethwe_interfaces_from_docker_host
  remove_weave_network

  /bin/echo "$WEAVE reset"
  $WEAVE reset 

  ver=$(weave version)
  # Test grep's exit status directly; `[ $(...) -eq 0 ]` fails with an error when grep finds no match.
  if echo "$ver" | grep -q weave
  then
    /bin/echo "$WEAVE launch --ipalloc-range $WEAVE_SEGMENT --ipalloc-default-subnet $IPAM_SEGMENT $PEERS"
    $WEAVE launch --ipalloc-range $WEAVE_SEGMENT --ipalloc-default-subnet $IPAM_SEGMENT $PEERS
  else
    /bin/echo "$WEAVE launch $PEERS"
    $WEAVE launch $PEERS
  fi

  # http://sttts.github.io/docker/weave/mesos/2015/01/22/weave.html
  # weave create-bridge
  docker network create --driver=weavemesh --ipam-driver=weavemesh --subnet $WEAVE_SEGMENT $NETWORK_NAME 
  # echo "ip addr add dev weave $NETWORK "
  # ip addr add dev weave $NETWORK 

}

rade commented 8 years ago

Are you saying that if you run the above code in a loop then the problem arises?

From the above, it looks like you are not using the proxy, but are using the docker network plugin. Is that correct?

hesco commented 8 years ago

I do not run that code in a loop. That is the main() function of a procedural bash script which I use to reset the weave network. It tears the network down and then recreates it. The function runs once, then the script exits. It is my intention to use the plugin. I am not sure exactly what the proxy does for me, and I have no intention of using it. Still, weave status reports a proxy service running:

root@dessalines011:~# weave status 

        Version: 1.5.2 (up to date; next check at 2016/06/11 08:55:37)

        Service: router
       Protocol: weave 1..2
           Name: 3e:71:3f:a9:6c:4a(dessalines011)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 6
    Connections: 6 (3 established, 3 failed)
          Peers: 4 (with 12 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 10.0.0.0-10.0.15.255
  DefaultSubnet: 10.0.2.0/24

        Service: dns
         Domain: weave.local.
       Upstream: 68.64.241.98, 68.64.241.102
            TTL: 1
        Entries: 0

        Service: proxy
        Address: unix:///var/run/weave/weave.sock

        Service: plugin
     DriverName: weave

rade commented 8 years ago

When do you see the stale sigproxy containers appearing? After the execution of the main() function? Or some other action? Any suggestions for how I should attempt to reproduce this problem?

rade commented 8 years ago

In the absence of steps to reproduce this, I am closing the issue. Please re-open if and when such steps become available.