cloudfoundry-community / jumpbox-boshrelease

A BOSH release for jumpboxen
MIT License

docker group disappears after a short time #75

Closed dhoffi closed 4 years ago

dhoffi commented 4 years ago

Hi,

(maybe related to #68, but the information there didn't help me solve the problem)

I installed docker on the jumpbox:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
getent group docker > /dev/null 2>&1 || sudo groupadd docker
sudo apt update
apt-cache policy docker-ce

sudo apt --yes install docker-ce
sudo usermod -a -G docker $USER

# put the docker data dir on the bosh ephemeral vm disk,
# as anything else is too small (<= 4GB)
# and docker images shouldn't be backed up; they can simply be pulled again
sudo systemctl stop docker
sudo mv /var/lib/docker /var/vcap/data/docker-data
sudo ln -s /var/vcap/data/docker-data /var/lib/docker
sudo systemctl start docker

Well, it works (for a short time).

Then, all of a sudden and reproducibly, the docker group gets lost:

$ cat /etc/group | grep docker
docker:x:1003:

$ groups
staff vcap bosh_sshers bosh_sudoers docker <---

After waiting 2-5 minutes doing nothing, really nothing:

$ cat /etc/group | grep docker
<nothing>

$ groups
staff vcap bosh_sshers bosh_sudoers

I also removed the message-bus line as mentioned in #63 and rebooted, but that didn't help.

$ cat /var/lib/dpkg/statoverride
root crontab 2755 /usr/bin/crontab

any ideas?

jhunt commented 4 years ago

That would be these lines (https://github.com/cloudfoundry-community/jumpbox-boshrelease/blob/master/jobs/jumpbox/templates/bin/watcher#L140-L142) overwriting your user/group databases (/etc/passwd, /etc/group, /etc/shadow, and /etc/gshadow) with cached copies from the first time watcher starts up.
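To see which entries such an overwrite would drop, you can diff the live group database against a copy saved earlier. This is only a minimal sketch: `groups_missing_from` is a hypothetical helper name, and the snapshot is a copy you take yourself, not watcher's actual cache (whose location is not shown here).

```shell
#!/bin/sh
# groups_missing_from CURRENT SNAPSHOT   (hypothetical helper, not part of the release)
# Print the names of groups that exist in CURRENT but not in SNAPSHOT,
# i.e. the entries that would vanish if SNAPSHOT were copied over CURRENT.
# Both files are expected in /etc/group format (name:password:gid:members).
groups_missing_from() {
    awk -F: -v snap="$2" '
        BEGIN {
            # Load every group name from the snapshot file.
            while ((getline line < snap) > 0) {
                split(line, f, ":")
                seen[f[1]] = 1
            }
        }
        # Report names from CURRENT that the snapshot does not know about.
        !($1 in seen) { print $1 }
    ' "$1"
}
```

For example, after `cp /etc/group /tmp/group.before` (an assumed path) and then installing docker, `groups_missing_from /etc/group /tmp/group.before` should list `docker` if the group was newly created.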

This was implemented on the assumption that people would rebuild their Jumpboxen with new software (provided by BOSH) rather than use system packaging tools to augment them (a brittle solution for other reasons).

You are running square into that. One workaround would be to stop the watcher job, install docker while it is stopped, and then start watcher again:

$ sudo -i
# monit stop watcher
# watch monit summary

# apt --yes install docker-ce
# usermod -a -G docker "$SUDO_USER"   # inside sudo -i, $USER would be root

# monit start watcher
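To check that the workaround holds, you can poll the group database for longer than the 2-5 minute window reported above. This is a minimal sketch, not something the release provides: `group_survives` is a hypothetical helper, and the optional file argument is there only so it can be tried against a copy.

```shell
#!/bin/sh
# group_survives NAME CHECKS INTERVAL [FILE]   (hypothetical helper)
# Return 0 if group NAME is present in FILE (default /etc/group) on each of
# CHECKS consecutive polls, sleeping INTERVAL seconds after each poll.
# Return 1 as soon as the group is found to be missing.
group_survives() {
    name=$1
    checks=$2
    interval=$3
    file=${4:-/etc/group}
    i=0
    while [ "$i" -lt "$checks" ]; do
        # awk exits 0 only if some line has NAME as its first field.
        if ! awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file"; then
            echo "group $name missing after $i successful checks" >&2
            return 1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 0
}
```

Running `group_survives docker 20 30` after `monit start watcher` polls every 30 seconds for about ten minutes. A non-zero exit means the group was overwritten again.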

Although I think this is yet another case of BOSH releases being a bad way to provide jumpbox-y software. I've been exploring a better way, using ephemeral, user-supplied Docker images over in https://github.com/jhunt/containers-boshrelease - particularly the jumpbox job (https://github.com/jhunt/containers-boshrelease/blob/master/jobs/jumpbox/spec). I'd be interested in your thoughts on that approach.