We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/168000336
The labels on this github issue will be updated when the story is started.
Maybe this happens if there are multiple bosh-lite deployed jobs running Garden? That would explain why it only happens in a cf deployment.
Hey @xoebus
Our deployment is indeed a bosh-lite one, but it only has a single diego-cell, i.e. there is a single garden job. For the sake of the experiment we tried a local bosh-lite deployment with two VMs, each of them running the garden job rootlessly in a BPM container. The deployment was successful and we verified that shared mounts work as expected. Therefore the number of Garden jobs seems to be irrelevant.
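For reference, one generic way to confirm the propagation mode of such a mount (assuming util-linux's `findmnt` is available inside the container; the path is the rep/garden shared directory described below):

```bash
# Generic util-linux check, not part of the deployment itself: the PROPAGATION
# column shows whether the mount point is shared or private.
findmnt -o TARGET,PROPAGATION /var/vcap/data/rep/shared/garden
```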
Trying to be helpful, we kept digging for clues on what the issue actually is by removing jobs from a diego-cell until the symptom changed. Once we got rid of `rep`, the issue was gone.
Some background context: on a CF diego-cell, `rep` and `garden` share a directory (`/var/vcap/data/rep/shared/garden`) where `rep` puts the cell certificates and `garden` bind mounts them into application containers. Prior to BPM shared volumes we used to declare `/var/vcap/data/rep/shared/garden` as an additional BPM volume, and BOTH `garden` and `rep` took care to create it and make it a shared bind mount (rep, garden). Both jobs do the same thing because a) their start order is not guaranteed and b) the directory must exist for both jobs to be able to start.
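For illustration, the "create it and make it a shared bind mount" step in those pre-start scripts boils down to something like the following sketch (the real scripts are the ones linked above; the exact script text here is not theirs):

```bash
#!/usr/bin/env bash
# Sketch of the pre-start logic that both rep and garden used to perform
# (illustrative, not the actual job templates).
set -eu

dir=/var/vcap/data/rep/shared/garden

mkdir -p "$dir"

# Bind-mount the directory onto itself so it becomes a mount point, then mark
# it shared so that mounts created underneath it propagate across namespaces.
if ! mountpoint -q "$dir"; then
  mount --bind "$dir" "$dir"
fi
mount --make-shared "$dir"
```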
After it became clear that introducing shared mount volumes in BPM causes the current issue, we thought that `rep` and `garden` should not be doing the mount themselves; instead they should delegate this to BPM (since `/var/vcap/data/rep/shared/garden` is a mounted shared volume). That is why we got rid of the shared mount logic and let `rep` just create the directory in its bpm-pre-start, letting BPM do the rest (make it a shared mount and mount it into the garden BPM container). Unfortunately this experiment was not successful; we still kept getting the same error when mounting the groot store.
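In that experiment, `rep`'s bpm-pre-start shrank to roughly the following (a sketch, under the assumption that BPM handles the bind mount and propagation itself):

```bash
#!/usr/bin/env bash
# Minimal bpm-pre-start for rep in the "delegate the mount to BPM" experiment:
# only create the directory; BPM is expected to make it a shared mount and map
# it into the garden BPM container via its volume declaration.
set -eu
mkdir -p /var/vcap/data/rep/shared/garden
```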
In the end we managed to come up with a minimal bosh release which mimics the `rep` and `garden` jobs interaction described above. Note that these dummy jobs have nothing to do with either garden or diego code; their names are just illustrative.
The release has two jobs:
- `rep`, which mkdirs `/var/vcap/data/rep/shared/garden`
- `garden`, which declares `/var/vcap/data/rep/shared/garden` as a mounted shared BPM volume

Deploying this release (via `scripts/deploy-lite-rootless.sh`) reproduces the issue. What we see is that:
1. The `garden` job starts first. Initially it fails, we believe because the shared mounted BPM volume directory does not exist yet (`rep` has not started yet). Here is the related error in bpm.log:
{"timestamp":"2019-09-04T12:41:24.275482451Z","level":"info","source":"bpm","message":"bpm.start.acquiring-lifecycle-lock.starting","data":{"job":"garden","process":"garden","session":"1.1"}}
{"timestamp":"2019-09-04T12:41:24.275625449Z","level":"info","source":"bpm","message":"bpm.start.acquiring-lifecycle-lock.complete","data":{"job":"garden","process":"garden","session":"1.1"}}
{"timestamp":"2019-09-04T12:41:24.275641448Z","level":"info","source":"bpm","message":"bpm.start.starting","data":{"job":"garden","process":"garden","session":"1"}}
{"timestamp":"2019-09-04T12:41:24.284547020Z","level":"info","source":"bpm","message":"bpm.start.start-process.starting","data":{"job":"garden","process":"garden","session":"1.2"}}
{"timestamp":"2019-09-04T12:41:24.284653783Z","level":"info","source":"bpm","message":"bpm.start.start-process.creating-job-prerequisites","data":{"job":"garden","process":"garden","session":"1.2"}}
{"timestamp":"2019-09-04T12:41:24.285122996Z","level":"info","source":"bpm","message":"bpm.start.start-process.complete","data":{"job":"garden","process":"garden","session":"1.2"}}
{"timestamp":"2019-09-04T12:41:24.285152091Z","level":"error","source":"bpm","message":"bpm.start.failed-to-start","data":{"error":"failed to create system files: no such file or directory","job":"garden","process":"garden","session":"1"}}
{"timestamp":"2019-09-04T12:41:24.285164780Z","level":"info","source":"bpm","message":"bpm.start.complete","data":{"job":"garden","process":"garden","session":"1"}}
2. After `garden` fails to start, `rep` starts and creates the directory.
3. Now the `garden` job is clear to start; here is the bpm log:
{"timestamp":"2019-09-04T12:41:55.831363962Z","level":"info","source":"bpm","message":"bpm.start.acquiring-lifecycle-lock.starting","data":{"job":"garden","process":"garden","session":"1.1"}} {"timestamp":"2019-09-04T12:41:55.831463893Z","level":"info","source":"bpm","message":"bpm.start.acquiring-lifecycle-lock.complete","data":{"job":"garden","process":"garden","session":"1.1"}} {"timestamp":"2019-09-04T12:41:55.831478942Z","level":"info","source":"bpm","message":"bpm.start.starting","data":{"job":"garden","process":"garden","session":"1"}} {"timestamp":"2019-09-04T12:41:55.837603349Z","level":"info","source":"bpm","message":"bpm.start.start-process.starting","data":{"job":"garden","process":"garden","session":"1.2"}} {"timestamp":"2019-09-04T12:41:55.837698012Z","level":"info","source":"bpm","message":"bpm.start.start-process.creating-job-prerequisites","data":{"job":"garden","process":"garden","session":"1.2"}} {"timestamp":"2019-09-04T12:41:55.838227829Z","level":"info","source":"bpm","message":"bpm.start.start-process.building-spec","data":{"job":"garden","process":"garden","session":"1.2"}} {"timestamp":"2019-09-04T12:41:55.838438853Z","level":"info","source":"bpm","message":"bpm.start.start-process.creating-bundle","data":{"job":"garden","process":"garden","session":"1.2"}} {"timestamp":"2019-09-04T12:41:55.842325219Z","level":"info","source":"bpm","message":"bpm.start.start-process.running-container","data":{"job":"garden","process":"garden","session":"1.2"}} {"timestamp":"2019-09-04T12:41:55.919018004Z","level":"info","source":"bpm","message":"bpm.start.start-process.complete","data":{"job":"garden","process":"garden","session":"1.2"}} {"timestamp":"2019-09-04T12:41:55.919089961Z","level":"info","source":"bpm","message":"bpm.start.complete","data":{"job":"garden","process":"garden","session":"1"}} {"timestamp":"2019-09-04T12:41:55.919104939Z","level":"info","source":"bpm","message":"bpm.start.releasing-lifecycle-lock.starting","data":{"job":"garden","process":"garden","session":"1.3"}} {"timestamp":"2019-09-04T12:41:55.919116759Z","level":"info","source":"bpm","message":"bpm.start.releasing-lifecycle-lock.complete","data":{"job":"garden","process":"garden","session":"1.3"}}
4. However, `garden`'s start script fails when executing the [`mount` command](https://github.com/masters-of-cats/dummy-release/blob/963f33c7e76a4fecb2aee8295ee3b9c5fc56dba0/jobs/garden/templates/bin/garden_start#L22), and the error is identical to the one that we see on the CF deployment:
```
dummy/5ca1c1d6-7868-4670-bdb1-8e21a4333634:~# cat /var/vcap/sys/log/garden/garden.stderr.log
[WARN tini (7)] Tini is not running as PID 1 and isn't registered as a child subreaper. Zombie processes will not be re-parented to Tini, so zombie reaping won't work. To fix the problem, use the -s option or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.
mount: /var/vcap/data/store/groot.store: failed to setup loop device: Operation not permitted
```
By changing the mount command in the start script to `strace mount ...` we can see that the error comes from a problem with `/dev/loop[X]`, which seems to be the root cause.
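Roughly, the strace-based check looks like this (the real mount arguments are the ones in the linked `garden_start` script; `<source>` and `<target>` below are placeholders):

```bash
# Wrap the failing mount in strace; -f follows child processes and the grep
# narrows the output to the loop-device syscalls that fail with EPERM.
# <source> and <target> stand in for the arguments used in garden_start.
strace -f mount -o loop <source> <target> 2>&1 | grep -E 'loop|EPERM'
```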
Thank you for digging into this!
I'm sorry, I don't have time to look into this further right now 😞. Would you be open to submitting a PR to fix this?
Unfortunately I have no clue what the problem is; I assume it needs quite some investigation. Alas, rootless garden is currently being deprioritised, so I am not sure whether we will be able to work on this soon :(
This issue was marked as `Stale` because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be closed. To prevent this from happening, remove the `Stale` label or comment below.
This issue was closed because it has been labeled `Stale` for 7 days without subsequent activity. Feel free to re-open this issue at any time by commenting below.
Hi BPM, Garden here.
TL;DR: A rootless garden bosh-lite CF deployment fails with `cf-deployment` 11.0.0 and rootless containers enabled. The failure is caused by the `garden` job failing to start. We think that this is a regression in BPM 1.1.1, hence this bug report. Note that the failure only occurs on a new deployment, if the diego cell is recreated during the deployment (e.g. because of a stemcell bump), or if the cell is recreated after a successful deployment.

Long version: The error that causes the `garden` job to fail while starting comes from running this line in its prestart script. Looking at the grootfs code, this error happens when mounting the store backing file onto the store path via a loopback mount. If we execute and strace the mount command ourselves in the Garden BPM container on the diego cell we get the following:
The error that we get without stracing is
It is also worth noting that catting the loop device fails.
It seems that the Garden BPM container's root user is not permitted to use loop devices, which is quite odd given their permissions.
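The checks behind the last two observations are just generic device inspection from inside the garden BPM container, along these lines (output omitted):

```bash
# Illustrative loop-device checks run inside the garden BPM container.
# The second command is the "catting the loop device" check mentioned above,
# which fails for us.
ls -l /dev/loop0
cat /dev/loop0 > /dev/null
```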
While trying to figure out what is going on, we came across a couple of lxd issues (issue1, issue2) which say that creating loop devices in unprivileged containers via `mknod` succeeds, but that the devices cannot be used. As a matter of fact, the garden job does create 256 loop devices in its start ctl script, but it is very unlikely that those lxd issues are related because a) creating loop devices has been around forever and has worked, and b) the Garden BPM container is a privileged one.

Being clueless, we started with the assumption that there is a Garden regression, so we started bisecting the Garden git history. It turned out that the first commit reproducing the issue is ba9d17abf01165fbdcdcae460d17a427b3cfd2ca, which makes additional bpm volumes shared and writable. Nevertheless, holding this commit accountable for the failure does not make sense because it is quite old (from the beginning of June, i.e. garden-runc 1.19.3 and cf-deployment 9.4.0) and rootless worked fine back then.
Therefore we started from the other end: we tried deploying cf-deployment 11.0.0 with BPM downgraded to 1.1.0, and the deployment succeeded. From our point of view this means that there is a regression in shared/writable volumes in BPM 1.1.1. Looking at the BPM commit history, we thought that the only commit that has anything to do with shared volumes is 2ab67a73b60a40788abc2b8c1455356e65704271. Indeed, deploying BPM with the previous commit checked out did not reproduce the issue.
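Deploying BPM at an arbitrary commit is just the usual BOSH dev-release workflow; roughly (local paths and the exact redeploy step will differ per setup):

```bash
# Rough sketch of deploying BPM from an arbitrary commit: check out the
# revision, build a dev release and upload it to the director.
# 2ab67a73~1 is the parent of the suspected commit mentioned above.
git -C bpm-release checkout 2ab67a73~1
bosh create-release --dir bpm-release --force
bosh upload-release --dir bpm-release
# then redeploy cf-deployment with the bpm release version pointing at the
# uploaded dev release
```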
To make things even weirder, we tried to come up with a minimal bosh-lite setup with rootless garden and bpm only, where we mimic the additional bpm volumes described in cf-deployment, but we could not reproduce the problem. It seems that there is something CF-specific that makes the issue manifest itself.