concourse / concourse-bosh-release

Concourse BOSH release
Apache License 2.0

Gdn assets are not updated on upgrade #128

Closed xtreme-sameer-vohra closed 3 years ago

xtreme-sameer-vohra commented 3 years ago

Bug Report

When upgrading to 6.6, folks have observed the following error:

runc run: exit status 1: container_linux.go:349: starting container process caused "process_linux.go:439: container init caused \"process_linux.go:405: setting cgroup config for procHooks process caused \\\"failed to write \\\\\\\"c 5:1 rwm\\\\\\\" to \\\\\\\"/sys/fs/cgroup/devices/system.slice/concourse.service/garden/a206550f-f6dd-4609-4f13-0a11afd3fd93/devices.allow\\\\\\\": write /sys/fs/cgroup/devices/system.slice/concourse.service/garden/a206550f-f6dd-4609-4f13-0a11afd3fd93/devices.allow: operation not permitted\\\"\""

Context

BOSH does not update items under /var/gdn/assets if the directory already exists, so the old runc sticks around after the upgrade. That's why the BOSH recreate helped.

Workarounds

When you run the BOSH deploy, you can pass the --recreate flag. This recreates everything in the deployment and, depending on the canary configuration, can happen on a rolling basis, so you shouldn't see much downtime. Alternatively, updating the stemcell would have the same effect.
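
As a rough sketch of the --recreate workaround: the deployment name `concourse` and manifest file `concourse.yml` below are assumptions for illustration, not values from this issue, so substitute your own. The script only prints the command unless DRY_RUN=0 is set, so you can check it before running it against a real director.

```shell
#!/bin/sh
set -eu

# Hypothetical defaults; override via the DEPLOYMENT and MANIFEST env vars.
deployment="${DEPLOYMENT:-concourse}"
manifest="${MANIFEST:-concourse.yml}"

# Redeploy with --recreate so BOSH rebuilds the VMs instead of updating them
# in place, which also replaces the stale contents of /var/gdn/assets.
recreate_deploy() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default): show the command without contacting the director.
    echo "bosh -d $deployment deploy $manifest --recreate"
  else
    bosh -d "$deployment" deploy "$manifest" --recreate
  fi
}

recreate_deploy
```

Any ops files or variables you normally pass to `bosh deploy` would need to be added to the command as well.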

The following can also be handy:

xtreme-sameer-vohra commented 3 years ago

cc @muntac @scottietremendous

taylorsilva commented 3 years ago

Related: https://github.com/concourse/concourse/issues/6236