Closed jsievers closed 6 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/152037934
The labels on this GitHub issue will be updated when the story is started.
Our current workaround is to increase `databases.monit_timeout` of the postgres job, as well as `canary_watch_time` and `update_watch_time` in the update block of the concourse manifest, to 20 minutes (we have a ~10GB postgres DB).
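For anyone applying the same workaround, the relevant manifest fragments might look roughly like the sketch below. It assumes a BOSH v2-style manifest with an instance group named `db` (the name, `canaries`, and `max_in_flight` values are placeholders, not taken from our deployment). Note the differing units: `databases.monit_timeout` is given in seconds, while the BOSH watch times are in milliseconds.

```yaml
# Sketch only: instance group name and canaries/max_in_flight are assumed values.
update:
  canaries: 1
  max_in_flight: 1
  # Watch times are in milliseconds; 1200000 ms = 20 minutes.
  canary_watch_time: 1000-1200000
  update_watch_time: 1000-1200000

instance_groups:
- name: db                      # assumed name
  jobs:
  - name: postgres
    release: postgres
    properties:
      databases:
        # Seconds; 1200 s = 20 minutes.
        monit_timeout: 1200
```

The `update` block could also be set per instance group instead of globally, which would avoid raising the watch times for every job in the deployment.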
We recently upgraded the postgres release used in concourse, as advertised in the concourse 3.5.0 release notes.
I read https://github.com/cloudfoundry/postgres-release/#upgrading and increased `databases.monit_timeout` to 300 seconds. This allowed the DB upgrade to finish without a monit timeout (it took about 2 minutes according to `/var/vcap/sys/log/postgres/postgres_ctl.log`), but `bosh deploy` still failed. According to the bosh lifecycle docs, there is another timeout (probably `update_watch_time`) which is exceeded.
Rather than increasing `update_watch_time` for all jobs, it seems from the bosh lifecycle docs that a pre_start script would be a better place to perform long-running tasks like a DB upgrade, because it does not time out at the bosh level.
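To illustrate what this would mean on the release side: BOSH runs a job's pre-start script if the job spec maps a template to `bin/pre-start`. A hypothetical excerpt of a job spec is sketched below; the template file names are assumptions and not copied from postgres-release.

```yaml
# Hypothetical job spec excerpt; template file names are assumed.
name: postgres
templates:
  # BOSH executes bin/pre-start before monit starts the job's processes.
  pre-start.erb: bin/pre-start
  postgres_ctl.erb: bin/postgres_ctl
```

With the upgrade logic moved into such a script, the long-running step would complete before monit starts the process, so it would not count against `update_watch_time`.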