[x] Edit deploy.rb to restart hkn-rails.service, instead of hkn-rails-migrate.service
[x] Edit logrotate systemd files on apphost to restart hkn-rails.service
[x] systemctl --user daemon-reload
[x] Merge into master
[x] Stop hkn-rails-migrate.service
[x] Delete old 2.5.0 bundler gems (~/hkn-rails/prod/shared/bundle)
[x] Deploy prod with Capistrano
[x] Start hkn-rails.service
[x] Check if working
[x] Disable hkn-rails-migrate.service
[x] Enable hkn-rails.service
Background
When the OCF upgraded its machines from Debian 8 (jessie) to 9 (stretch), it provided a transition period during which users on the old apphost (werewolves.ocf.berkeley.edu) could migrate their apps to the new apphost (vampires.ocf.berkeley.edu).
The idea was that we would get hkn-rails running simultaneously on both werewolves (jessie) and vampires (stretch), so when the OCF re-routed web traffic from werewolves to vampires, there would be no downtime.
When @jvperrin and I migrated hkn-rails, we created a separate capistrano target migrate, which would target the new apphost vampires in a separate deploy folder ~/hkn-rails/migrate/. (The previous deploy folder was ~/hkn-rails/prod.)
OCF setup
We implemented several workarounds in response to various issues arising from our specific setup on the OCF:
NFS (Network File System) sharing between werewolves and vampires, causing both to share the same files
(Not really related, but useful) Unix socket file binding, where traffic to hkn.eecs.berkeley.edu is routed to the program bound to the socket file /srv/apps/hkn/hkn.sock (see the apphosting docs).
Service start / restart management with systemd; because of NFS, the user unit files are also shared between the two hosts
RVM (Ruby Version Manager), which compiles and installs Ruby on the apphost in our user directory (~hkn)
Our use of Solr, a Java indexing engine which runs as a process separate from hkn-rails. We write its PID to a file, which hkn-rails uses to know that Solr is running and which process to connect to.
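For illustration, a PID-file liveness check of the kind described in the Solr item can be sketched in shell. This is a minimal sketch, not the actual hkn-rails logic: the function name `pid_file_alive` is hypothetical, and the PID file path is whatever the caller passes in.

```shell
# Sketch: succeed only if the process recorded in a PID file is alive
# on *this* host. pid_file_alive is a hypothetical helper name.
pid_file_alive() {
    pid_file=$1
    # No readable PID file means nothing was recorded as running.
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # Reject empty or non-numeric contents before handing them to kill.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 only checks that the process exists and can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

Note that a PID is only meaningful on the host that wrote it, so a check like this can give misleading answers when the PID file sits on a folder shared over NFS.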
Past issues / workarounds
NFS, by itself, caused several issues:
Incompatible Ruby binaries
The same Ruby binaries were present on werewolves and vampires. Because Debian stretch upgraded various system libraries, the Ruby compiled on werewolves (2.5.0) linked to shared libraries that were not present on vampires.
Solution: we created a Git branch 'migrate', in which we edited the Gemfile ruby version from ruby: '2.5.0' to ruby: '2.5.1'. We installed Ruby 2.5.1 on vampires with rvm, and added rvm version config in the Capfile to denote which version capistrano should use when deploying.
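One way to confirm this kind of breakage on a given host is to ask `ldd` which shared libraries a binary cannot resolve. The sketch below assumes `ldd` is available on the apphost; `missing_libs` is a hypothetical helper name, not something in the repo.

```shell
# Sketch: print the shared libraries a binary links against but that the
# current host cannot resolve. Prints nothing if everything resolves.
# Exit status follows grep: 0 if at least one library is missing.
missing_libs() {
    binary=$1
    # ldd marks unresolved dependencies with "not found".
    ldd "$binary" 2>/dev/null | grep 'not found'
}
```

Running something like `missing_libs "$(command -v ruby)"` on each apphost would show whether the Ruby on the PATH there has unresolved libraries.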
Systemd unit file changes
The systemd unit file, which specifies the hkn-rails script to run at startup, only starts the service when the host is werewolves: ConditionHost=werewolves
Solution: in the migrate branch, the systemd unit file has the host changed to vampires. On the apphost, the service file (in ~/.config/systemd/user) was renamed to hkn-rails-migrate.service (to avoid an NFS collision with hkn-rails.service). hkn-rails.service was enabled on werewolves, and hkn-rails-migrate.service was enabled on vampires.
Solr detection failure
uh idk @jvperrin do you know how we got around this
Shared folder inconsistency
The deploy uses ~/hkn-rails/prod/shared to share files between releases, e.g. resumes, PID files, and configuration. We don't want to lose access to these in the new deploy.
Solution: symlink the new shared folder to the old: ~/hkn-rails/migrate/shared -> ~/hkn-rails/prod/shared.
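The symlink can be sketched as a small shell function. This is a sketch under assumptions: `link_shared` is a hypothetical name, and the deploy root (which would be ~/hkn-rails on the apphost) is a parameter so the function can be tried in a scratch directory first.

```shell
# Sketch: point <deploy_root>/migrate/shared at <deploy_root>/prod/shared.
# link_shared is a hypothetical helper; deploy_root would be ~/hkn-rails.
link_shared() {
    deploy_root=$1
    mkdir -p "$deploy_root/migrate"
    # Refuse to clobber a real directory that may hold shared files.
    if [ -e "$deploy_root/migrate/shared" ] && [ ! -L "$deploy_root/migrate/shared" ]; then
        echo "migrate/shared already exists and is not a symlink" >&2
        return 1
    fi
    # -n keeps ln from descending into an existing symlinked directory,
    # -f replaces an old link, so the function is safe to re-run.
    ln -sfn "$deploy_root/prod/shared" "$deploy_root/migrate/shared"
}
```

Without `-n`, a second run would create a nested `shared/shared` link inside the real shared folder instead of replacing the existing link.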
Current tasks
Production deployment today involves checking out the migrate git branch, then deploying to the migrate target with:
bundle exec cap migrate deploy
We would like to return to checking out the master git branch and deploying to prod; this reduces confusion for new contributors and removes redundancy in our config. This will require merging all of the changes on migrate into master, as well as updating the server-side configuration over SSH:
systemd unit renamings (hkn-rails-migrate -> hkn-rails)
Double-checking shared/ folder consistency
Making sure Solr connections still work
Avoiding downtime (some will be required, to avoid simultaneous bindings to the socket file)
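The server-side part of this can be sketched as two shell functions, one to run before the Capistrano deploy and one after, following the order of the checklist in this issue. This is a sketch, not a tested runbook: the function names and the `DRY_RUN` switch are hypothetical, and by default commands are only printed, not executed.

```shell
# Sketch of the server-side cutover. With DRY_RUN=1 (the default here)
# each command is printed with a "+ " prefix instead of being executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before deploying prod: pick up the edited unit files, free the socket,
# and clear the old bundler gems directory named in the checklist.
pre_deploy() {
    run systemctl --user daemon-reload
    run systemctl --user stop hkn-rails-migrate.service
    run rm -rf "$HOME/hkn-rails/prod/shared/bundle"
}

# After deploying prod: start the renamed service, then swap which unit
# is enabled. Check that the site works before the disable/enable steps.
post_deploy() {
    run systemctl --user start hkn-rails.service
    run systemctl --user disable hkn-rails-migrate.service
    run systemctl --user enable hkn-rails.service
}
```

Stopping hkn-rails-migrate.service before starting hkn-rails.service is what avoids two processes binding the socket file at once; the gap between the two is the unavoidable downtime. Defaulting to a dry run is deliberate, since one of the steps is an `rm -rf`.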
TODO