RocketChat / Rocket.Chat

The communications platform that puts data protection first.
https://rocket.chat/

Snap pre-refresh hook fails on backupdb due to default snap hook timeout and slow CPU #19885

Open milot-mirdita opened 3 years ago

milot-mirdita commented 3 years ago

Description:

I tried to upgrade a Rocket.Chat server from 2.4.11 to 3.8.3. The snap refresh kept failing with the following error:

- Run pre-refresh hook of "rocketchat-server" snap if present (run hook "pre-refresh":
-----
[*] Creating backup file...

<exceeded maximum runtime of 10m0s>

htop showed that backupdb had invoked tar, which was eventually killed: the CPU is quite slow, so the gzip compression took too long.

Digging into the snapd source code revealed that the hook gets killed because the default hook timeout of 10 minutes is used: https://github.com/snapcore/snapd/blob/e96c9757cce0f552bf088ef9ed515f946bdb4849/overlord/hookstate/hookmgr.go#L461

var defaultHookTimeout = 10 * time.Minute

As a workaround, I created my own modified version of backupdb that changes the tar invocation to tar -I "gzip -1" -czvf ...., and bind-mounted my version of the script into the snap:

sudo mount --bind -o nodev,ro /home/cloud/backupdb /snap/rocketchat-server/1427/bin/backupdb

Afterwards snap refresh worked without issues.

I only realized afterwards that I could have disabled the backup with snapctl and backup-on-refresh.
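
To judge in advance whether the backup is likely to hit the limit, one option is to time a manual backup and compare it with the 10-minute default quoted above. The following is only a sketch: the function names and the HOOK_TIMEOUT_SECONDS variable are made up for illustration, and a manual run is only an approximation of how long the hook itself would take.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the snap.
# 600 seconds mirrors snapd's defaultHookTimeout of 10 minutes.
HOOK_TIMEOUT_SECONDS=600

# fits_hook_timeout ELAPSED_SECONDS
# Succeeds (exit 0) only if the elapsed time is strictly below the timeout.
fits_hook_timeout() {
    [ "$1" -lt "$HOOK_TIMEOUT_SECONDS" ]
}

# time_command CMD [ARGS...]
# Runs the command, sends its output to stderr, and prints the elapsed
# wall-clock time in whole seconds on stdout.
time_command() {
    start=$(date +%s)
    "$@" 1>&2
    end=$(date +%s)
    echo $((end - start))
}
```

Possible usage: elapsed=$(time_command sudo snap run rocketchat-server.backupdb), then fits_hook_timeout "$elapsed" || echo "backup would likely exceed the hook timeout".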

Steps to reproduce:

  1. Have a large database to back up
  2. Have a slow CPU
  3. Run snap refresh rocketchat-server
  4. Wait for the pre-refresh hook to fail

Expected behavior:

Don't fail on pre-refresh hook

Actual behavior:

The hook times out after 10 minutes and the refresh doesn't complete.

Server Setup Information:

TheDom42 commented 10 months ago

I experienced the same problem. Thank you for the workaround, but I would hope for an official solution.

pyrates999 commented 9 months ago

The code is still set to time out after 10 minutes when refreshing the snap. Here's how I worked around it.

Back up the database manually:

  1. Stop the rocketchat-server service: sudo service snap.rocketchat-server.rocketchat-server stop
  2. Check that MongoDB is still running: sudo service snap.rocketchat-server.rocketchat-mongo status | grep Active
  3. Now back up the database: time sudo snap run rocketchat-server.backupdb
  4. Make a note of the backup file that was created.
  5. Start the rocketchat-server service: sudo service snap.rocketchat-server.rocketchat-server start
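
The manual backup steps above could be wrapped in a small script, roughly like this. It is a sketch only: the run and manual_backup helpers and the DRY_RUN variable are invented for illustration, and it relies on the exit status of the service status call (assumed to be non-zero when MongoDB is not active) instead of grepping for Active.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual backup steps; not part of the snap.
SERVER_UNIT=snap.rocketchat-server.rocketchat-server
MONGO_UNIT=snap.rocketchat-server.rocketchat-mongo

# run CMD [ARGS...]: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

manual_backup() {
    # Step 1: stop the Rocket.Chat server.
    run sudo service "$SERVER_UNIT" stop || return 1
    status=0
    # Step 2: only continue if the MongoDB service reports as running.
    if run sudo service "$MONGO_UNIT" status; then
        # Step 3: create the backup; its output names the backup file (step 4).
        run sudo snap run rocketchat-server.backupdb || status=$?
    else
        echo "MongoDB service is not active; skipping backup" >&2
        status=1
    fi
    # Step 5: start the server again, whether or not the backup succeeded.
    run sudo service "$SERVER_UNIT" start
    return "$status"
}
```

Running DRY_RUN=1 manual_backup first prints the commands without executing them, which is a cheap way to check the unit names before touching a production server.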

If you need to restore the database:

  1. Stop the rocketchat-server service: sudo service snap.rocketchat-server.rocketchat-server stop
  2. Check that MongoDB is still running: sudo service snap.rocketchat-server.rocketchat-mongo status | grep Active
  3. Copy the backup you made into the snap's common directory, for example: sudo cp rocketchat_backup.tgz /var/snap/rocketchat-server/common/
  4. Now restore the database: time sudo snap run rocketchat-server.restoredb /var/snap/rocketchat-server/common/rocketchat_backup.tgz
  5. Start the rocketchat-server service: sudo service snap.rocketchat-server.rocketchat-server start
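
The restore steps can be sketched the same way. Again, run, restore_backup and DRY_RUN are hypothetical names chosen for this example; the backup file name is whatever the earlier backup produced, and the exit status of the service status call is assumed to reflect whether MongoDB is active.

```shell
#!/bin/sh
# Hypothetical wrapper around the restore steps; not part of the snap.
SERVER_UNIT=snap.rocketchat-server.rocketchat-server
MONGO_UNIT=snap.rocketchat-server.rocketchat-mongo
COMMON_DIR=/var/snap/rocketchat-server/common

# run CMD [ARGS...]: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# restore_backup BACKUP_FILE
restore_backup() {
    if [ -z "${1:-}" ]; then
        echo "usage: restore_backup BACKUP_FILE" >&2
        return 2
    fi
    # The backup is copied into the snap's common directory before restoring.
    dest="$COMMON_DIR/$(basename "$1")"
    run sudo service "$SERVER_UNIT" stop || return 1
    status=0
    if run sudo service "$MONGO_UNIT" status; then
        run sudo cp "$1" "$dest" &&
            run sudo snap run rocketchat-server.restoredb "$dest" || status=$?
    else
        echo "MongoDB service is not active; skipping restore" >&2
        status=1
    fi
    # Start the server again, whether or not the restore succeeded.
    run sudo service "$SERVER_UNIT" start
    return "$status"
}
```

For example, DRY_RUN=1 restore_backup rocketchat_backup.tgz prints the five commands from the list above without running them.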

Now before you refresh the snap of rocketchat-server:

  1. Get the current settings: snap get rocketchat-server
  2. Set the variable backup-on-refresh to disable: sudo snap set rocketchat-server backup-on-refresh=disable
  3. Verify that backup-on-refresh is now set to disable: snap get rocketchat-server
  4. You may now refresh or update the rocketchat-server snap, for example: sudo snap refresh --channel=6.x/stable rocketchat-server
  5. When done, set the variable backup-on-refresh back to enable: sudo snap set rocketchat-server backup-on-refresh=enable
  6. Verify that backup-on-refresh is now set to enable: snap get rocketchat-server
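
As a rough sketch, the refresh procedure above can be scripted so that backup-on-refresh is re-enabled even when the refresh itself fails. The run and refresh_without_backup helpers and the DRY_RUN variable are hypothetical, and the channel is passed in by the caller. This assumes a manual backup was taken beforehand, since the automatic one is skipped.

```shell
#!/bin/sh
# Hypothetical wrapper around the refresh steps; not part of the snap.
SNAP_NAME=rocketchat-server

# run CMD [ARGS...]: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# refresh_without_backup CHANNEL
refresh_without_backup() {
    if [ -z "${1:-}" ]; then
        echo "usage: refresh_without_backup CHANNEL" >&2
        return 2
    fi
    # Disable the pre-refresh backup so the hook cannot hit the timeout.
    run sudo snap set "$SNAP_NAME" backup-on-refresh=disable || return 1
    status=0
    run sudo snap refresh --channel="$1" "$SNAP_NAME" || status=$?
    # Re-enable the backup regardless of how the refresh went.
    run sudo snap set "$SNAP_NAME" backup-on-refresh=enable
    # Show the resulting setting for verification.
    run snap get "$SNAP_NAME" backup-on-refresh
    return "$status"
}
```

For example, DRY_RUN=1 refresh_without_backup 6.x/stable prints the commands that would be run for the 6.x/stable channel.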

If the refresh to a newer version fails, you must refresh back to the version you had installed and then restore the database for that version. Don't restore the database when you're on a newer version.