openaustralia / morph

Take the hassle out of web scraping
https://morph.io
GNU Affero General Public License v3.0
461 stars 74 forks

Expand morph disk allocation #1228

Open jamezpolley opened 5 years ago

jamezpolley commented 5 years ago

Morph was down for a time (see #1227) because it ran out of disk space.

Morph is currently on a Linode 32GB, which gives us 8 cores and 640GB of disk. Only 380GB of that is allocated, and that's 80% used.
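To make the headroom concrete, here is a rough back-of-the-envelope calculation using only the figures quoted above. The variable names are illustrative, and the 80% figure is treated as exact, which it probably isn't.

```python
# Figures quoted in the comment above, in GB.
plan_disk_gb = 640      # disk included with the Linode plan
allocated_gb = 380      # currently allocated to the VM's disk image
used_fraction = 0.80    # share of the allocated disk in use (approximate)

used_gb = allocated_gb * used_fraction          # space holding data
free_gb = allocated_gb - used_gb                # headroom inside the image
unallocated_gb = plan_disk_gb - allocated_gb    # paid for but not yet usable

print(f"used: {used_gb:.0f} GB, free: {free_gb:.0f} GB, "
      f"unallocated: {unallocated_gb} GB")
```

So roughly 76GB is left inside the current image, with about 260GB more on the plan that the VM can't see yet.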

We have plenty of space, but adding it into the VM looks like it's going to require downtime.

jamezpolley commented 5 years ago

Linode now has high-memory plans, do they offer better value?

Checks

The High Memory Linode 48GB instance only has 2 cores and 40GB of storage. Adding another 600GB would cost $60/mo, for a total of $180/mo: $20/mo more than the current plan, for an extra 16GB of RAM (which we don't need) while losing 6 cores (which we do need).
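The arithmetic behind that comparison, as a small sketch. The per-100GB rate is inferred from the $60/mo quoted for 600GB, and the two plan prices are only what the quoted totals imply, not numbers taken from a Linode price list.

```python
# Assumed rate, inferred from "$60/month" for 600GB of extra storage.
BLOCK_STORAGE_USD_PER_100GB = 10


def block_storage_cost(gb):
    """Monthly cost in USD of a Block Storage volume of the given size."""
    return gb / 100 * BLOCK_STORAGE_USD_PER_100GB


extra_storage = block_storage_cost(600)            # extra 600GB volume
high_mem_total = 180                               # quoted total, USD/mo
high_mem_base = high_mem_total - extra_storage     # implied high-memory plan price
current_plan = high_mem_total - 20                 # "$20/mo more than current plan"

print(f"extra storage: ${extra_storage:.0f}/mo")
print(f"implied high-memory plan: ${high_mem_base:.0f}/mo")
print(f"implied current plan: ${current_plan}/mo")
```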

Might be worthwhile if our storage needs grow beyond the current 640GB. Block Storage volumes aren't included in the Linode backups, though; we'd have to do our own backups of anything stored there.

jamezpolley commented 5 years ago

Can we move anything off this VM?

I'm not seeing any quick/simple wins here.

jamezpolley commented 5 years ago

Okay, so I've been researching how to expand a Linode disk image.

Linode's tooling tries to be helpful: resizing a disk through the console actually does a few things in the background for the user. It migrates to a new physical host if needed, runs resize2fs to embiggen the FS to take up the new allocation, etc. Because of this, resizing an existing disk image requires completely shutting down the VM, and the downtime is however long it takes to potentially migrate plus run resize2fs.

Attaching a new disk is easier, but still requires a reboot: after adding a new disk, you need to edit the Configuration to tell linode where to mount the new disk (and what filesystem it has), then reboot to start using the new config.

This should have less immediate downtime as it's just a simple reboot; once the Linode comes back up we will have the new disk attached and can start using it.

I've already created a new disk image and added it to the config (if we decide not to do this, we can remove it from the config, halt the VM, and delete the disk).

EDIT: Ignore this, it's not going to work, see below for further comments

Suggested steps from here:

[ ] Reboot to pick up the new (raw) disk image
[ ] Add it as a new physical volume, then create a logical volume on top of it, then mount that at a new location
[ ] Migrate scraper data directories onto the new disk. Use rsync to get the bulk copied across; then script the process of checking on a scraper-by-scraper basis that it's not running, doing a new rsync to update anything that changed, checking that the scraper still isn't running, then removing the old directory and adding symlinks to the directories on the new partition.

Relatively simple aside from the migration, but we should be able to make sure we don't migrate anything currently running (probably even making sure we only migrate things just after their last run, to minimise the chance of a conflict).
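A minimal sketch of what the per-scraper migration step could look like, assuming the bulk rsync has already been done. How to tell whether a scraper is running is left as a hypothetical `is_running` callable supplied by the caller, and the function and parameter names are made up for illustration; none of this is existing morph code.

```python
import os
import shutil
import subprocess


def rsync_dir(src, dst):
    """Mirror src into dst with rsync (trailing slashes copy the contents)."""
    subprocess.run(["rsync", "-a", "--delete", f"{src}/", f"{dst}/"], check=True)


def migrate_scraper(name, old_root, new_root, is_running, sync=rsync_dir):
    """Move one scraper's data directory to new_root, leaving a symlink behind.

    Returns True if migrated, False if skipped because the scraper was
    running or the directory had already been migrated.
    """
    old_dir = os.path.join(old_root, name)
    new_dir = os.path.join(new_root, name)
    if os.path.islink(old_dir) or is_running(name):
        return False
    os.makedirs(new_dir, exist_ok=True)
    sync(old_dir, new_dir)      # catch up on anything changed since the bulk copy
    if is_running(name):        # started while we were syncing: retry later
        return False
    shutil.rmtree(old_dir)
    os.symlink(new_dir, old_dir)
    return True
```

There is still a small window between the second check and the removal in which a scraper could start, which is why only migrating things just after their last run is a sensible extra precaution.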

If we have issues in future, we now have LVM in place so it's easier to add in new physical volumes.
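For reference, the LVM side of this would be a handful of standard commands. The sketch below only builds and prints them as a dry run; the device path, volume group name, logical volume name and mount point are all placeholders, not values from the actual server.

```python
import shlex


def lvm_setup_commands(device, vg, lv, mount_point):
    """Commands to turn a raw disk into a mounted ext4 logical volume."""
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", device],                          # mark disk as a physical volume
        ["vgcreate", vg, device],                      # volume group on top of it
        ["lvcreate", "-l", "100%FREE", "-n", lv, vg],  # one LV using all the space
        ["mkfs.ext4", lv_path],
        ["mkdir", "-p", mount_point],
        ["mount", lv_path, mount_point],
    ]


def lvm_grow_commands(new_device, vg, lv):
    """Commands to add another disk to the volume group and grow the volume."""
    return [
        ["pvcreate", new_device],
        ["vgextend", vg, new_device],
        # -r also resizes the filesystem on the logical volume
        ["lvextend", "-r", "-l", "+100%FREE", f"/dev/{vg}/{lv}"],
    ]


# Placeholder names, for illustration only.
for cmd in lvm_setup_commands("/dev/sdc", "scrapers_vg", "scrapers_lv", "/mnt/scrapers"):
    print(shlex.join(cmd))
```

Adding a disk later only needs the three commands from `lvm_grow_commands`, which is the "easier to add in new physical volumes" part.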

jamezpolley commented 5 years ago

@mlandauer I'll check this plan with you tomorrow.

jamezpolley commented 5 years ago

Update: Linode sent an automated email to tell us that their backup system needs direct access to the filesystem; if we use LVM they can't read the filesystem directly and so they just stop backing up.

So... using LVM seems to not be an option; or at least, it's an option that negates any benefit we get from linode backups.

Choices seem to be:

(a) Stick with the EXT4 disk image and get Linode's backups, but always have to cop a large downtime to expand the FS;
(b) Use LVM on the disk space that comes with the VM; we won't get Linode backups, and will still need downtime (but not as much, just a reboot) each time we add in a new physical extent to grow the space; or
(c) Use block storage and pay extra (10USD/mo for 100GiB); we won't get Linode backups, but will have the flexibility to grow without needing downtime.

I think (c) is the best long-term option, but it requires a migration process. (a) requires a one-off downtime in order to grow the existing FS, but the extra 300ish GiB we get gives us time to work on a smooth migration.

Linode block storage (https://www.linode.com/blockstorage) by contrast is hot-pluggable and on-the-fly resizable. It also isn't backed up.

We already use duply for backing up the sqlite DBs under /var/www/shared/db/scrapers/data; so moving that onto block storage shouldn't be a concern (it means we'd still have our backups, but we lose the linode backups)

jamezpolley commented 5 years ago

After doing this, Morph ran out of disk space, so I rebooted and increased the disk size. This is no longer high priority, but it's still something we should think about before it recurs.

mlandauer commented 5 years ago

(c) sounds good to me! It's a fair bit of work but it gives us the most flexibility for the future and the ability to scale the disk space separately from the CPU.