DocCyblade / tkl-mayan-edms

Turnkey Linux - Mayan EDMS
https://www.turnkeylinux.org/mayan-edms

Test TKLBAM #7

Closed DocCyblade closed 7 years ago

DocCyblade commented 7 years ago

Test backing up data and restore on new server

DocCyblade commented 7 years ago

@JedMeister - I will be using this for my own personal family documents, so it is extremely important that the Hub backup backs up all the data and can be restored to another PC, cloud server, etc.

Since it's not a live app yet, what's the best way to test this? I know the database name and the data location.

Speaking of the location, I'm not sure it should stay there, as it's in a strange place (/usr/share/mayan-edms/lib/python2.7/site-packages/mayan/media/document_storage/): long and buried deep inside the file system. We can change it. If we do, does TKL, or even Linux, have a standard directory for server data?

JedMeister commented 7 years ago

Re backup profile, check out this. Initially though I'd be inclined to just do a backup and restore it to a fresh server. I.e. create 2 fresh mayan VMs. Leave one vanilla and set the other one up with some test docs and custom config. Then do a backup and restore it to the vanilla server.

From a look at the core profile I suspect that the path of the stored docs would need to be added. Unless of course you make the default path one that is already being backed up...

Having said that though, most of the other third party upstream install type apps actually save the whole application as well. I'm not sure if that would be a good idea or not. I suspect it probably would be a good idea.

Which leads me to your other question; a directory under /srv might be the best place?! FWIW the fileserver uses /srv/storage to store files. It would be consistent with the FHS, see /srv

DocCyblade commented 7 years ago

Perfect! Thanks again. I'll post back my findings once I start testing.

DocCyblade commented 7 years ago

For the current build v0.4, all user data is stored in the following places

/srv/edms-data/document_storage (Files)
/usr/share/mayan-edms/mayan/settings/local.py  (Settings)
PostgreSQL database: mayan (Database)

These are all default locations. We could back up just these files if we want to be minimalist. If you were to migrate from one version to another, this would import your data and settings, and all you would need to do is run the migrations to update your database.
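As a rough illustration of the minimalist option, the three locations above could be collected into override-style entries. This is only a sketch: the `pgsql:` prefix and the one-entry-per-line layout are assumptions about the TKLBAM overrides format, and `build_overrides` is a hypothetical helper, not part of TKLBAM.

```python
# Hypothetical helper: collect the v0.4 data locations into override-style lines.
# Assumption: overrides take one path per line and "pgsql:<name>" for a database.

DATA_PATHS = [
    "/srv/edms-data/document_storage",
    "/usr/share/mayan-edms/mayan/settings/local.py",
]
DATABASE = "mayan"


def build_overrides(paths, database):
    """Return override lines: file paths first, then the database entry."""
    lines = list(paths)
    lines.append("pgsql:" + database)
    return lines


if __name__ == "__main__":
    print("\n".join(build_overrides(DATA_PATHS, DATABASE)))
```

Keeping the list in one place would make it easy to compare against whatever the core profile already covers.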

The other option is to back up the app as well. This would preserve everything, including customizations, and would be good for recovery. However, migrations would fail because the new system would be overwritten: if the Mayan backup is v2.1.4 and the new server has v2.2, the restore would wipe out the newer version's changes.

Another option, which I think I have seen with some other software, is to install Mayan in version-specific directories with version-specific database names. Backup and restore would then just work, even on a newer version of the appliance, because the restored database, data and app would be separate from the new ones. A server could be restored onto a new appliance while still running the old version of the software (after a few tweaks to etc files), followed by a controlled upgrade/migration to the new system without too much fuss.

The build script can use variables, so changing the install directory would not be too hard. If we use version-specific config files for supervisor and Nginx and link them into the config directories, then changing a restored system would just mean linking the old config files. You would then have the same app version running on a new base version of the appliance, with the new version of the app still installed and ready for the data to be migrated to it at a later time.

@JedMeister - Thoughts?

JedMeister commented 7 years ago

As a general rule, if an app uses an upstream install mechanism of some sort then we usually include the whole app in the backup. FWIW you can exclude the .pyc files; they're compiled Python and will be recreated on the next run if they don't exist.
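To make the .pyc point concrete, here is a minimal sketch of the filtering idea. It only shows which files would be kept when walking an install tree; it does not show how TKLBAM itself expresses exclusions, and `files_to_back_up` is a hypothetical name.

```python
import os


def files_to_back_up(root):
    """Yield every file under root except compiled Python (.pyc) files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".pyc"):
                yield os.path.join(dirpath, name)
```

Skipping these files loses nothing, since Python regenerates them from the matching .py sources.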

In my experience, that means that the user will have the same version as their data came from (the version in the backup overwrites the version pre-installed). If the host changes too much (e.g. restoring a really old version to a new appliance) that can cause issues (e.g. the old version depended on Python 2.6, but the server has only 2.7). The user can then upgrade the software manually if they desire.

I note though, that you suggest that doing it that way may break something? Is that from experience? Or are you just guessing?

I quite like your idea of installing to a versioned directory. However in my experience, the more complex the process, the more risk there is that it will break or cause other unintended consequences. I vote to keep it simple.

DocCyblade commented 7 years ago

Thanks for the info. I am OK with simple. I think in this case Mayan can be upgraded with pip and a few migration commands after a restore.

I have found that while getting a TKL app up and running is very easy, the upgrade has always been complicated with TKLBAM, since the new upgraded app gets overwritten. I had a fun experience upgrading redmine a few years back.

Since I'll be personally using Mayan I'll be able to make it work, and I'll be able to maybe post an upgrade guide as well.

Simple is good as it's less to do :-)

JedMeister commented 7 years ago

TKLBAM can be configured to do some pretty cool stuff when migrating. The hooks mechanism can allow it to do almost anything!

The main reason why we haven't configured that to be better by default is because we haven't had the time and energy. The other reason (which relates to the first) is that currently, users will worst case get a broken system, but with all their data and a fair idea why it doesn't work.

If we had hook scripts that updated to the current version etc, then if they worked it'd be awesome; but if they didn't then the system would be in an unknown state. I'd still like to do it better, but there are other fish to fry first.

Having said that, if you play with the hooks and get an awesome "migrate and update" script(s) working, then please share! :)

DocCyblade commented 7 years ago

@JedMeister - The hooks look like exactly what I need. I could even stop the supervisor processes to keep the database in a known state when backing up, and do the same for the restore: stop the services, perform the pip install, migrate the data and start the services again.

I see there is already an example shell script. I just modified it and added functions for each backup/restore pre/post step, mostly for my own use. I posted a gist here. I'll use this as a shell to create one for Mayan that should hopefully do the following:

  1. Backup: Stop services to make sure the database is in a good state
  2. Restore: Stop services, run the pip upgrade and update dependencies, run migration scripts, dump static files, start services again.

This should ensure that a restore upgrades both their data and the code base, since the code will be stored with the backup. We just need to document somewhere that users who don't want the auto-migration should disable the hook before restoring.
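The two steps above can be sketched as a small dispatcher. This is a dry-run illustration, not the actual hook: it assumes the hook is called with an operation (`backup` or `restore`) and a state (`pre` or `post`), that services are managed through `supervisorctl`, and that the upgrade uses `pip` plus the `mayan-edms.py performupgrade` and `collectstatic` commands. `TARGET_VERSION` and `plan` are hypothetical names.

```python
import sys

# Assumption: the version the appliance was built and tested with.
TARGET_VERSION = "2.1.4"

STOP = ["supervisorctl", "stop", "all"]
START = ["supervisorctl", "start", "all"]


def plan(op, state, version=TARGET_VERSION):
    """Return the commands a hook would run for a given operation and state."""
    if op == "backup" and state == "pre":
        return [STOP]  # nothing writes to the database during the backup
    if op == "backup" and state == "post":
        return [START]
    if op == "restore" and state == "pre":
        return [STOP]
    if op == "restore" and state == "post":
        return [
            ["pip", "install", "--upgrade", "mayan-edms==" + version],
            ["mayan-edms.py", "performupgrade"],
            ["mayan-edms.py", "collectstatic", "--noinput"],
            START,
        ]
    return []


if __name__ == "__main__":
    # Dry run: print the plan instead of executing it.
    op, state = (sys.argv[1:3] + ["", ""])[:2]
    for command in plan(op, state):
        print(" ".join(command))
```

Returning the commands as data rather than running them directly keeps the plan easy to inspect before anything touches a real system.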

DocCyblade commented 7 years ago

I will test this by doing a backup, then restoring a few times with data. Then I'll do a build of 1.2.0, add some data, then do a restore. It should restore my files at 2.1.0 and then upgrade to 2.1.3/4, upgrading via pip and doing the migration as well. This would cover both the Backup AND Migration parts of TKLBAM.

DocCyblade commented 7 years ago

@JedMeister - I have written a TKLBAM hook script for Mayan EDMS that will stop/start services to keep the database in a good state when backing up, and during a restore will upgrade to the version of Mayan that the appliance is built on.

I had thought of maybe splitting this into two scripts, one for backup only and another for migration, and just disabling the migration script so the user has the option of a restore without an upgrade. I guess I could also look for an environment variable at runtime and proceed only if upgrade is True/Enabled. Thoughts?

JedMeister commented 7 years ago

Sorry about the delay responding. I've actually spoken with Alon about this a few days ago and have been meaning to share...

He really liked the idea of your hook scripts to update the version, but suggested that we don't enable them by default. Instead make them just output the commands (e.g. cat the hook) unless explicitly enabled. He suggested using a specific file to trigger the hook scripts. E.g. the script looks for a specific file (e.g. ~/.run-mayan-hooks or similar); if it exists (i.e. if the user has touched it) the hook is triggered to run. If the file isn't there, the hook scripts don't run.
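A minimal sketch of that opt-in check, assuming the marker lives at `~/.run-mayan-hooks` (the "or similar" above means the real path may differ). `handle` is a hypothetical helper: it runs the commands only when the marker exists and otherwise just prints them.

```python
import os
import subprocess

# Hypothetical marker path, following the "~/.run-mayan-hooks or similar" idea.
MARKER = os.path.expanduser("~/.run-mayan-hooks")


def handle(commands, marker=MARKER, runner=subprocess.check_call):
    """Run the commands if the user opted in; otherwise only print them.

    Returns True when the commands were run, False when they were printed.
    """
    if os.path.exists(marker):
        for command in commands:
            runner(command)
        return True
    for command in commands:
        print(" ".join(command))
    return False
```

With this shape, a user would opt in with `touch ~/.run-mayan-hooks` before restoring and remove the file afterwards to go back to the print-only default.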

Did you have issues with the DB being backed up? IIRC by default it should dump the DB without a hook.

Alon also said that ultimately what we probably should be doing is extending TKLBAM to manage packages from pip (as well as apt) and then only keep the data (and not the full app). But that's obviously a much bigger job so would be a future feature....

Also Alon said that backup/restore is the primary concern, with migration being a secondary consideration. Backup/restore should be flawless...

Out of interest, I note that your hook defines a specific version. What would happen if the user manually updates their server to a newer version of Mayan?

DocCyblade commented 7 years ago

The hook just stops the services so nothing is writing to the database.

I like the file idea, thank Alon for the suggestion!

As for the specific version, the idea would be that it would upgrade to the version of the appliance. If the built appliance is v2.1.4 that's the version it would upgrade to since that's the version the base system was configured and tested with. I added a variable so it can be updated via the build script.
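One way to handle a server that was manually updated past the appliance version is to compare versions before upgrading. This is only a sketch of that idea, not something the hook is known to do; it assumes plain dotted numeric versions such as `2.1.4`, and both function names are hypothetical.

```python
def parse_version(text):
    """Turn "2.1.4" (or "v2.1.4") into (2, 1, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def needs_upgrade(installed, target):
    """True only when the restored version is older than the appliance's version."""
    return parse_version(installed) < parse_version(target)
```

Under this check, a backup taken from a newer Mayan than the appliance was built with would be left alone rather than pinned back to the older version.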

Also I do like the idea of TKLBAM using pip, but that sounds complicated :-)

DocCyblade commented 7 years ago

For the first release we will go bare bones and just back up and restore. I will use the core profile with overrides to test.

DocCyblade commented 7 years ago

@JedMeister - when you have the time, make sure I have all my t's crossed and i's dotted. Just open issues with things that need fixed.

DocCyblade commented 7 years ago

I posted here because I could not create a new issue for some reason!

JedMeister commented 7 years ago

Hmm, that's weird...?! Your plans re TKLBAM sound good to me! :+1:

DocCyblade commented 7 years ago

@JedMeister - I just downloaded and installed(?) the build tasks repo, but I'm not sure what I am doing. Maybe some time we can talk and you can walk me through it, and we can also catch up.

JedMeister commented 7 years ago

Hey Ken,

Sorry that I probably haven't provided enough info to get you going... And I know it's not that well documented. However if you plan to continue using Proxmox then you'll want to get it working so you can build and test your own custom appliances as LXC containers! :smile:

Currently I'm totally snowed under. I'm taking next week off and have a couple of important jobs I really need to get on top of before I knock off Friday arvo.

While I'm off, if I get a chance I'll try to reply to questions etc so you aren't blocked. However for at least some of the time I won't have any internet (or even cell phone reception). Regardless I'll be back at work Tue 15th Nov. Perhaps we should have a voice chat or at least a real time text chat soon after that?!

Cheers, J

DocCyblade commented 7 years ago

Sounds good. I am currently in shakedown testing of the current v0.5, and that's what all the "Test:" issues are. I hope upstream is watching and can chime in on some of them.

DocCyblade commented 7 years ago

@JedMeister - Sent you an email :-)

DocCyblade commented 7 years ago

@JedMeister - So I am back for a few days here, and the only thing I have not yet tested is backups. Also, the code base here is pretty much done. I have a POC server that I was going to use for building my production setup and I can use that for now until Mayan gets published, but until then I need to be able to start backing up to the hub.

From reading this issue's comments, I think I remember that build tasks creates the profile, but apart from that I am not sure how to use build tasks correctly, or how to tell it which folders to back up so that they are part of the standard build and not just overrides.

DocCyblade commented 7 years ago

So after a nice chat with @JedMeister, this issue has been put to bed. I have submitted a PR to add the profile to the official repo.

JedMeister commented 7 years ago

PR merged. Thanks @DocCyblade :smile:

JedMeister commented 7 years ago

FWIW, if you comment in the PR something like:

closes https://github.com/DocCyblade/tkl-mayan-edms/issues/7

Then when the PR is merged it auto closes the referenced issue. :smile:

Note: when the PR is in the same repo as the issue, you can shorthand the issue like #7 (as my link above has automatically been abbreviated to). If the PR is in a different repo to the issue it closes (e.g. this case, or an issue on the tracker) then it still works, but you need to put the full path, e.g. https://github.com/DocCyblade/tkl-mayan-edms/issues/7