Avans-ATGM / infrastructure

ATGM Infrastructure Repository

Set up a (hot-data) backup system for student projects #25

Closed Dirowa closed 2 years ago

Dirowa commented 2 years ago

Since things sometimes go terribly wrong, whether due to the system, users, or admins, we are going to implement a backup system, as discussed.

Every student project contains an automatically generated directory called data_storage. We can use it as the marker for what to back up and what to skip. After a new backup is created the old one will be deleted, or shall we keep two copies to be safe?
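A minimal sketch of using data_storage as the marker, assuming the student projects are direct subdirectories of a single root. The function name and the `/srv/projects` path in the usage line are hypothetical.

```shell
# List every project directory under a root that contains a data_storage
# subdirectory. Assumes projects are direct children of the root.
list_backup_sources() {
    root="$1"
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the root is empty or missing.
        [ -d "$dir" ] || continue
        if [ -d "${dir}data_storage" ]; then
            # Strip the trailing slash before printing.
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Hypothetical usage:
#   list_backup_sources /srv/projects
```

Projects without a data_storage directory are simply skipped, so nothing has to be listed by hand.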

As we are going to take the old Galaxy offline, I believe we can also use that machine as a place where students can put their projects.

So I suggest the following backup scheme.

| Machine name | Backup generated on | Backup location | Backup start |
|---|---|---|---|
| midgard.bioinformatics-atgm.nl | galaxy.bioinformatics-atgm.nl | //mount/sdb/project_backup/year/project_N/ | Saturday |
| galaxy.bioinformatics-atgm.nl | asgard.bioinformatics-atgm.nl | //mnt/TeacherFiles/project_backup/year/project_N/ | Saturday |
| asgard.bioinformatics-atgm.nl | midgard.bioinformatics-atgm.nl | //mnt/TeacherFiles/project_backup/year/Project_N/ | Saturday |
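The scheme forms a ring in which every machine stores its backups on the next one. A small sketch of that mapping, assuming the first column is the machine being backed up, the second column is the host that receives the backup, and all three hosts share the bioinformatics-atgm.nl domain. The function name is hypothetical.

```shell
# Map each machine to the host that stores its backups, following the
# proposed ring: midgard -> galaxy -> asgard -> midgard.
backup_host_for() {
    case "$1" in
        midgard.bioinformatics-atgm.nl) echo "galaxy.bioinformatics-atgm.nl" ;;
        galaxy.bioinformatics-atgm.nl)  echo "asgard.bioinformatics-atgm.nl" ;;
        asgard.bioinformatics-atgm.nl)  echo "midgard.bioinformatics-atgm.nl" ;;
        *) echo "unknown machine: $1" >&2; return 1 ;;
    esac
}

# Usage: backup_host_for "$(hostname -f)"
```

With a ring like this, losing any single machine still leaves one copy of every project on another host.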

@hexylena Do you have any opinions about this? Otherwise I could just add it to the Avans role.

My idea, without any testing:

hexylena commented 2 years ago
set the all.yml playbook to run only daily?
make a new var: date '+%a' to get the day of the week
run a sh script (gzipping, checksumming, and transfer to the new host)

all of those are covered by borg. Just use borg. Life will be much easier.

export BORG_PASSPHRASE='some-password' # This *encrypts* the backup
export BORG_REPO=username@host:/path/where/backups/go # It uses SSH to back things up, so, super simple and baked in.

# From my personal backup
# It excludes some folders
# It creates a backup named `docs-<date>`
# And it backs up Documents, OtherStuff, and Pictures.
borg create \
    --verbose                                         \
    --filter AME                                      \
    --list                                            \
    --stats                                           \
    --show-rc                                         \
    --compression lz4                                 \
    --exclude-caches                                  \
    --exclude Inky-linux-x64/                         \
    ::docs-$(date "+%Y-%m-%dT%H-%M-%S") \
    Documents OtherStuff Pictures

# The next step is to prune old things.
# This keeps 7 daily snapshots, 4 weekly, and 6 monthly (so back 6 months.)
# It specifically operates on things prefixed 'docs-'
borg prune                 \
    --remote-path=borg1    \
    --list                 \
    --prefix 'docs-' \
    --show-rc              \
    --keep-daily    7      \
    --keep-weekly   4      \
    --keep-monthly  6
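For the student projects, the same two commands could be wrapped per project. The sketch below is an assumption-laden adaptation, not the script that was deployed: `PROJECT_ROOT`, `DRY_RUN`, the function names, and the retention value are hypothetical. The archive naming follows the `project-<year>-<name>-<timestamp>` pattern visible in the journal output later in this thread, and only the data_storage directory from the issue description is backed up.

```shell
# Sketch only: PROJECT_ROOT, DRY_RUN and the function names are hypothetical.
# BORG_REPO and BORG_PASSPHRASE are expected in the environment, as above.
PROJECT_ROOT="${PROJECT_ROOT:-/srv/projects}"   # assumed location of the projects
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the borg commands

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Build the archive prefix for one project, e.g. project-2022-Martijn-
archive_prefix() {
    printf 'project-%s-%s-' "$(date +%Y)" "$1"
}

# Create a new archive for one project, then prune that project's old archives.
backup_project() {
    name="$1"
    prefix="$(archive_prefix "$name")"
    run borg create --stats --show-rc --compression lz4 \
        "::${prefix}$(date +%Y-%m-%dT%H-%M-%S)" \
        "${PROJECT_ROOT}/${name}/data_storage"
    # Retention is an assumption; pick whatever fits the weekly schedule.
    run borg prune --list --show-rc --prefix "$prefix" --keep-weekly 4
}

# Usage (prints the commands while DRY_RUN=1):
#   backup_project Martijn
```

Giving every project its own prefix means pruning one project's history never touches another project's archives.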

And if you want to see the backups:

10:01:51|[hxr@cosima:~]$ borg list
docs-2021-04-30                      Fri, 2021-04-30 12:00:25 [29660759b0beb20dfd2599b3a696cb6bc290add1238ddd5e8ff78166871fc4f4]
docs-2021-05-04                      Tue, 2021-05-04 12:00:50 [34b3565186ad94ea1db3e1bcf62f8a7b1a793fdfe26f80b722c2d4b31130e8dc]
docs-2021-08-31                      Tue, 2021-08-31 12:00:04 [d19173d9f6b4bac2980f59bce0a533cdb0fccf8a23b0ed1dfb0a57a976c5f55c]
docs-2021-09-30                      Thu, 2021-09-30 12:00:04 [da0c9cce91e9b6303dc8907dcdee6164d72634612bca1775a8234b774f7f29ae]
docs-2021-10-31                      Sun, 2021-10-31 14:50:00 [146a52fb86984401fcf368ceeaf3a50c22998e4b39bf1e93a9298710c16f8247]
docs-2021-11-30                      Tue, 2021-11-30 12:00:04 [11c3387c1955dd235e166d45e5292fb6c94bebde054f0167272b4fed74548ec0]
docs-2021-12-04                      Sat, 2021-12-04 14:12:05 [af4c7109b1f8a8be9de065dc6347833471e8d5cd1877ce57f0a93b2e03b2a7bd]
docs-2021-12-10                      Fri, 2021-12-10 12:00:09 [33628b1eb3123ea04bb1e446a520c2708a23e45eed8fa9803e198d7c95fe4126]
docs-2021-12-19                      Sun, 2021-12-19 12:00:05 [cc07e33b4c570fc554ab7bb2769705e664606cfcae32de10caad1ce0df83138f]
docs-2021-12-24T14-19-56             Fri, 2021-12-24 14:20:03 [fd73b9eecbe0b332511084f90bee2b39c9af0f42496458fc929b1e4b5d071706]
docs-2022-01-03T13-32-06             Mon, 2022-01-03 13:32:08 [fa64c8388c50ccf93d0de8abe225100cdb7d08e6b1655e3e58c627563ca9035a]
docs-2022-01-06T12-00-01             Thu, 2022-01-06 12:00:05 [bcbe81900cf9871b149073321d82b59d247a6cb9fb5073b15b1f9cd86d11c6f3]
docs-2022-01-09T12-00-50             Sun, 2022-01-09 12:00:52 [bb5fa19d372eb4fb039c8812bcfcc61cde290455631ca28450c654acff372d57]
docs-2022-01-10T12-00-50             Mon, 2022-01-10 12:00:53 [96fdf46f14f81c57771149e49561b5c514fe0cb6283be4d55fa7d74f7c8cbf17]
docs-2022-01-11T12-00-50             Tue, 2022-01-11 12:00:52 [25e5a6d9d1c1eb18da868d423f6a2ea40c8b5a2502dbfa10ec555f2816b35963]
docs-2022-01-12T12-00-50             Wed, 2022-01-12 12:00:54 [2516faa9fb218aebf714fbc812b3b4fe53ee58b083d9d8affe31df9a03402ec4]
docs-2022-01-13T12-00-50             Thu, 2022-01-13 12:00:52 [a1796a008c8c75c8295f5c88b47a00b8cae4576330c6e9265eb8307dd3f37f78]
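
Because the archive names embed a sortable date, finding the newest archive for a prefix is a plain text sort. A small sketch, with a hypothetical helper name, that reads names on stdin as produced by `borg list --short`:

```shell
# Read archive names on stdin and print the newest one with the given prefix.
# Works because the date or timestamp in each name sorts lexicographically.
latest_archive() {
    prefix="$1"
    # index(...) == 1 keeps only the lines that start with the literal prefix.
    awk -v p="$prefix" 'index($0, p) == 1' | LC_ALL=C sort | tail -n 1
}

# Usage against a real repository (needs BORG_REPO set):
#   borg list --short | latest_archive docs-
```
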
hexylena commented 2 years ago

Here's the full script https://gist.github.com/hexylena/7c7dc2d60620527d0ee1cd1de2a5b678, I recommend wrapping it up

And then we should stick it on cron? Run on Saturday morning.

set all playbook on only daily?

What did this mean?

Dirowa commented 2 years ago

Ah, gotcha. Then I will do that, and thank you for the script :)

Well, I was thinking of adding it to the Avans role and then making the playbook all.yml run only daily. https://galaxy.bioinformatics-atgm.nl/jenkins

hexylena commented 2 years ago

Ah gotcha. My only concern there is it would make the playbook run times potentially quite high, so moving that to cron could be a good fit.

Cron is kind of annoying for logging though, and for preventing duplicate jobs from running, so even more optimally it could be a systemd timer:

10:21:03|[hxr@cosima:~]3$ cat /home/hxr/.config/systemd/user/backup.service;
[Unit]
Description=Backup
Wants=backup.timer

[Service]
Type=oneshot
ExecStart=/bin/bash /home/hxr/.bin/backup.sh

[Install]
WantedBy=multi-user.target
10:21:04|[hxr@cosima:~]$ cat /home/hxr/.config/systemd/user/backup.timer
[Unit]
Description=Backup
Requires=backup.service

[Timer]
Unit=backup.service
OnCalendar=Mon..Sun 12:00
Persistent=true

[Install]
WantedBy=timers.target

Then whatever is output just goes to the journal and is easy to find with journalctl.
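
A sketch of activating a pair of user units like the ones above and reading their output. It assumes both files sit in ~/.config/systemd/user and share the base name `backup`; the `DRY_RUN` switch and the function names are hypothetical and only there so the commands can be reviewed before they run.

```shell
# Print (DRY_RUN=1, the default here) or run the commands that activate a
# systemd user timer and show its logs.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

activate_user_timer() {
    name="$1"
    run systemctl --user daemon-reload                # pick up new unit files
    run systemctl --user enable --now "$name.timer"   # start now and with the user session
    run systemctl --user list-timers "$name.timer"    # confirm the next run
    run journalctl --user-unit "$name.service" -n 20 --no-pager
}

activate_user_timer backup
```

For a system-wide timer in /etc/systemd/system the same steps apply without `--user`, run as root, with `journalctl -u` instead of `--user-unit`.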

Dirowa commented 2 years ago

Okay, so far done:

Todo:

Dirowa commented 2 years ago

Update: the script works 👯

*Note: disabled the laura directory because she had already put data into it.

hexylena commented 2 years ago

awesome!

find out / research loggings

check journalctl

Dirowa commented 2 years ago

It took way longer than I'm proud of, but the systemd logging is accessible as root. For some reason I only have logs from today and not from Saturday.
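
One way to narrow this down is to ask the journal for a single day. The sketch below uses a hypothetical helper name; the unit name and the Saturday date are the ones that appear in this thread. If the query comes back empty, a possible explanation (an assumption, not something confirmed here) is that the journal is not stored persistently or has already been rotated, which `journalctl --list-boots` helps to check.

```shell
# Show one day's journal entries for a unit; day is in YYYY-MM-DD form.
journal_for_day() {
    unit="$1"
    day="$2"
    journalctl -u "$unit" --since "$day 00:00:00" --until "$day 23:59:59" --no-pager
}

# Usage, as root, for the Saturday run mentioned in this thread:
#   journal_for_day student_project_backup 2022-01-22
#   journalctl --list-boots   # shows how far back the stored journal reaches
```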

hexylena commented 2 years ago
# systemd-analyze calendar "sat 12:00"
  Original form: sat 12:00
Normalized form: Sat *-*-* 12:00:00
    Next elapse: Sat 2022-01-29 12:00:00 CET
       (in UTC): Sat 2022-01-29 11:00:00 UTC
       From now: 4 days left
# systemctl show student_project_backup.timer | grep -i next
NextElapseUSecRealtime=Sat 2022-01-29 12:00:00 CET
NextElapseUSecMonotonic=0
# systemctl show student_project_backup.timer | grep -i last
LastTriggerUSec=Sat 2022-01-22 12:00:05 CET
LastTriggerUSecMonotonic=0
# systemctl status student_project_backup.timer
● student_project_backup.timer - Backup
   Loaded: loaded (/etc/systemd/system/student_project_backup.timer; disabled; vendor preset: enabled)
   Active: active (waiting) since Mon 2022-01-24 11:17:30 CET; 5h 11min ago
  Trigger: Sat 2022-01-29 12:00:00 CET; 4 days left

looks ok to me
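
One detail worth double-checking: the status output above lists the timer as `disabled` while it is `active (waiting)`. That combination means the timer was started by hand and will fire on Saturday, but it is not set to start again after a reboot. A small sketch, with a hypothetical function name, that enables it only when needed:

```shell
# Enable a timer for boot if it is not enabled yet.
# Needs root for units in /etc/systemd/system.
ensure_timer_enabled() {
    timer="$1"
    if systemctl is-enabled --quiet "$timer"; then
        echo "$timer is already enabled"
    else
        systemctl enable "$timer"
    fi
}

# Usage:
#   ensure_timer_enabled student_project_backup.timer
```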

hexylena commented 2 years ago
root@asgard:/home/thor# journalctl -u student_project_backup | tail
jan 24 15:38:56 asgard.bioinformatics-atgm.nl bash[25454]: Chunk index:                   35251                35478
jan 24 15:38:56 asgard.bioinformatics-atgm.nl bash[25454]: ------------------------------------------------------------------------------
jan 24 15:38:56 asgard.bioinformatics-atgm.nl bash[25454]: terminating with success status, rc 0
jan 24 15:38:56 asgard.bioinformatics-atgm.nl bash[25454]: Mon Jan 24 15:38:56 CET 2022 Pruning repository of project-2022-Martijn
jan 24 15:38:58 asgard.bioinformatics-atgm.nl bash[25454]: Keeping archive: project-2022-Martijn-2022-01-24T15-38-53 Mon, 2022-01-24 15:38:55 [4bc3f11a59237655ec1e8201e7bf99ae5a4cdf29f051cea552925ffebf8af349]
jan 24 15:38:58 asgard.bioinformatics-atgm.nl bash[25454]: Pruning archive: project-2022-Martijn-2022-01-24T15-38-21 Mon, 2022-01-24 15:38:22 [8b707083ac7804b18bc01cffb65f2e01d18e2f44ab1ae2e765013a51c67f760d] (1/1)
jan 24 15:38:58 asgard.bioinformatics-atgm.nl bash[25454]: Keeping archive: project-2022-Martijn-2022-01-22T12-00-12 Sat, 2022-01-22 12:00:13 [372bdb9e985705c5e7e8a571e3c90b1b65b9ead77076c7e9d2eed5159ff08aa8]
jan 24 15:38:59 asgard.bioinformatics-atgm.nl bash[25454]: terminating with success status, rc 0
jan 24 15:38:59 asgard.bioinformatics-atgm.nl bash[25454]: Mon Jan 24 15:38:59 CET 2022 Backup and Prune finished successfully
jan 24 15:38:59 asgard.bioinformatics-atgm.nl systemd[1]: Started Backup.

that too

hexylena commented 2 years ago

nice!

Dirowa commented 2 years ago

Then this is finished, I believe.