frederikmoellers opened 5 years ago
According to the logs, there is an issue with your repo path. Maybe a drive not mounted? Does it work with the CLI directly? (see the logs for the precise arguments)
```
ERROR - Repository /backup/Default does not exist.
```
Yes, the path in question is an empty directory. I use the pre- and post-backup scripts to mount a samba share into the directory. The backup is then on the share. The pre-backup command checks if the server is available and mounts the share. The post-backup command unmounts the share.
Unfortunately, the execution of the pre- and post-backup scripts does not seem to appear in the log, so this is not visible. However, judging from the log messages, I suppose the execution order is like this:
1. `borg create` (succeeds b/c share is mounted and repo is available)
2. `borg prune` (fails b/c share is no longer mounted and repo cannot be found)
3. `borg list` (fails for same reason)

Imho the post-backup script should only be run after all backup-related commands have been executed, so after the execution of `borg list`.
I just confirmed this in the code:
In src/vorta/borg/create.py#L99, the pre-backup command is executed. When the thread ends (read: after `borg create` finishes), #L46 executes the post-backup command.
However, this only happens for `create`, not for any of `prune` or `list`. In prune.py, the pre- and post-backup commands are not executed. I think it should be possible to also have them execute for these subcommands.
If anyone can confirm that this would be useful, I can make a PR. Maybe it makes sense to execute the pre-backup script in BorgThread.prepare()? That would make sure that it's executed before any repository-related task.
Having the same issue here.
> If anyone can confirm that this would be useful, I can make a PR. Maybe it makes sense to execute the pre-backup script in BorgThread.prepare()? That would make sure that it's executed before any repository-related task.
Originally those pre/post-backup commands were meant to prepare the data for backup. Like e.g. Borgmatic does with their hooks.
If you need a command just to be able to connect to the repo, I would regard that as a different feature?
What are you guys using it for roughly? Probably not to prepare some data for backup (e.g. dump a database).
I use pre-backup/post-backup to mount and umount network storage. It is available only on certain networks so it can't be mounted all the time. And if it is not mounted before backup or prune jobs, it won't work.
The pre-backup script should run before any task that involves using the repository, and the post-backup script should run afterwards.
This is quite different from the current use case, which is like a `before_create`/`after_create` hook. You are suggesting `before/after_everything`. How would this look in the UI?
It could be either a different set of pre/post scripts or maybe a checkbox next to existing ones with text "run before/after everything" and an argument passed to the script to know what event just happened.
I guess an argument to the scripts would be easiest and least intrusive. It doesn't need any GUI changes and lets the script decide what to do depending on the context: DB dump scripts will only act on "backup" arguments, mount scripts for network shares might act on any action.
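A minimal sketch of what such a context-aware hook could look like in POSIX sh. The argument convention and the mount/dump commands here are hypothetical placeholders for the proposal, not an existing Vorta interface:

```sh
#!/bin/sh
# Hypothetical pre-backup hook. Assumption: Vorta passes the borg
# subcommand (create/prune/list/check/...) as the first argument.

run_hook() {
    # Mount scripts for network shares act on any action.
    echo "mounting network share"        # placeholder for e.g. mount -t cifs ...
    case "$1" in
        create)
            # DB dump scripts only act when an actual backup runs.
            echo "dumping database"      # placeholder for e.g. pg_dump ...
            ;;
    esac
}

run_hook "$1"
```

The same script could thus serve both use cases mentioned above, with the subcommand argument deciding which branches run.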
> a checkbox next to existing ones with text "run before/after everything" and an argument passed to the script to know what event just happened.
I can also live with a checkbox.
Yep, I have just come across this problem - I thought it was me for ages (pruning not running because the post-backup script to dismount the drive had been executed beforehand).
Is there a timeline for this?
> Is there a timeline for this?
Unfortunately, no. I will try to dedicate some time to this, but I can't give any estimates at the moment. If anyone is interested in working on this, feel free.
I have a PR almost ready with the functionality. One question arose, though:
Would it be useful to execute the commands on every action? This would even include running the post-backup-command after "mount", the pre-backup-command before "umount" and both commands around "version". For cases where the commands are used to e.g. mount filesystems, this doesn't make sense: `borg --version` can run without access to the repos and `borg mount` should not finish by unmounting the underlying filesystem (in the post-backup-command) where the repo is located (this would break the borg mount).
The only use case I can imagine is if the commands are used to make the borg executable available in the first place. But I don't know if this is really realistic. Then again, it would be most consistent to run the commands on every action and let the scripts decide when to execute what (based on the `$subcommand` environment variable that tells them what is being done).
So I can't decide between being consistent and more friendly for weird side-cases (execute always) or being more friendly to most actual use cases (execute only where it makes sense, e.g. not before/after version). I could use some opinions on this.
Would it be reasonable to consider this an "advanced user feature", with richer support for hooks, and not clutter the gui with toggle buttons that may confuse users who don't need the feature? What I mean is:
`hook_script.sh pre create` and `hook_script.sh post create`
The reason I think multiple cases should be supported and passed to the script as an argument is because `create` hooks are useful for things like snapshotting, but snapshot creation and destruction is not necessary or desired for check, list, or prune. I'm also not sure if Apple has completely walled off APFS snapshot support and if `tmutil` (the Time Machine infrastructure) is the only way to interact with them. At any rate, that's a tangential issue to this one; although, it introduces the question: what if the source paths don't have stable names? Time Machine snapshots have names like `com.apple.TimeMachine.2019-02-23-102421`, where 102421 appears to be some kind of transaction ID. Of course, if Apple users don't have the ability to snapshot a consistent database state (e.g. all those Photos and Music databases) then we can "Har! Har! Linux is better", but that doesn't seem very nice to me ;-) Anyways, hypothetically, it seems like `hook_script.sh pre create` could return a path to `com.apple.TimeMachine.2019-02-23-102421` ...but I suspect @m3nu might say that supporting this case is outside the scope of Vorta.
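The two-argument invocation proposed above could be dispatched roughly like this. Note that `hook_script.sh`, the pre/post phases, and the snapshot handling are all part of this proposal, not an existing Vorta feature; the echoes stand in for real snapshot tooling:

```sh
#!/bin/sh
# Sketch of the proposed hook API: $1 is the phase (pre/post),
# $2 is the Vorta operation (create/prune/list/check/...).

hook() {
    case "$1 $2" in
        "pre create")
            echo "creating filesystem snapshot"    # placeholder for APFS/LVM/btrfs snapshot creation
            ;;
        "post create")
            echo "destroying filesystem snapshot"  # tear down only what "pre create" set up
            ;;
        *)
            : ;;  # check, list and prune need no snapshot handling here
    esac
}

hook "$1" "$2"
```

Matching on the combined `"$1 $2"` string keeps each phase/operation pair to a single case label.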
@frederikmoellers
> Would it be useful to execute the commands on every action? This would even include running the post-backup-command after "mount", the pre-backup-command before "umount" and both commands around "version". For cases where the commands are used to e.g. mount filesystems, this doesn't make sense. `borg --version` can run without access to the repos and `borg mount` should not finish by unmounting the underlying filesystem (in the post-backup-command) where the repo is located (this would break the borg mount). The only use case I can imagine is if the commands are used to make the borg executable available in the first place. But I don't know if this is really realistic. Then again, it would be most consistent to run the commands on every action and let the scripts decide when to execute what (based on the `$subcommand` environment variable that tells them what is being done).

I'm also not sure how often this ability would be used, but it seems like users who installed Vorta from PyPI might use it for things like activating the virtualenvironment.

> So I can't decide between being consistent and more friendly for weird side-cases (execute always) or being more friendly to most actual use cases (execute only where it makes sense, e.g. not before/after version). I could use some opinions on this.
The included example script could handle the `borg --version` pre and post cases using the `*` case, which should be a noop. Users who want to do something exotic can write a function or copy their code into the `*` case, or define a `version` case. To be extra user-friendly I guess the example script could provide a case for `borg --version`, or maybe just a comment. There's also the question of support burden for more cases. For future compatibility reference during upgrades it might also be nice to see a quick-reference list of supported Vorta operations/cases near the top of the file. In a way it's the eternal question of "how much can we expect others to infer meaning" vs "explicit comments and documentation".

Other than that, I wonder about error handling, and what Vorta (and the example script) should do if a single command in the called case fails. Oh, and finally, logging considerations! It seems to me that each command in the example script should be logged by Vorta. @m3nu, what do you have to say about design considerations? The only other things I can think of are that the example script should be in POSIX sh and not bash, and that it should be written knowing that many users will be running it suid root or with setcap (to get a subset of CAP_ADMIN). Oh, and there's also the question of what should happen if a user's hook script doesn't have a `*` case and doesn't handle a case (for whatever reason). Should it error, warn, or silently noop?
And @frederikmoellers, thank you for working on this! P.S. I was thinking about working on this post-Debian 11 release (probably September'ish), but I'm happy to hear someone else has prioritised it :-)
Wow, that's a lot of (unexpected but very welcome) feedback :) Thanks a lot for giving this so much thought!
I really like the idea of having a simple UI with only one input box and leaving the configuration in the script to be done by advanced users. This gives us all the flexibility we might ever need without (as you say) cluttering the UI for regular users.
> Anyways, hypothetically, it seems like `hook_script.sh pre create` could return a path to `com.apple.TimeMachine.2019-02-23-102421`
That's an interesting use case. This approach certainly opens the possibility of e.g. pre-create hooks returning information which vorta/borg then uses for the backup. However, implementing this feedback channel seems like an awful lot of work and I'm not sure if there's a larger target audience for this feature. I recommend gathering some feedback before proceeding with this.
> I'm also not sure how often this ability would be used, but it seems like users who installed Vorta from PyPI might use it for things like activating the virtualenvironment.
Not sure I understand what you mean. Wouldn't they have to activate the virtualenv before starting vorta in the first place? Or do you mean users who installed borg from PyPI? Anyway, I do agree power users will eventually come up with a use case for this and given your approach (script is always called with 2 parameters; default behaviour is to pass and quit), it really makes no sense to not call it on every operation.
On the question of supported cases/subcommands: I suggest we just pass the subcommand to the script the same way it appears in vorta's (GUI) log (`check`/`list`/`prune`/`create`/`--version`). That way things should stay intact even if more subcommands become available. In the script I'd give examples for subcommands at the top and try to provide an extensive list, but always refer to borg and/or the vorta log for absolute certainty.
On error handling: I suggest we handle errors the same way we do now. If any script call returns with a code ≠0, we abort there and cancel the operation. Anything else is too complicated and error-prone imho. Whether the script itself aborts on every error is then left to the user. They can do elaborate error-handling or just abort on the first error as well.
I don't have a strong opinion on logging. Logging everything (stdout and stderr of the script and, consequently, of all programs called within) could clutter the log, but at the same time it would be the most user-friendly option imho. So unless anyone disagrees, I'm going to prepare the PR to log everything. After all, users can always redirect to /dev/null in their script :)
I'll try to write a good example using POSIX SH and to make sure that users understand how it's going to be executed and what happens if they remove cases.
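A minimal POSIX sh skeleton following the abort-on-nonzero convention suggested above. This is a sketch under the assumption (from this thread's proposal) that Vorta cancels the pending operation on any non-zero exit code:

```sh
#!/bin/sh
# Exit on the first failing command and on unset variables, so any
# error surfaces as a non-zero exit code.
set -eu

run_step() {
    # Run one preparation step; report and propagate a failure.
    "$@" || { echo "step failed: $*" >&2; return 1; }
}

# Placeholder step; a real hook would e.g. run_step mount /backup
run_step true && echo "all steps succeeded"
```

Whether the script does elaborate error handling or just dies on the first error is then entirely up to the user, exactly as described above.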
> And @frederikmoellers, thank you for working on this! P.S. I was thinking about working on this post-Debian 11 release (probably September'ish), but I'm happy to hear someone else has prioritised it :-)
Well what can I say… being directly affected and needing a feature is always the best motivation ;) At the moment the need is gone for me (which is why I haven't put any pressure on finishing the PR) but with your feedback I'll try to get this done now.
Yeah, good idea to move it into scripts. Should those scripts live in the settings folder or a user-defined path?
One may want different scripts based on the profile? Or a way to temporarily disable them from the UI.
In addition to passing the subcommand as argument, it could pass more details as env vars, e.g. `VORTA_PROFILE_NAME`, `VORTA_REPO_URL`, ... Then the docs would have a sample script on how to deal with different subcommands?
```sh
case "$1" in
    pre_create)
        # do stuff before borg create
        ;;
    post_create)
        # after borg create
        ;;
    pre_prune)
        # before borg prune
        ;;
    *)
        # default condition to be executed
        ;;
esac
```
Hi there,
is the thread up to date regarding this feature? I tried to check the documentation but couldn't find any mention of it.
Well the thread is up to date, but unfortunately there's no PR yet. I have to admit I haven't looked into this for quite a while since I stopped doing backups on network mounts, but I will get back to this now. You can expect a PR in a few days and there we can have a final discussion on where to put the scripts and the other open questions.
The hook names could look like the ones from borgmatic.
> Yeah, good idea to move it into scripts. Should those scripts live in the settings folder or a user-defined path?
The user could specify the file path.
> One may want different scripts based on the profile? Or a way to temporarily disable them from the UI.
A per profile entry would allow for multiple scripts while not forcing multiple. Some settings regarding scripts could be useful.
> In addition to passing the subcommand as argument, it could pass more details as env vars, e.g. `VORTA_PROFILE_NAME`, `VORTA_REPO_URL`, ... Then the docs would have a sample script on how to deal with different subcommands?
Isn't that already done for the current scripts? The hook name could also be passed as an env var so that the user can decide whether to pass it as an argument to the script.
Frederik Möllers @.***> writes:
> Wow, that's a lot of (unexpected but very welcome) feedback :) Thanks a lot for giving this so much thought!
Thanks! :-) I'm happy you like this approach. Sorry I missed your reply until now!
> Anyways, hypothetically, it seems like `hook_script.sh pre create` could return a path to `com.apple.TimeMachine.2019-02-23-102421`
> That's an interesting use case. This approach certainly opens the possibility of e.g. pre-create hooks returning information which vorta/borg then uses for the backup. However, implementing this feedback channel seems like an awful lot of work and I'm not sure if there's a larger target audience for this feature. I recommend gathering some feedback before proceeding with this.
I'm assuming Apple's snapshot program has an interface like:
```
$ make_TimeMachine_snapshot SOURCE
# ... Time Machine makes snapshot, and echoes
snapshot created in com.apple.TimeMachine.2019-02-23-102421
```
LVM, ZFS, and btrfs can do this. Fedora, RHEL, and SUSE all default to either LVM or btrfs.
I've lost mail due to IMAP IDLE PUSH notifying an email daemon (in this case it was Akonadi) that new mail was available. Akonadi started downloading the mail, and as that was happening my scheduled backup occurred. After experiencing hardware failure the next day I tried to restore my mail from backup and learned that some was missing (database consistency issue). IIRC this is how Apple Mail works too.
The two morals of this story for me were: 1) Don't trust email storage in anything but a maildir, and/or 2) Always quiesce databases, then snapshot the filesystem before making a backup; this ensures consistency.
I'm assuming that Time Machine is sane and quiesces databases controlled by Apple software, and then makes an APFS snapshot before making the Time Machine backup.
Vorta should not be inferior to Time Machine in ensuring backup consistency.
> I'm also not sure how often this ability would be used, but it seems like users who installed Vorta from PyPI might use it for things like activating the virtualenvironment.
> Not sure I understand what you mean. Wouldn't they have to activate the virtualenv before starting vorta in the first place? Or do you mean users who installed borg from PyPI?
Yes, thank you, that's what I meant :-)
> Anyway, I do agree power users will eventually come up with a use case for this and given your approach (script is always called with 2 parameters; default behaviour is to pass and quit), it really makes no sense to not call it on every operation.
Agreed.
> On the question of supported cases/subcommands: I suggest we just pass the subcommand to the script the same way it appears in vorta's (GUI) log (`check`/`list`/`prune`/`create`/`--version`). That way things should stay intact even if more subcommands become available. In the script I'd give examples for subcommands at the top and try to provide an extensive list, but always refer to borg and/or the vorta log for absolute certainty.
Agreed. Also, I wonder if the example script should ideally contain a sort of "hook API" check, because backwards and forwards compatibility are not guaranteed, and users could then diff the example script between Vorta versions to see how they need to modify their script.
> On error handling: I suggest we handle errors the same way we do now. If any script call returns with a code ≠0, we abort there and cancel the operation. Anything else is too complicated and error-prone imho. Whether the script itself aborts on every error is then left to the user. They can do elaborate error-handling or just abort on the first error as well.
This makes sense for errors within the script. For Borg or Vorta errors, the script needs to be able to clean up after itself. For this, the hook API needs to support a "cleanup" argument to the script.
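Such a hypothetical "cleanup" phase would just be another case in the same dispatch. A sketch; the phase names are this thread's proposal, not an implemented API, and the echoes stand in for real mount commands:

```sh
#!/bin/sh
# Hypothetical phases: "pre" before the operation, "post" after success,
# "cleanup" if borg or Vorta fail mid-operation (so the hook can undo
# whatever its "pre" phase set up).

hook() {
    case "$1" in
        pre)     echo "share mounted" ;;
        post)    echo "share unmounted" ;;
        cleanup) echo "share unmounted (after error)" ;;  # same teardown as post
    esac
}

hook "$1"
```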
> I don't have a strong opinion on logging. Logging everything (stdout and stderr of the script and, consequently, of all programs called within) could clutter the log, but at the same time it would be the most user-friendly option imho. So unless anyone disagrees, I'm going to prepare the PR to log everything. After all, users can always redirect to /dev/null in their script :)
:) Sounds good to me!
> I'll try to write a good example using POSIX SH and to make sure that users understand how it's going to be executed and what happens if they remove cases.
Thank you!
> In addition to passing the subcommand as argument, it could pass more details as env vars, e.g. `VORTA_PROFILE_NAME`, `VORTA_REPO_URL`, ... Then the docs would have a sample script on how to deal with different subcommands?
I recommend avoiding env vars because of the potential issues they could cause with virtualenvs, Flatpaks, and Snaps. I expect that with increasingly secure namespace barriers this will one day definitely break. This method also will need to be dropped when secure sudo/doas support is (hopefully) one day added, because the normal user's env vars must not be exported into the secure superuser (or more limited CAP_ADMIN) environment.
> Isn't that already done for the current scripts? The hook name could also be passed as an env var so that the user can decide whether to pass it as an argument to the script.
What do you mean?
> What do you mean?
Below the pre- and post-backup command entries vorta states that the following env vars are available: `$repo_url`, `$profile_name`, `$profile_slug`, `$returncode`.
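For illustration, a post-backup command could use those documented variables roughly like this. A sketch only: the `notify` helper and the default values are made up for the example; Vorta sets the real variables at runtime:

```sh
#!/bin/sh
# Sketch of a post-backup notification using the env vars Vorta already
# documents for the pre-/post-backup fields ($profile_name, $returncode).

notify() {
    # $1 = profile name, $2 = borg return code
    if [ "$2" -eq 0 ]; then
        echo "backup for $1 succeeded"
    else
        echo "backup for $1 failed with code $2" >&2
    fi
}

# Fallback defaults (":-") are for demonstration outside of Vorta only.
notify "${profile_name:-demo}" "${returncode:-0}"
```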
> I recommend avoiding env vars because of the potential issues they could cause with virtualenvs, Flatpaks, and Snaps. I expect that with increasingly secure namespace barriers this will one day definitely break. This method also will need to be dropped when secure sudo/doas support is (hopefully) one day added, because the normal user's env vars must not be exported into the secure superuser (or more limited CAP_ADMIN) environment.
Then we could allow the user to use placeholders in the entry where one defines the script/command.
> Well the thread is up to date, but unfortunately there's no PR yet. I have to admit I haven't looked into this for quite a while since I stopped doing backups on network mounts, but I will get back to this now. You can expect a PR in a few days and there we can have a final discussion on where to put the scripts and the other open questions.
Thanks for your answer, and sorry for the delay, was on vacation. I just wanted to be sure I wasn’t missing anything on my own setup.
Thank you very much!
What's the difference between "Run script before and after all borg commands" and "Run script before and after every borg command"?
Noticed the same. The UI is clearly not final and has logical issues.
I would look at this thread and see what use case most people want to solve. Then solve that and try to be flexible without making it too complex on the UI or the code. Maybe just 2 textboxes to run before `create` or all borg commands?
I'm glad there's already a thread for this with a lot of good discussion. I would like to add my vote to this feature!
I am not using a network share, but a USB external drive that I don't want to auto-mount at boot or login. Partly so I'm less tempted to use it for non-backup stuff, and partly so that in the unlikely event my machine gets scrambled or ransomwared, that drive and my backups should be untouched.
I also was surprised when the `prune`, `list`, and `check` actions failed after `create` finished due to the drive being unmounted already. I think the text in the GUI may need clarification, if this feature isn't going to be implemented soon. Specifically, on the Shell Commands tab, the pre- and post-backup commands talk about the "backup" while the extra arguments field talks about "borg create".
In my mind, this automatically translated into "the whole backup process", which would include `prune` and anything else that happens automatically with each backup run, scheduled or manual, versus "the creation of a new snapshot". If the text around the pre- and post-backup commands had also stated it only applied to `borg create`, then I would have been disappointed, but wouldn't have tried using them for mounting and unmounting my drive in the first place. I probably still would have ended up here requesting the feature, though :joy:
Somewhere between 0.6.10 and 0.6.22 vorta introduced a change that made sure that pre- and post-backup commands are run before Repo checks. I think this was PR #264. This enabled me to use pre- and post-backup scripts for mounting and unmounting a samba share where the actual backup is stored.
However, these commands seem to only run when an actual backup task is being executed (`borg create`). Most importantly, it seems that the post-backup command is run after the backup finished but before the pruning is done. This results in the prune command failing (repo not found) and thus no pruning happening at all.

In vorta's GUI log I see that the return codes are nonzero whenever the command tries to access the repo, but not when performing a backup via `borg create`:

In the log file I see that the post-backup command seems to run directly after `create`, but before `prune`. Furthermore, I see that all commands except `borg create` seem to fail:

I'm currently using vorta 0.6.22, but the changelog of 0.6.23 doesn't mention anything related to this.