benjamin051000 opened 1 year ago
Looks like this is the spot where it's failing, line 94:
https://github.com/djmaze/resticker/blob/65e361de864f11ffe353dbd6aa1d253eb8ecfb7b/backup#L93-L94
1.6.0 doesn't appear to have this issue and makes it past the forget flag. I'll stay on that version until I hear about a fix
https://github.com/restic/restic/issues/3491 may be related?
Same issue here, no idea how to downgrade tho once the repo is in the newer state
Not using b2 myself. Anyone want to try out the `mazzolino/restic:latest` image, which now contains Restic 0.16.0?
Has there been any progress on this issue? Checking my restic logs today, I realized my backups haven't been working for several weeks and found this error in my logs. Following the comment above, I pulled version 1.7.1 and this error seems to still be there.
```
Forget about old snapshots based on RESTIC_FORGET_ARGS = --keep-last 10 --keep-daily 7 --keep-weekly 5 --keep-monthly 12
repo already locked, waiting up to 0s for the lock
unable to create lock in backend: repository is already locked by PID 29 on restic_server by root (UID 0, GID 0)
lock was created at 2023-12-07 01:25:45 (922h55m39.463073982s ago)
storage ID 08cb065a
the `unlock` command can be used to remove stale locks
```
Based on restic/restic#2736, it appears that the current guidance is to basically use the `unlock` command before running other commands.
@littlegraycells Yes, manually unlocking is still the suggested advice.
To be more precise, I would suggest to always have a monitoring solution for backups (which you do not seem to have). For example by sending emails in error cases (as shown in the documentation). Or, even better, using something like Healthchecks in order to make sure failures are not being missed. (I can warmly recommend the latter one, you can also host it yourself!)
For me, having about 15 different servers, this procedure works very well.
That said, I can see that an auto-unlock solution as proposed in the restic issue could work. But that should be implemented there.
@djmaze Thanks. I do run a self-hosted version of healthchecks.io currently.
Would you recommend running the `unlock` command with `PRE_COMMANDS` in the backup container?
Hey @djmaze, that's an interesting comment. I use Uptime Kuma to observe services directly, as well as containers and their healthcheck status. I did not know that healthchecks can be self-hosted!
Irrespective, what I found challenging with resticker is:

- I do not want a notification if there is just a one-time sync issue. That can happen and is not a problem.
- I want a warning after x consecutive unsuccessful backup attempts. I want a warning if the backup did not run for x days.

Do you have a solution to this? Cheers!

> I do not want a notification if there is just a one-time sync issue. That can happen and is not a problem.

That's what `POST_COMMANDS_INCOMPLETE` is for. (Tbh, I personally do not (yet) use it because I have very few failures and it does not bug me.)

> I want a warning after x consecutive unsuccessful backup attempts. I want a warning if the backup did not run for x days.

Mhh.. We could implement this in resticker. But if you are using Healthchecks, you could also solve it by just pinging Healthchecks using `POST_COMMANDS_SUCCESS` and then setting the healthcheck grace period to x days. So Healthchecks will notify you when the grace period has been exceeded.
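For reference, a rough sketch of what that wiring could look like in the compose file; the Healthchecks host and check UUID are placeholders, and it assumes `curl` is available inside the container:

```yaml
services:
  backup:
    image: mazzolino/restic
    environment:
      # Ping Healthchecks only after a successful backup run; in the
      # Healthchecks UI, set the check's grace period to "x days" so a
      # missed ping triggers a notification.
      POST_COMMANDS_SUCCESS: |-
        curl -fsS -m 10 --retry 3 https://hc.example.com/ping/your-check-uuid
```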
> Would you recommend running the `unlock` command with `PRE_COMMANDS` in the backup container?
If you have only one host using the repository, this might make sense, but if there is more than one (as is the case e.g. when running prunes on a bigger server, like I do) in my opinion that is too dangerous.
(I could agree with a solution which automatically removes locks that are e.g. > 24 hours old. But as I said I would prefer this to be solved upstream.)
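For completeness, if you really do have only a single host writing to the repository, a rough sketch of the `PRE_COMMANDS` approach discussed above could look like the fragment below (image and repository values are placeholders, not a recommendation):

```yaml
services:
  backup:
    image: mazzolino/restic
    environment:
      RESTIC_REPOSITORY: b2:your-bucket   # placeholder repository
      # Remove stale locks before every scheduled run. As cautioned above,
      # this is only a reasonable idea when no other host uses the repository.
      PRE_COMMANDS: |-
        restic unlock
```

A one-off manual unlock (the guidance from the restic issue referenced earlier) can also be done with something like `docker compose exec backup restic unlock`, assuming the service is called `backup`.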
Well, currently resticker is completely unusable for many people because of the issue detailed above. Every time it tries to back up, it goes into an infinite loop trying to lock the repo.
@razaqq Well, afaics there is still no reproducible test case.
As another workaround, you could also remove `RESTIC_FORGET_ARGS` and run the `forget` command manually at times.
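For illustration, a sketch of that workaround, reusing the keep policy quoted earlier (service and bucket names are placeholders):

```yaml
services:
  backup:
    image: mazzolino/restic
    environment:
      # RESTIC_FORGET_ARGS is left out on purpose, so the scheduled job only
      # backs up and never reaches the forget step that was failing above.
      RESTIC_REPOSITORY: b2:your-bucket   # placeholder repository
# Run the forget step by hand from time to time instead, for example:
#   docker compose exec backup restic forget \
#     --keep-last 10 --keep-daily 7 --keep-weekly 5 --keep-monthly 12
```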
I am also running into this issue.
`PRE_COMMANDS` with `restic unlock` also doesn't seem to work for me.
If I unlock the repository from another machine it starts backing up again for a while only to get locked again.
See:
```
2024-05-02 13:58:07.091559+02:00 Checking configured repository 'rclone:google-drive:backups/restic' ...
2024-05-02 13:58:12.984656+02:00 unable to create lock in backend: repository is already locked exclusively by PID 1425 on restic-backup-custom-app-859c787754-w8n9n by root (UID 0, GID 0)
2024-05-02 13:58:12.984702+02:00 lock was created at 2024-04-10 05:15:04 (536h43m8.169813168s ago)
2024-05-02 13:58:12.984710+02:00 storage ID de9659b9
2024-05-02 13:58:12.984716+02:00 the `unlock` command can be used to remove stale locks
2024-05-02 13:58:12.984741+02:00 Could not access the configured repository.
2024-05-02 13:58:12.984748+02:00 Trying to initialize (in case it has not been initialized yet) ...
2024-05-02 13:58:14.908435+02:00 Fatal: create repository at rclone:google-drive:backups/restic failed: config file already exists
2024-05-02 13:58:14.908483+02:00 2024-05-02T13:58:14.908483735+02:00
2024-05-02 13:58:14.908621+02:00 Initialization failed. Please see error messages above and check your configuration. Exiting.
```
@thierrybla It would help if you could find out the original container / job that the lock came from. In your example the lock is quite old, maybe it was a prune which did not finish (because of lack of memory or similar)?
It should not be lack of memory; I am running 128 GB of RAM and it is not near full at any time.
Running resticker latest, trying to back up both my docker volumes and a folder in my homedir to backblaze. I also use immich, so I dump the immich db with the before command, and exclude some folders I don't want backed up.
docker-compose (running as a portainer stack):
log:
Any ideas on why it fails after backup is complete? I even see the repo in backblaze. It seems like the step it's failing on is scheduling the cron task. Any help would be greatly appreciated, thanks!