restic / restic

Fast, secure, efficient backup program
https://restic.net
BSD 2-Clause "Simplified" License

Forget: show reason why oldest snapshot in a group is kept #4806

Closed MrJack91 closed 1 month ago

MrJack91 commented 5 months ago

Output of restic version

restic 0.16.4 compiled with go1.21.6 on linux/amd64

What backend/service did you use to store the repository?

local

Problem description / Steps to reproduce

I set up a new repo and manually created 3 snapshots, all on the same day, two of them in the same hour (09:xx).

I then ran forget --keep-hourly 24, which should keep at most one snapshot per hour for the latest 24 hours.

However, forget --keep-hourly keeps too many backups. The snapshots from 09:36:51 and 09:41:06 (same day) are both kept with the reason "hourly snapshot", but only the one from 09:41:06 should be kept.

$ restic forget -r [path_to_repo] --prune --keep-hourly 24 --dry-run

repository 0fc3d186 opened (version 2, compression level auto)
Applying Policy: keep 24 hourly snapshots
snapshots for (host [xxx-machine], paths [/etc, /home, /root, /usr/local, /var/spool/cron]):
keep 3 snapshots:
ID        Time                 Host         Tags        Reasons          Paths
----------------------------------------------------------------------------------------
0729c5c9  2024-05-17 09:36:51  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron

fdee0afa  2024-05-17 09:41:06  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron

f03c4f42  2024-05-17 11:54:19  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron
----------------------------------------------------------------------------------------
3 snapshots

Expected behavior

The older backup in the 09:xx hour (0729c5c9 at 09:36:51) should be deleted, because there is a newer one in the same hour.

So command output should display:

$ restic forget -r [path_to_repo] --prune --keep-hourly 24 --dry-run

repository 0fc3d186 opened (version 2, compression level auto)
Applying Policy: keep 24 hourly snapshots
snapshots for (host [xxx-machine], paths [/etc, /home, /root, /usr/local, /var/spool/cron]):
keep 2 snapshots:
ID        Time                 Host         Tags        Reasons          Paths
----------------------------------------------------------------------------------------
fdee0afa  2024-05-17 09:41:06  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron

f03c4f42  2024-05-17 11:54:19  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron
----------------------------------------------------------------------------------------
2 snapshots
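The expectation described above (only the newest snapshot of each hour survives) can be sketched in a few lines of Python. This models the reporter's assumption only, not what restic actually does; `newest_per_hour` is a hypothetical helper name, and the timestamps are taken from the output above.

```python
from datetime import datetime


def newest_per_hour(snapshot_times):
    """Keep only the newest snapshot within each clock hour (hypothetical helper)."""
    newest = {}
    for ts in snapshot_times:
        # Truncate to the start of the hour to get the bucket key.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour not in newest or ts > newest[hour]:
            newest[hour] = ts
    return sorted(newest.values())


times = [
    datetime(2024, 5, 17, 9, 36, 51),   # 0729c5c9
    datetime(2024, 5, 17, 9, 41, 6),    # fdee0afa
    datetime(2024, 5, 17, 11, 54, 19),  # f03c4f42
]

kept = newest_per_hour(times)
for ts in kept:
    print(ts)
```

Under this assumption, only the 09:41:06 and 11:54:19 snapshots remain, which is the two-snapshot result shown above.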

Actual behavior

The backup 0729c5c9 stays and is not forgotten by restic, which is wrong in my opinion.

Do you have any idea what may have caused this?

No. I already found various related problems in the forum and in other issues, but they were mostly about time zones or confusion between the --keep-within and --keep-* options.

Did restic help you today? Did it make you happy in any way?

Restic is awesome. Thank you very much for all your work. I have already used it often to retrieve an older version of a config file.

MrJack91 commented 5 months ago

OK. It seems that this only deviates from what I expected for the initial snapshot. The initial one is kept, whatever happens later.

For all other snapshots it works as expected.

I think it would be helpful to keep this issue around to avoid initial confusion for others.

MichaelEischer commented 4 months ago

This is actually intended behavior (although a bit unexpected), see https://restic.readthedocs.io/en/stable/060_forget.html#removing-snapshots-according-to-a-policy :

If there are not enough snapshots to keep one for each duration related --keep-{within-,}* option, the oldest snapshot is kept additionally.

That special case isn't particularly relevant for --keep-hourly. However, for --keep-yearly 2 with a backup that contains only a few months of snapshots, it's rather helpful to keep both the oldest and the latest snapshot of this year.
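The documented rule can be illustrated with a simplified Python model. This is a sketch based only on the documentation sentence quoted above, not restic's actual implementation: `apply_keep`, its parameters, and the choice to label a snapshot "oldest ..." only when it is kept solely because of this rule are assumptions made for illustration. The special "keep all" count is not modeled.

```python
from datetime import datetime


def apply_keep(snapshot_times, count, bucket_format, label):
    """Return {time: reason} for the snapshots one keep-* rule would retain.

    Hypothetical helper: a simplified model of the documented rule,
    not restic's actual code.
    """
    ordered = sorted(snapshot_times, reverse=True)  # newest first
    kept = {}
    last_bucket = None
    remaining = count
    for index, ts in enumerate(ordered):
        if remaining <= 0:
            break  # the rule already kept as many snapshots as requested
        bucket = ts.strftime(bucket_format)  # e.g. "2024-05-17 09" for hourly
        is_oldest = index == len(ordered) - 1
        if bucket != last_bucket:
            kept[ts] = f"{label} snapshot"  # newest snapshot of a new bucket
        elif is_oldest:
            # The rule still has capacity left, so the oldest snapshot is kept too.
            kept[ts] = f"oldest {label} snapshot"
        else:
            continue
        last_bucket = bucket
        remaining -= 1
    return kept


issue_times = [
    datetime(2024, 5, 17, 9, 36, 51),   # 0729c5c9
    datetime(2024, 5, 17, 9, 41, 6),    # fdee0afa
    datetime(2024, 5, 17, 11, 54, 19),  # f03c4f42
]

hourly = apply_keep(issue_times, 24, "%Y-%m-%d %H", "hourly")
for ts, reason in sorted(hourly.items()):
    print(ts, reason)
```

With the three snapshots from this issue and a count of 24, the model keeps all three, and 09:36:51 is kept only because it is the oldest snapshot. With a count of 2 the rule is already satisfied by the two newest hours, so the oldest snapshot is not kept additionally. Passing `"%Y"` and `"yearly"` models the --keep-yearly case in the same way.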

MrJack91 commented 4 months ago

Oh, sorry. I did not see this. Thank you very much for your answer and all your work!

To reduce confusion, maybe we could also show this as a reason in the output of the restic forget command. Then everyone would understand it.

MichaelEischer commented 4 months ago

Here you go: https://github.com/restic/restic/pull/4820

MrJack91 commented 4 months ago

@MichaelEischer: Thank you for your effort. Just to be clear: I think it would be helpful to add this information to the forget command output (not the help).

So in the Reasons column, where we currently have hourly snapshot, there should be the additional reason: oldest-snapshot

$ restic forget -r [path_to_repo] --prune --keep-hourly 24 --dry-run

repository 0fc3d186 opened (version 2, compression level auto)
Applying Policy: keep 24 hourly snapshots
snapshots for (host [xxx-machine], paths [/etc, /home, /root, /usr/local, /var/spool/cron]):
keep 3 snapshots:
ID        Time                 Host         Tags        Reasons          Paths
----------------------------------------------------------------------------------------
0729c5c9  2024-05-17 09:36:51  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron

fdee0afa  2024-05-17 09:41:06  xxx-machine              hourly snapshot  /etc
                                                                         /home
                                                                         /root
                                                                         /usr/local
                                                                         /var/spool/cron

That is why I was confused: I only checked this column. I guess others could easily be confused too if this information is missing there. To me it feels incomplete. If a reason is displayed, then all reasons should be displayed.

MichaelEischer commented 4 months ago

Any suggestion for a compact reason that would be understandable? Would oldest hourly snapshot work maybe?

MrJack91 commented 4 months ago

To stay consistent with the terms in the docs, maybe include the word additional, e.g. additional oldest hourly snapshot.

There will be only one per keep criterion, so I at least would not mind a reason that wraps onto a second line.

MrJack91 commented 4 months ago

I'm wondering whether the snapshot keyword is necessary at all, since forget lists snapshots by definition.

konidev20 commented 1 month ago

@MrJack91

Here is the list of snapshots before running forget:

konidev@lima-vm-1:~/restic$ ./restic snapshots
repository b02926f5 opened (version 2, compression level auto)
ID        Time                 Host        Tags        Paths                    Size
-----------------------------------------------------------------------------------------
44f7711e  2024-09-02 23:54:46  lima-vm-1               /home/konidev/test-data  1.024 GiB
2e4b4772  2024-09-02 23:55:47  lima-vm-1               /home/konidev/test-data  1.024 GiB
2881c5f0  2024-09-02 23:57:34  lima-vm-1               /home/konidev/test-data  2.040 GiB
b6d26123  2024-09-02 23:57:45  lima-vm-1               /home/konidev/test-data  2.040 GiB
-----------------------------------------------------------------------------------------
4 snapshots

Here is the sample output after the change:

konidev@lima-vm-1:~/restic$ ./restic forget --keep-daily 24 --dry-run
repository b02926f5 opened (version 2, compression level auto)
Applying Policy: keep 24 daily snapshots
keep 2 snapshots:
ID        Time                 Host        Tags        Reasons                Paths                    Size
----------------------------------------------------------------------------------------------------------------
44f7711e  2024-09-02 23:54:46  lima-vm-1               oldest daily snapshot  /home/konidev/test-data  1.024 GiB
b6d26123  2024-09-02 23:57:45  lima-vm-1               daily snapshot         /home/konidev/test-data  2.040 GiB
----------------------------------------------------------------------------------------------------------------
2 snapshots

remove 2 snapshots:
ID        Time                 Host        Tags        Paths                    Size
-----------------------------------------------------------------------------------------
2e4b4772  2024-09-02 23:55:47  lima-vm-1               /home/konidev/test-data  1.024 GiB
2881c5f0  2024-09-02 23:57:34  lima-vm-1               /home/konidev/test-data  2.040 GiB
-----------------------------------------------------------------------------------------
2 snapshots

Would have removed the following snapshots:
{2881c5f0 2e4b4772}

If there is consensus, I can add the word additional to the reason for consistency with the documentation. IMO it should be fine to leave it out, since the word oldest already conveys that this is an additional reason for the snapshot being retained.

MrJack91 commented 1 month ago

@konidev20 Great, that looks good to me. Yes, sure, that's fine with me; "additional" is not necessary.

Thank you very much for your work!