wasiualhasib closed this issue 3 months ago
Hello 👋
You are using the backup_method = rsync method with reuse_backup = link, right?
How to identify which one is incremental in size and which one is full backup?
When using the rsync method, Barman uses hard links to create file-level incremental backups.
So, you take your first backup, which is "full", then each new backup will copy only the files which have been modified since the previous backup. The files which were not modified have a simple hard-link to the corresponding file from the previous backup. That essentially means the backups of the server will share the files which were not modified between the backups.
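To make the idea concrete, here is a minimal Python sketch of a file-level incremental copy built on hard links. It is not Barman's or rsync's implementation: the `take_backup` helper, the file names, and the flat directory layout are all hypothetical, and it compares file contents for simplicity, which is not necessarily how rsync decides whether a file changed.

```python
import filecmp
import os
import shutil
import tempfile


def take_backup(source_dir, dest_dir, previous_dir=None):
    """Copy a flat directory, hard-linking files unchanged since previous_dir.

    Returns the number of bytes actually copied (the "incremental size").
    Hypothetical helper for illustration only.
    """
    copied_bytes = 0
    os.makedirs(dest_dir)
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        dst = os.path.join(dest_dir, name)
        prev = os.path.join(previous_dir, name) if previous_dir else None
        if prev and os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
            # Unchanged file: add a second name for the data already on disk.
            os.link(prev, dst)
        else:
            # New or modified file: store a fresh copy.
            shutil.copy2(src, dst)
            copied_bytes += os.path.getsize(dst)
    return copied_bytes


with tempfile.TemporaryDirectory() as root:
    data = os.path.join(root, "pgdata")
    os.makedirs(data)
    with open(os.path.join(data, "table_a"), "wb") as f:
        f.write(b"a" * 1000)
    with open(os.path.join(data, "table_b"), "wb") as f:
        f.write(b"b" * 2000)

    backup1 = os.path.join(root, "backup1")
    backup2 = os.path.join(root, "backup2")
    first_copied = take_backup(data, backup1)

    # Modify one file between the two backups.
    with open(os.path.join(data, "table_b"), "wb") as f:
        f.write(b"B" * 2500)
    second_copied = take_backup(data, backup2, previous_dir=backup1)

    # Total size of the second backup, counting every file it needs.
    second_total = sum(
        os.path.getsize(os.path.join(backup2, n)) for n in os.listdir(backup2)
    )
    # The unchanged file is the same inode in both backups.
    shared_inode = (
        os.stat(os.path.join(backup1, "table_a")).st_ino
        == os.stat(os.path.join(backup2, "table_a")).st_ino
    )

print(first_copied, second_copied, second_total, shared_inode)
```

In this toy run the second backup needs every file it contains in order to be restored, but only the modified file was actually copied; the unchanged one is shared with the first backup.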
The output of the barman list-backup command shows the total size of each backup, i.e. counting all the files the backup needs in order to be restored.
If you want to check the incremental size of a given backup, you can use the barman show-backup command.
For example, I have these 2 backups in my server:
$ barman list-backup pg17-rsync
pg17-rsync 20240611T180600 - Tue Jun 11 18:06:51 2024 - Size: 41.7 MiB - WAL Size: 0 B
pg17-rsync 20240610T182149 - Mon Jun 10 18:21:52 2024 - Size: 38.2 MiB - WAL Size: 32.0 MiB
I took an initial backup 20240610T182149, then created a 3MB table in Postgres, and lastly took a new backup 20240611T180600. As you can see, the size of my last backup 20240611T180600 includes the original files from the first backup plus the files modified in the meantime.
I can now use the barman show-backup command to check the incremental size of each of them:
$ barman show-backup pg17-rsync 20240610T182149
Backup 20240610T182149:
Server Name : pg17-rsync
System Id : 7377136404149220174
Status : DONE
PostgreSQL Version : 170000
PGDATA directory : /var/lib/pgsql/17/data
Base backup information:
Disk usage : 22.2 MiB (38.2 MiB with WALs)
Incremental size : 22.2 MiB (-0.00%)
Timeline : 1
Begin WAL : 00000001000000000000001B
End WAL : 00000001000000000000001B
WAL number : 1
Begin time : 2024-06-10 18:21:49.970096+00:00
End time : 2024-06-10 18:21:52.601712+00:00
Copy time : 1 second
Estimated throughput : 13.8 MiB/s
Begin Offset : 96
End Offset : 344
Begin LSN : 0/1B000060
End LSN : 0/1B000158
WAL information:
No of files : 2
Disk usage : 32.0 MiB
WAL rate : 0.13/hour
Last available : 00000001000000000000001D
Catalog information:
Retention Policy : not enforced
Previous Backup : - (this is the oldest base backup)
Next Backup : 20240611T180600
$ barman show-backup pg17-rsync 20240611T180600
Backup 20240611T180600:
Server Name : pg17-rsync
System Id : 7377136404149220174
Status : DONE
PostgreSQL Version : 170000
PGDATA directory : /var/lib/pgsql/17/data
Base backup information:
Disk usage : 25.7 MiB (41.7 MiB with WALs)
Incremental size : 5.1 MiB (-80.33%)
Timeline : 1
Begin WAL : 00000001000000000000001D
End WAL : 00000001000000000000001D
WAL number : 1
Begin time : 2024-06-11 18:06:00.203819+00:00
End time : 2024-06-11 18:06:51.852541+00:00
Copy time : less than one second
Estimated throughput : 5.4 MiB/s
Begin Offset : 40
End Offset : 400
Begin LSN : 0/1D000028
End LSN : 0/1D000190
WAL information:
No of files : 0
Disk usage : 0 B
Last available : 00000001000000000000001D
Catalog information:
Retention Policy : not enforced
Previous Backup : 20240610T182149
Next Backup : - (this is the latest base backup)
So, my second backup occupies 25.7 MiB of disk if I consider the whole backup, but it introduced 5.1 MiB worth of files.
With incremental backups there is no option for compression. So if the database size is 1 TB, I have to keep at least twice the actual size of the backup in storage. That means compression and incremental backups are not possible at the same time. To use compression we need backup_method=postgres
As you noted, backup compression is only supported with backup_method = postgres. In that sense, with the current implementation you need to choose between incremental backups through the rsync method and compressed backups through the postgres method.
Hi @barthisrael,
Yes, you are right; you explained it clearly. I have used that command to understand the incremental size. In your show-backup output, Incremental size: 5.1 MiB (-80.33%), what is the meaning of -80.33% and how is it calculated?
My assumption is that it is calculated like this: (5.1/25.7)-1=(-80.155%).
But that does not exactly match -80.33%. Also, what does the negative sign mean here?
Another issue: if I use the streaming protocol instead of the barman WAL archiving command in archive_command in PostgreSQL, incremental backup does not work for me. In that case, it takes a full backup instead of an incremental one, even though backup_method is rsync and reuse_backup=link.
Basically, for a full backup every night I call the command below:
barman backup --reuse-backup=off pg
And for an incremental backup every 6 hours I call the command below. As per the configuration, it should take an incremental backup by default.
barman backup pg
Could you please correct me if I am wrong?
what is the meaning of -80.33% and how is it calculated?
As per my assumption, it is calculated like this: (5.1/25.7)-1=(-80.155%).
But that does not exactly match -80.33%. Also, what does the negative sign mean here?
The value is calculated as you noted above. You can see the actual code here.
Please note that the math is performed using bytes, but the size output is shown in "human-readable" mode, so it's rounded to the nearest unit, in my case MiB.
The meaning of the negative sign is: this backup reduced disk usage by "x%" compared to the size the backup would have had if it had copied all the files. In my case the incremental size was around 20% of the full size, so the backup used about 80% less disk by using hard links.
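As a rough check of that arithmetic, the Python sketch below applies the formula from the question to the sizes shown above. The `incremental_change_percent` function is a hypothetical helper, not Barman's code, and the bounds assume the displayed MiB values are rounded to one decimal place, so the true byte counts could be anywhere within ±0.05 MiB of what is shown.

```python
MIB = 1024 * 1024


def incremental_change_percent(incremental_bytes, full_bytes):
    """Percentage change in disk usage versus copying every file.

    Hypothetical helper implementing (incremental / full - 1) * 100.
    """
    if full_bytes <= 0:
        return 0.0
    return (incremental_bytes / full_bytes - 1) * 100


# Using the rounded values exactly as displayed by show-backup.
approx = incremental_change_percent(5.1 * MIB, 25.7 * MIB)

# Bounds if each displayed value hides up to 0.05 MiB of rounding.
lowest = incremental_change_percent(5.05 * MIB, 25.75 * MIB)
highest = incremental_change_percent(5.15 * MIB, 25.65 * MIB)

# First backup: everything was copied, so there is no saving.
first_backup = incremental_change_percent(22.2 * MIB, 22.2 * MIB)

print(f"from rounded sizes: {approx:.2f}%")
print(f"possible range:     {lowest:.2f}% to {highest:.2f}%")
print(f"first backup:       {first_backup:.2f}%")
```

Under those assumptions the reported -80.33% falls inside the possible range, which is consistent with the calculation being done on bytes rather than on the rounded MiB figures.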
Basically, for a full backup every night I call the command below:
barman backup --reuse-backup=off pg
And for an incremental backup every 6 hours I call the command below. As per the configuration, it should take an incremental backup by default.
barman backup pg
Could you please correct me if I am wrong?
Please note that you do not need to generate "full" backups with barman backup --reuse-backup=off.
As I mentioned earlier, the incremental backup in Barman is implemented at file-level by using hard links.
Assume you have no backups at all in your system, and you have backup_method = rsync and reuse_backup = link. The first backup that you take will find no base files, so it will copy all of them during the backup.
Later, when you run a new backup, it will identify all files that were already copied by the previous backup and have not changed in the meantime, and create hard links to them. Files that changed between the first and the second backup are copied anew.
Similarly, when you run barman delete to remove a backup, it will remove all the references to the files that it contains. Once a given file's reference count drops to 0, the file is removed from the filesystem.
I suggest you take a look at how hard links work in Linux. That might help you understand how Barman leverages that mechanism to provide file-level incremental backups through rsync.
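For a quick feel of the hard-link behaviour described above, here is a small Python sketch using only the standard library. The file names are made up for illustration; it only shows the filesystem mechanism, not anything Barman-specific.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Pretend this file belongs to the first backup.
    original = os.path.join(root, "backup1_file")
    # And this name belongs to the second backup.
    linked = os.path.join(root, "backup2_file")

    with open(original, "w") as f:
        f.write("unchanged data")

    # A hard link is a second name for the same inode; no data is copied.
    os.link(original, linked)
    links_after_link = os.stat(original).st_nlink
    same_inode = os.stat(original).st_ino == os.stat(linked).st_ino

    # Removing one name (like deleting the first backup) only drops a reference.
    os.remove(original)
    links_after_delete = os.stat(linked).st_nlink
    with open(linked) as f:
        surviving_content = f.read()

print(links_after_link, same_inode, links_after_delete, surviving_content)
```

The data stays on disk as long as at least one name still points at it, which is why deleting an older backup does not break a newer one that shares files with it.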
Another issue: if I use the streaming protocol instead of the barman WAL archiving command in archive_command in PostgreSQL, incremental backup does not work for me. In that case, it takes a full backup instead of an incremental one, even though backup_method is rsync and reuse_backup=link.
Do you mean you are not able to use streaming_archiver = on in your configuration file together with backup_method = rsync?
In any case, please note that the base backup and WAL archiving are different things in Barman. You should be able to have backups taken through rsync, with WALs being streamed from Postgres to Barman through pg_receivewal (streaming_archiver = on).
Could you please clarify why you think it's not working for you?
@wasiualhasib As mentioned in the other issue, I recommend sending these questions to the Barman Google group
Using the list-backup command, how do you identify which backup is full and which is incremental? Look at the list of database backups below. Both backups appear to be 26.5GB, but they are not: the actual size of backup ID 20240611T220902 is 26.5GB, and that of backup ID 20240611T221117 is 3MB. I think this needs a correction.
I found this out by going into the directory where the barman server lives, /data/barman/pg/base/, and running du -sch on my Linux machine; otherwise it is not possible to tell using the barman list-backup pg command.
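The difference between the two views can be sketched in Python: when several directories are measured together and each inode is counted only once, a backup that mostly consists of hard links to an earlier one appears small. The `unique_usage_per_backup` function and the directory names below are hypothetical, and the sketch sums apparent file sizes rather than allocated disk blocks, so it only approximates what a tool like du reports.

```python
import os
import tempfile


def unique_usage_per_backup(backup_dirs):
    """Return {directory: bytes}, charging each inode to the first directory seen.

    Hypothetical helper; directories should be passed oldest first.
    """
    seen = set()
    usage = {}
    for backup_dir in backup_dirs:
        total = 0
        for dirpath, _dirnames, filenames in os.walk(backup_dir):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # already counted under an earlier backup
                seen.add(key)
                total += st.st_size
        usage[backup_dir] = total
    return usage


with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "backup_old")
    new = os.path.join(root, "backup_new")
    os.makedirs(old)
    os.makedirs(new)

    with open(os.path.join(old, "table_a"), "wb") as f:
        f.write(b"a" * 1000)
    with open(os.path.join(old, "table_b"), "wb") as f:
        f.write(b"b" * 2000)

    # The newer backup shares table_a and stores a changed table_b.
    os.link(os.path.join(old, "table_a"), os.path.join(new, "table_a"))
    with open(os.path.join(new, "table_b"), "wb") as f:
        f.write(b"B" * 500)

    usage = unique_usage_per_backup([old, new])
    usage_old = usage[old]
    usage_new = usage[new]

print(usage_old, usage_new)
```

Here the newer directory holds 1500 bytes worth of files in total, yet only the 500 bytes of the changed file are attributed to it, which mirrors the gap between a total size and an incremental size.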
I have a few issues with this: