Open kevinbarbour opened 4 years ago
I agree that this is unfortunate behaviour. We shall investigate.
Thanks for the thorough bug report @kevinbarbour!
I traced this down to the call to config.Config._populate_servers which is called by config.Config.server_names which is called by cli.get_server_list, itself called by cli.check.
The specific line which causes stat to be called on all backup directories is the call to config.Config._check_conflicting_paths.
The job of _check_conflicting_paths is to check that no two directories in the configuration point to the same place on disk, across all servers in the barman configuration. Any directories in the configuration which resolve to the same place on disk are considered unsafe and would likely render backups taken for both servers unusable, so this is an important check.
Taking a closer look at the implementation, the reason this has to touch the filesystem is the call to os.path.realpath, which de-references symlinks to find their actual location on the filesystem. This is an important part of the conflicting paths check because if we didn't do this then the check could pass even though different servers have directories which are configured with different symlinks which resolve to the same location on disk.
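To illustrate why resolving symlinks matters here, the following is a minimal sketch of a realpath-based conflict check. It is not barman's actual implementation: the function name find_conflicting_paths, the server names and the directory layout are all hypothetical, and the only real call relied upon is os.path.realpath. It shows two servers configured with different symlinks that resolve to the same directory, which a plain string comparison of the configured paths would not catch.

```python
import os
import tempfile
from collections import defaultdict


def find_conflicting_paths(server_dirs):
    """Return {real_path: [server, ...]} for every path shared by 2+ servers.

    server_dirs maps a (hypothetical) server name to its configured directory.
    """
    by_real_path = defaultdict(list)
    for server, path in server_dirs.items():
        # realpath resolves symlinks, so it has to consult the filesystem
        # for every configured directory, including those of other servers.
        by_real_path[os.path.realpath(path)].append(server)
    return {p: names for p, names in by_real_path.items() if len(names) > 1}


def demo():
    """Build a throwaway layout where two symlinks point at one directory."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "backups")
        other = os.path.join(tmp, "other")
        os.mkdir(target)
        os.mkdir(other)

        link_a = os.path.join(tmp, "link_a")
        link_b = os.path.join(tmp, "link_b")
        os.symlink(target, link_a)
        os.symlink(target, link_b)

        conflicts = find_conflicting_paths(
            {"server_a": link_a, "server_b": link_b, "server_c": other}
        )
        # Return only the groups of server names; the temp paths vary per run.
        return sorted(conflicts.values())


if __name__ == "__main__":
    print(demo())
```

In this sketch the configured paths link_a and link_b differ as strings, yet server_a and server_b are reported as conflicting because both resolve to the same directory. The cost is that every configured path is looked up on disk, which is the filesystem access described above.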
So, the reason seems sensible enough; however, it is clearly resulting in less-than-optimal performance in your environment. I don't yet have any good ideas for improving things but will give it some thought and discuss with the team. Any ideas you have here are also welcome.
I am backing up about 50 postgres databases via barman, all on the same host. A lot of these backups are configured with a backup_directory located on an NFS share that can sometimes be quite slow to respond. A few of our more latency-sensitive backups (the databases with much more rapid change rates/WAL generation) are running with the backup_directory configured to local NVME and not touching the NFS shares at all. I've noticed an unfortunate behavior with the barman CLI commands where if you perform commands on a single server it seems to unnecessarily read every backup location configured on the server. In our configuration this means that when you run barman check on a server configured on local NVME it often times out because barman tries to stat the NFS-hosted directories of the 40+ other configured servers.
Can be reproduced with configuration similar to the following:
Configuration excerpts: /etc/barman.d/nvme_backup.conf
/etc/barman.d/nfs_backup.conf
With the above configuration if I run
strace barman check nvme_db
I see the following in the strace output
I am not sure if this is intentional behavior, but it seems very odd that barman is performing some sort of checks on the backup directory configured for nfs_db when I am only checking nvme_db.