EnterpriseDB / barman

Barman - Backup and Recovery Manager for PostgreSQL
https://www.pgbarman.org/
GNU General Public License v3.0

Out of memory error on deployment with millions of files in PGDATA #300

Closed MannerMan closed 2 months ago

MannerMan commented 4 years ago

Barman version: 2.11

PostgreSQL version: 9.5

Operating system/version: CentOS 7.8

Hi,

I'm evaluating barman (barman S3 cloud backup/archiving) as a replacement for WAL-E, which is the current backup system at the company I work for. We shard our tenants at the schema level: each of our database servers hosts 4000 customer schemas, spread over 16 databases. This has worked great for scalability, but it presents a challenge for many backup systems, since this layout results in a lot of files in the PostgreSQL data directory.

Output of ls -1RL /var/lib/pgsql/9.5/data/ | wc -l: 6602805
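For anyone who wants to reproduce a count like this without shelling out, here is a minimal Python sketch. The function name `count_entries` is my own, and unlike `ls -L` it does not follow symlinks (so tablespaces linked from pg_tblspc are not descended into); treat it as an approximation, not an exact equivalent of the command above.

```python
import os

def count_entries(root):
    """Count files and directories under root, one directory at a time.

    Only the stack of pending directories is kept in memory, never a
    full listing of all entries.
    """
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                for entry in it:
                    total += 1
                    # Assumption: do not follow symlinks, to avoid loops.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except (PermissionError, FileNotFoundError):
            # Entries can vanish or be unreadable on a live server.
            continue
    return total

if __name__ == "__main__":
    print(count_entries("/var/lib/pgsql/9.5/data/"))
```

The number will not match the `wc -l` figure exactly, because `ls -1R` also prints a header line and a blank separator line for every directory.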

So that's over 6 million files, with a total size of 77 GB. Around 5 years ago we deployed WAL-E since it was one of the few backup systems that could handle so many files without a problem. However, since WAL-E is no longer maintained, we're looking for alternatives. When testing barman-cloud-backup, I run out of memory when performing a full backup. Memory usage seems to keep growing indefinitely after the backup starts. See graphs:

[Image: Screenshot from 2020-09-03 13-10-54]
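If graphs like these are not available, the resident memory of the backup process can be sampled directly on Linux. The sketch below is only an illustration: the helper names `parse_vm_rss_kib` and `sample_rss` are made up, and it assumes the standard `VmRSS` line in `/proc/<pid>/status`.

```python
import re
import time

def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def sample_rss(pid, interval_s=5.0, samples=3):
    """Yield (timestamp, rss_kib) pairs for a running process (Linux only)."""
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as fh:
            yield time.time(), parse_vm_rss_kib(fh.read())
        time.sleep(interval_s)
```

Calling `sample_rss(pid)` with the PID of the running barman-cloud-backup process and printing each pair should be enough to tell whether memory grows steadily over the course of the backup.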

Server specs:

Backup command: barman-cloud-backup --endpoint-url http://svc-filestor.lab.int:9091 -j --immediate-checkpoint -J 4 s3://utv-db10/ utv-db10

Barman error:

2020-09-03 12:06:59,507 [113920] ERROR: Backup failed uploading data (MemoryError)
Segmentation fault

Kernel syslog error: Sep 3 12:06:59 utv-db10 kernel: barman-cloud-ba[113920]: segfault at 24 ip 00007f6f60b096ea sp 00007ffd9ca88820 error 6 in libpython2.7.so.1.0[7f6f60a7f000+17e000]
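To be clear, I have not confirmed where the memory goes, so the following is not a diagnosis of barman-cloud-backup. It is only a generic sketch of why holding one record per file in memory scales with the file count, while streaming the same records does not. The record layout and all names here (`fake_file_entries`, `peak_bytes`, `accumulate`, `stream`) are hypothetical.

```python
import tracemalloc

def fake_file_entries(n):
    """Yield hypothetical (path, size) records standing in for PGDATA files."""
    for i in range(n):
        yield ("base/16384/%d" % i, 8192)

def peak_bytes(consume, n):
    """Return the peak traced memory while consume() processes n entries."""
    tracemalloc.start()
    try:
        consume(fake_file_entries(n))
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

def accumulate(entries):
    # Keeps every record alive at once: memory grows with the file count.
    return len(list(entries))

def stream(entries):
    # Handles one record at a time: memory stays roughly flat.
    return sum(1 for _ in entries)

if __name__ == "__main__":
    for consume in (accumulate, stream):
        print(consume.__name__, peak_bytes(consume, 100000))
```

With 6.6 million entries, even a small per-record cost adds up, which is why a tool that builds the full file list before uploading could behave very differently from one that streams it.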

Target datastore is a local minio S3 instance.

So far we have tried pgbackrest and WAL-G with similar results.

martinmarques commented 2 months ago

@MannerMan I'm going over the issues that are still open in Barman's GitHub. I'm closing this one because it involves unsupported platforms (CentOS 7, PostgreSQL 9.5, Barman 2.11). I would encourage you to open a new issue if this is still relevant to newer versions of RHEL, Postgres and Barman, with fresh data. I see from the CNPG issues that you were considering moving to backrest, which might make this message irrelevant.