zodb / relstorage

A backend for ZODB that stores pickles in a relational database.

Large PostgreSQL database with blobs #500

Open tflorac opened 9 months ago

tflorac commented 9 months ago

Hi, I've built a Pyramid "file management" application using RelStorage with a PostgreSQL back end. The ZODB currently stores 2 million files (more than 2 TB of storage), held as ZODB blobs in "shared" mode (which is no longer recommended), using NFS to share the storage between clients (which also use a local NFS cache). I'm thinking about switching to a new environment using native PostgreSQL blobs, with streaming replication to several read-only servers (which RelStorage handles natively), but:

Best regards, Thierry

jamadden commented 9 months ago

I can only speak to my own experience and the RelStorage code.

A previous company I worked for had a similarly sized ZODB deployment with tons of blobs. We never used shared blobs because managing a separate highly-available NFS deployment was another layer of complication we didn't want to deal with. Native PG blobs and the local blob cache were plenty performant for our uses.
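For readers who want to see what the non-shared setup looks like, here is a minimal sketch that renders a ZConfig snippet for RelStorage with blobs stored in PostgreSQL and a size-limited local blob cache. The option names (`blob-dir`, `shared-blob-dir`, `blob-cache-size`) are RelStorage options; the helper function, the DSN, the cache path, and the cache size are hypothetical placeholders, not values taken from this thread.

```python
def render_relstorage_config(dsn, blob_cache_dir, blob_cache_size="10gb"):
    """Return a ZConfig snippet for RelStorage with non-shared blobs.

    With ``shared-blob-dir false`` the blobs live in the database and
    ``blob-dir`` is only a local cache, bounded by ``blob-cache-size``.
    All argument values are illustrative assumptions.
    """
    lines = [
        "%import relstorage",
        "<zodb main>",
        "  <relstorage>",
        f"    blob-dir {blob_cache_dir}",
        "    shared-blob-dir false",
        f"    blob-cache-size {blob_cache_size}",
        "    <postgresql>",
        f"      dsn {dsn}",
        "    </postgresql>",
        "  </relstorage>",
        "</zodb>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical DSN and cache directory, for illustration only.
    print(render_relstorage_config(
        "dbname='zodb' host='db.example.org'",
        "/var/cache/zodb-blobs",
    ))
```

Each client would point `blob-dir` at its own local directory, so no NFS share is involved.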

If you have any concurrent write activity at all, shared blobs absolutely kill RelStorage/PG performance by essentially eliminating concurrent commits.

I never dealt with trying to back up a large shared blob deployment, so I have no recommendations (other than "don't use shared blobs" 😄)

tflorac commented 9 months ago

Thanks for your reply! Backing up shared blobs is not a big problem, as it's just a basic filesystem backup (even if you have to keep that backup synchronized with your database). The problem is making an online backup of a very large PostgreSQL database without running pg_dump, which would require double the storage space...

mamico commented 7 months ago

> The problem is to make an online backup of a very large PostgreSQL database, without doing a pg_dump which would require double storage space...

@tflorac My 5 cents: you should probably look for a system-level backup solution (raw storage backup) rather than a logical one (SQL dump). In the OSS world, one option might be https://github.com/pgbackrest/pgbackrest.
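As a rough illustration of what driving such a tool could look like, the sketch below builds a pgBackRest backup command line. The `--stanza` and `--type` options and the `backup` command are part of the pgBackRest CLI; the helper function and the stanza name `main` are assumptions for this example, and a real deployment also needs a configured repository and WAL archiving, which are not shown.

```python
VALID_TYPES = ("full", "diff", "incr")


def pgbackrest_backup_cmd(stanza, backup_type="full"):
    """Build the argument list for a pgBackRest backup run.

    ``stanza`` names the configured PostgreSQL cluster; ``backup_type``
    selects a full, differential, or incremental backup.
    """
    if backup_type not in VALID_TYPES:
        raise ValueError(f"unknown backup type: {backup_type!r}")
    return ["pgbackrest", f"--stanza={stanza}", f"--type={backup_type}", "backup"]


if __name__ == "__main__":
    # Hypothetical stanza name; pass the list to subprocess.run(cmd, check=True)
    # on a host where pgBackRest is installed and configured.
    print(" ".join(pgbackrest_backup_cmd("main", "incr")))
```

Incremental and differential runs only copy files changed since an earlier backup, which is one way a physical backup tool can avoid repeating a full copy of the database on every run.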