Closed skurfer closed 4 years ago
When you pack RelStorage, it collects the necessary information about references between objects, and stores that in database tables object_ref
and object_refs_added
. For objects that are removed because of the pack, the data in those tables is also removed, but for all objects that survive the pack, the data remains. This makes subsequent packs (substantially) faster because they don't have to rediscover references for objects that haven't changed.
In a history-free database, it's not unusual for the data needed here to rival the size of the objects themselves.
relname | pg_size_pretty |
---|---|
object_state | 13 GB |
object_ref | 4025 MB |
object_ref_pkey | 1516 MB |
object_refs_added | 1898 MB |
object_refs_added_pkey | 639 MB |
I would guess (but haven't checked) that history-preserving databases result in larger auxiliary tables, simply because they have more states to examine.
You can truncate those tables when the pack is done, at the cost have making the next pack take longer. If the database is largely objects that change and few unchanging objects, then saving the data doesn't help too much anyway.
I could see adding options to either zodbpack
or the RelStorage configuration to cause a pack operation to truncate those tables automatically.
We have a client on RelStorage 3.0.1 (backed by PostgreSQL 9.6). They were concerned about the size of their database. It seemed too large, even though they’re packing the data weekly. I’ve been looking into it and it seems like RelStorage is leaving a huge amount of unused data behind somewhere.
Data.fs
after conversionData.fs
after packing (0 days)Here’s an even stranger example (which re-enforces my thinking that something is going wrong in RelStorage):
Data.fs
after conversionI’m not too concerned about the doubling of data immediately after packing, but notice that converting a 10 GB database to file storage results in only 2.1 GB of data, so packing is removing the same data whether it’s done against file storage or RelStorage but we never see a corresponding drop in the size of the database. Also note that converting back to RelStorage doesn’t result in anything near 10 GB.
Is this a bug, or am I missing something? (I should mention that I haven’t done this dance with any other ZODBs and maybe you find this everywhere, but a 76% reduction by converting to file storage and back seems like a problem.)
Thanks.