openfoodfacts / openfoodfacts-server


Understand why ZFS snapshots take so much space on products dataset #8740

Open alexgarel opened 1 year ago

alexgarel commented 1 year ago

We use a ZFS dataset to store products.

The products are written to disk as ".sto" files.

If we look at snapshots, each half-hourly one takes about 200M to 400M, while a daily snapshot takes about 4G and a monthly one 11G.

As ZFS uses copy-on-write (COW), we would expect a daily snapshot to take roughly 48x the size of a half-hourly snapshot (there are 48 half hours in a day) and a monthly snapshot to take 30x the size of a daily one.

That is absolutely not the case. This may mean that much more data than expected is changing on disk in a transient way. It is important to understand why.
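To make the gap concrete, here is a small back-of-the-envelope sketch. It only uses the figures quoted above, treats "M" and "G" as MiB and GiB (an assumption), and takes a 30-day month. The "naive" values assume that every shorter snapshot holds completely different blocks, which is the worst case for copy-on-write.

```python
# Figures quoted in this issue; "M"/"G" are assumed to mean MiB/GiB.
HALF_HOUR_MIB = (200, 400)   # observed size range of one half-hourly snapshot
OBSERVED_DAY_GIB = 4         # observed size of a daily snapshot
OBSERVED_MONTH_GIB = 11      # observed size of a monthly snapshot

SLOTS_PER_DAY = 48           # number of half-hour slots in a day
DAYS_PER_MONTH = 30          # assumption: a 30-day month


def naive_day_gib(half_hour_mib: float) -> float:
    """Daily size if every half-hourly snapshot held disjoint blocks."""
    return half_hour_mib * SLOTS_PER_DAY / 1024


low, high = (naive_day_gib(mib) for mib in HALF_HOUR_MIB)
naive_month_gib = OBSERVED_DAY_GIB * DAYS_PER_MONTH

print(f"naive day:   {low:.1f} to {high:.1f} GiB (observed: {OBSERVED_DAY_GIB} GiB)")
print(f"naive month: {naive_month_gib} GiB (observed: {OBSERVED_MONTH_GIB} GiB)")
```

The observed daily and monthly sizes are several times smaller than these naive sums, which is what one would see if the same blocks were rewritten many times: intermediate versions are held by the short-interval snapshots only.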

One candidate is changes.sto, which, due to the sto format, may change completely on each edit. But it is generally a small file (8k), and the number of edits per day (~3000) does not seem to explain this (or does it?).
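A quick estimate of the changes.sto contribution, as a rough sketch. It assumes that each edit rewrites one changes.sto file of about 8 KiB exactly once, and it ignores filesystem block granularity and metadata, so it is a lower bound rather than a measurement.

```python
EDITS_PER_DAY = 3000      # approximate figure quoted above
CHANGES_STO_KIB = 8       # typical changes.sto size quoted above
SLOTS_PER_DAY = 48        # number of half-hour slots in a day

# Data rewritten per day if each edit rewrites one 8 KiB changes.sto.
per_day_mib = EDITS_PER_DAY * CHANGES_STO_KIB / 1024
# Same volume spread evenly over the half-hourly snapshots.
per_half_hour_mib = per_day_mib / SLOTS_PER_DAY

print(f"changes.sto rewrites: {per_day_mib:.1f} MiB/day, "
      f"{per_half_hour_mib:.2f} MiB per half hour")
```

Under these assumptions changes.sto accounts for a few tens of MiB per day, far below the 200M to 400M seen per half-hourly snapshot.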

alexgarel commented 1 year ago

I attach a file of snapshot list to give an idea.

2023-07-off1-rpool-products-snapshot-list.txt

CharlesNepote commented 1 year ago
sudo nice find /rpool/off/products -mmin -30 -printf '%s %p\n' | tee test.txt

wc -l test.txt
1864 test.txt

cat test.txt | awk '{s+=$1} END {print s}'
21668105

This means that over 30 minutes, 1864 files were modified, for a total size of 21,668,105 bytes (about 21 MB). So 200 MB for a snapshot seems very strange...

Am I looking in the right place?
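For repeated measurements, the find/awk pipeline above can be approximated in Python. This is only a sketch: the function name recently_modified is made up for illustration, and unlike the find command it counts regular files only (find without -type f also lists directories).

```python
import os
import stat
import time


def recently_modified(root, minutes=30, now=None):
    """Count regular files under `root` modified in the last `minutes`.

    Returns (file_count, total_bytes), where total_bytes sums st_size.
    """
    now = time.time() if now is None else now
    cutoff = now - minutes * 60
    count = total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished between listing and stat
            if stat.S_ISREG(st.st_mode) and st.st_mtime > cutoff:
                count += 1
                total += st.st_size
    return count, total
```

Calling recently_modified("/rpool/off/products") would give the number of files and the byte total for the last half hour, which can then be compared with the size of the matching snapshot.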

teolemon commented 1 year ago

We get 4k to 5k edited products a day. So either there is a significant multiple of that number of edits on the same products, or something is off.

stephanegigandet commented 1 year ago

We periodically update all products (the last .sto is overwritten) with update_all_products.pl, but the last time I did it was more than one week ago.

CharlesNepote commented 1 year ago

Some files are modified outside of users' actions (a Perl script computing some things?). Example: https://world.openfoodfacts.org/product/9120073485809/knackige-griller. The following files related to this product have been modified without any action from the user:

/rpool/off/products/912/007/348/5809/10.sto
/rpool/off/products/912/007/348/5809/11.sto
/rpool/off/products/912/007/348/5809/12.sto
/rpool/off/products/912/007/348/5809/13.sto
/rpool/off/products/912/007/348/5809/6.sto
[...]/rpool/off/products/912/007/348/5809/7.sto
/rpool/off/products/912/007/348/5809/8.sto
/rpool/off/products/912/007/348/5809/9.sto
/rpool/off/products/912/007/348/5809/changes.sto

stat  /rpool/off/products/912/007/348/5809/10.sto
  File: /rpool/off/products/912/007/348/5809/10.sto
  Size: 9099        Blocks: 17         IO Block: 9216   regular file
Device: 2bh/43d Inode: 20390747    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/     off)   Gid: ( 1000/     off)
Access: 2023-07-26 17:08:15.264770659 +0200
Modify: 2023-07-26 17:08:15.264770659 +0200
Change: 2023-07-26 17:08:15.264770659 +0200
 Birth: -

Could it be related to atime (i.e. the time a file was last accessed, which is recorded on each access but can be disabled with the noatime mount option)? https://www.unixtutorial.org/zfs-performance-basics-disable-atime/

sudo zfs get all rpool | grep time
rpool  atime                 on                     default
rpool  relatime              off                    default

We should definitely disable atime for ZFS, shouldn't we?
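One way to tell an atime-only update from a real rewrite: reading a file moves its access time but leaves its modification time alone, whereas a write moves the modification time. In the stat output above, Access, Modify and Change are identical, which looks more like a write than a read. The sketch below (function name chosen for illustration) applies that check to any file.

```python
import os


def read_since_last_write(path):
    """True if the access time is later than the modification time.

    That pattern is what an atime-only update (a read) leaves behind;
    equal timestamps suggest the file was written or recreated instead.
    """
    st = os.stat(path)
    return st.st_atime_ns > st.st_mtime_ns
```

Running it over the files listed above would show whether they were merely read or actually rewritten.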

alexgarel commented 1 year ago

Oh yes, it might be related to atime.

@cquest is not really in favor of removing it… but if it is one of the culprits (the rewrite of changes.sto might also come into play), it might be a good idea to remove it. But I would not expect atime to consume that much data…

alexgarel commented 1 year ago

Result of zfs diff between two snapshots (zfs diff zfs-nvme/off/products@20230726-11{00,30}): 2023-07-26-zfs-diff-off-products-11h00-11h30.txt

The changes weigh 205M.

On 1411 lines, there are:
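A breakdown of this kind can be tallied from the diff file with a short script. The sketch below assumes the default zfs diff output, where each line starts with a change type (M, +, - or R) followed by a path. The function name tally_zfs_diff and the grouping of numbered revision files under "<rev>.sto" are illustrative choices.

```python
from collections import Counter
from pathlib import PurePosixPath


def tally_zfs_diff(lines):
    """Count zfs diff lines by change type and by kind of .sto file."""
    by_type = Counter()
    by_name = Counter()
    for line in lines:
        parts = line.rstrip("\n").split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        change, path = parts
        # Renames are printed as "old -> new"; keep the old path.
        path = path.split(" -> ")[0]
        name = PurePosixPath(path).name
        if not name.endswith(".sto"):
            key = "(other)"            # directories and non-.sto files
        elif name[:-4].isdigit():
            key = "<rev>.sto"          # numbered revisions: 6.sto, 10.sto...
        else:
            key = name                 # e.g. changes.sto
        by_type[change] += 1
        by_name[key] += 1
    return by_type, by_name
```

It can be fed the attached file directly, for example with open("2023-07-26-zfs-diff-off-products-11h00-11h30.txt") as f: tally_zfs_diff(f).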

alexgarel commented 1 year ago

The user removal procedure also plays a role in changing sto files.


alexgarel commented 4 months ago

I think we can close it: the way Product Opener changes files makes the snapshots big.

We can adjust the number of snapshots we keep on the NVMe to avoid taking too much space, and rely on snapshots for the rest.