ostreedev / ostree

Operating system and container binary deployment and upgrades
https://ostreedev.github.io/ostree/

possible memory overuse when pruning #2483

Open dustymabe opened 2 years ago

dustymabe commented 2 years ago

Discussed this with @jlebon and we decided to open an issue here to further investigate.

I'm working on pruning our large Fedora OSTree repositories. I'm currently doing test runs and have noticed that the prune commands can use a large amount of memory. For example, in the worst case I've seen almost 15 GiB in use:

Mon Nov 15 12:24:34 PM UTC 2021
    PID   RSS COMMAND
   2795  5100 /usr/bin/python3 ./fedora-ostree-pruner --test
 552046 15482068 ostree prune --repo /mnt/fedora_koji_prod/koji/ostree/repo --only-branch fedora/35/ppc64le/testing/kinoite --refs-only --keep-younger-than=90 days ago --no-prune
$ python3 -c 'print(15482068/1024/1024)'
14.764850616455078

The repo has 18602497 total objects as reported by the --no-prune output:

2021-11-15 10:40:22,806 INFO fedora-ostree-pruner - Running command: ['ostree', 'prune', '--repo', '/mnt/fedora_koji_prod/koji/ostree/repo', '--only-branch', 'fedora/35/ppc64le/testing/kinoite', '--refs-only', '--keep-younger-than=90 days ago', '--no-prune']
Total objects: 18602497
Would delete: 1536375 objects, freeing 205.3 GB

The repo also has 20231 commit objects, which might be relevant (found by running find objects/ -iname '*.commit' | wc -l).
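As a rough back-of-the-envelope check, the peak RSS above can be divided by the object count reported by --no-prune. This is only a sketch: it assumes all of the resident memory is per-object bookkeeping, which the output above does not establish.

```python
# Figures copied from the ps and --no-prune output above.
rss_kib = 15482068        # peak RSS as reported by ps, in KiB
total_objects = 18602497  # "Total objects" from ostree prune --no-prune

# Assumption: treat all resident memory as per-object tracking overhead.
bytes_per_object = rss_kib * 1024 / total_objects
print(f"{bytes_per_object:.0f} bytes per object")
```

Under that assumption the process is holding on the order of 850 bytes per object, far more than the 32 bytes of a raw SHA-256 checksum.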

Is this much memory usage expected, or could this be overuse or a memory leak?

dbnicholson commented 2 years ago

Oops, misread that and thought you were talking about deltas. That does seem like an excessive amount of memory. Possibly the object tracking is inefficient, or possibly it's leaking. I'm pretty sure the tracking is just a few hash tables, but that's a lot of objects. You could run it under valgrind or similar to see if anything comes up.

jlebon commented 2 years ago

You could run it under valgrind or similar to see if anything comes up.

Yeah, was playing with that yesterday on a (much smaller) local repo. Valgrind found no memory leaks. Putting it through massif, peak memory usage was 17M (12.7M useful and 4.3M extra), which is still suspiciously high for the number of objects (22091). The breakdown is:

74.40% (12,695,681B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->65.70% (11,210,943B) 0x5070788: g_malloc (gmem.c:106)
| ->46.58% (7,948,576B) 0x50881B4: g_slice_alloc (gslice.c:1072)
| | ->18.61% (3,174,816B) 0x50AE982: UnknownInlinedFun (gvariant-core.c:486)
| | | ->18.61% (3,174,816B) 0x50AE982: UnknownInlinedFun (gvariant-core.c:624)
| | |   ->18.61% (3,174,816B) 0x50AE982: g_variant_builder_end (gvariant.c:3725)
| | |     ->18.61% (3,174,816B) 0x50AF053: g_variant_valist_new (gvariant.c:5236)
| | |       ->18.61% (3,174,816B) 0x50AF621: g_variant_new_va (gvariant.c:5409)
| | |         ->18.61% (3,174,816B) 0x50AF773: g_variant_new (gvariant.c:5344)
| | |           ->12.39% (2,114,448B) 0x487161B: ostree_object_name_serialize (ostree-core.c:1347)
| | |           | ->06.21% (1,060,368B) 0x487ABD1: list_loose_objects_at (ostree-repo.c:3941)
| | |           | | ->06.21% (1,060,368B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| | |           | |   ->06.21% (1,060,368B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| | |           | |     ->06.21% (1,060,368B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| | |           | |       ->06.21% (1,060,368B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | |           | |         ->06.21% (1,060,368B) 0x41A6E4: ostree_run (ot-main.c:226)
| | |           | |           ->06.21% (1,060,368B) 0x40D40A: main (main.c:143)
| | |           | |
| | |           | ->05.32% (907,344B) 0x48AA84C: traverse_iter (ostree-repo-traverse.c:468)
| | |           | | ->05.32% (907,344B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | | ->05.32% (907,344B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   ->05.31% (906,816B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   | ->05.31% (906,816B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   ->05.31% (906,720B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   |   | ->05.31% (906,720B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   |   ->04.80% (819,600B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   |   |   | ->04.80% (819,600B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   |   |   ->04.45% (759,744B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   |   |   |   | ->04.45% (759,744B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   |   |   |   ->02.38% (406,560B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   |   |   |   |   | ->02.38% (406,560B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   |   |   |   |   ->01.41% (240,816B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   |   |   |   |   |   | ->01.41% (240,816B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   |   |   |   |   |   ->01.28% (219,024B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | |           | | |   |   |   |   |   |   |   | ->01.28% (219,024B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | |           | | |   |   |   |   |   |   |   |   ->01.28% (219,024B) in 2 places, all below massif's threshold (1.00%)
| | |           | | |   |   |   |   |   |   |   |
| | |           | | |   |   |   |   |   |   |   ->00.13% (21,792B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | | |   |   |   |   |   |   |
| | |           | | |   |   |   |   |   |   ->00.97% (165,744B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | | |   |   |   |   |   |
| | |           | | |   |   |   |   |   ->02.07% (353,184B) 0x48AAC7C: ostree_repo_traverse_commit_union_with_parents (ostree-repo-traverse.c:616)
| | |           | | |   |   |   |   |     ->02.07% (353,184B) 0x48AAD5A: ostree_repo_traverse_commit_union (ostree-repo-traverse.c:660)
| | |           | | |   |   |   |   |       ->02.07% (353,184B) 0x48A5DC9: ostree_repo_prune (ostree-repo-prune.c:440)
| | |           | | |   |   |   |   |         ->02.07% (353,184B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | |           | | |   |   |   |   |           ->02.07% (353,184B) 0x41A6E4: ostree_run (ot-main.c:226)
| | |           | | |   |   |   |   |             ->02.07% (353,184B) 0x40D40A: main (main.c:143)
| | |           | | |   |   |   |   |
| | |           | | |   |   |   |   ->00.35% (59,856B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | | |   |   |   |
| | |           | | |   |   |   ->00.51% (87,120B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | | |   |   |
| | |           | | |   |   ->00.00% (96B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | | |   |
| | |           | | |   ->00.00% (528B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | | |
| | |           | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           | |
| | |           | ->00.86% (146,736B) in 1+ places, all below ms_print's threshold (01.00%)
| | |           |
| | |           ->06.21% (1,060,368B) 0x487ABFC: list_loose_objects_at (ostree-repo.c:3942)
| | |             ->06.21% (1,060,368B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| | |               ->06.21% (1,060,368B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| | |                 ->06.21% (1,060,368B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| | |                   ->06.21% (1,060,368B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | |                     ->06.21% (1,060,368B) 0x41A6E4: ostree_run (ot-main.c:226)
| | |                       ->06.21% (1,060,368B) 0x40D40A: main (main.c:143)
| | |
| | ->15.51% (2,645,880B) 0x5036DD9: g_bytes_new_with_free_func (gbytes.c:186)
| | | ->10.33% (1,762,040B) 0x50B20FA: UnknownInlinedFun (gvariant-core.c:460)
| | | | ->10.33% (1,762,040B) 0x50B20FA: g_variant_ensure_serialised (gvariant-core.c:445)
| | | |   ->10.33% (1,762,040B) 0x50B2142: g_variant_get_data (gvariant-core.c:933)
| | | |     ->10.33% (1,762,040B) 0x50B3879: g_variant_get (gvariant.c:5454)
| | | |       ->10.33% (1,762,040B) 0x487166E: ostree_object_name_deserialize (ostree-core.c:1365)
| | | |         ->10.33% (1,762,040B) 0x487168E: ostree_hash_object_name (ostree-core.c:1315)
| | | |           ->05.18% (883,640B) 0x5056B9D: UnknownInlinedFun (ghash.c:472)
| | | |           | ->05.18% (883,640B) 0x5056B9D: UnknownInlinedFun (ghash.c:1598)
| | | |           |   ->05.18% (883,640B) 0x5056B9D: g_hash_table_replace (ghash.c:1657)
| | | |           |     ->05.18% (883,640B) 0x487AC1F: list_loose_objects_at (ostree-repo.c:3945)
| | | |           |       ->05.18% (883,640B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| | | |           |         ->05.18% (883,640B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| | | |           |           ->05.18% (883,640B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| | | |           |             ->05.18% (883,640B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | | |           |               ->05.18% (883,640B) 0x41A6E4: ostree_run (ot-main.c:226)
| | | |           |                 ->05.18% (883,640B) 0x40D40A: main (main.c:143)
| | | |           |
| | | |           ->04.44% (757,440B) 0x5057197: UnknownInlinedFun (ghash.c:472)
| | | |           | ->04.44% (757,440B) 0x5057197: UnknownInlinedFun (ghash.c:1598)
| | | |           |   ->04.44% (757,440B) 0x5057197: g_hash_table_add (ghash.c:1689)
| | | |           |     ->04.43% (756,120B) 0x48AA870: traverse_iter (ostree-repo-traverse.c:470)
| | | |           |     | ->04.43% (756,120B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | | ->04.43% (756,120B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   ->04.43% (755,680B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   | ->04.43% (755,680B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   ->04.43% (755,600B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   |   | ->04.43% (755,600B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   |   ->04.00% (683,000B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   |   |   | ->04.00% (683,000B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   |   |   ->03.71% (633,120B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   |   |   |   | ->03.71% (633,120B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   |   |   |   ->01.99% (338,800B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   |   |   |   |   | ->01.99% (338,800B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   |   |   |   |   ->01.18% (200,680B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   |   |   |   |   |   | ->01.18% (200,680B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   |   |   |   |   |   ->01.07% (182,520B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| | | |           |     | |   |   |   |   |   |   |   | ->01.07% (182,520B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| | | |           |     | |   |   |   |   |   |   |   |   ->01.07% (182,520B) in 2 places, all below massif's threshold (1.00%)
| | | |           |     | |   |   |   |   |   |   |   |
| | | |           |     | |   |   |   |   |   |   |   ->00.11% (18,160B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     | |   |   |   |   |   |   |
| | | |           |     | |   |   |   |   |   |   ->00.81% (138,120B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     | |   |   |   |   |   |
| | | |           |     | |   |   |   |   |   ->01.72% (294,320B) 0x48AAC7C: ostree_repo_traverse_commit_union_with_parents (ostree-repo-traverse.c:616)
| | | |           |     | |   |   |   |   |     ->01.72% (294,320B) 0x48AAD5A: ostree_repo_traverse_commit_union (ostree-repo-traverse.c:660)
| | | |           |     | |   |   |   |   |       ->01.72% (294,320B) 0x48A5DC9: ostree_repo_prune (ostree-repo-prune.c:440)
| | | |           |     | |   |   |   |   |         ->01.72% (294,320B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | | |           |     | |   |   |   |   |           ->01.72% (294,320B) 0x41A6E4: ostree_run (ot-main.c:226)
| | | |           |     | |   |   |   |   |             ->01.72% (294,320B) 0x40D40A: main (main.c:143)
| | | |           |     | |   |   |   |   |
| | | |           |     | |   |   |   |   ->00.29% (49,880B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     | |   |   |   |
| | | |           |     | |   |   |   ->00.43% (72,600B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     | |   |   |
| | | |           |     | |   |   ->00.00% (80B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     | |   |
| | | |           |     | |   ->00.00% (440B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     | |
| | | |           |     | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |     |
| | | |           |     ->00.01% (1,320B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |           |
| | | |           ->00.71% (120,960B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |
| | | ->05.18% (883,640B) 0x50A507E: UnknownInlinedFun (gvariant.c:325)
| | | | ->05.18% (883,640B) 0x50A507E: g_variant_new_boolean (gvariant.c:347)
| | | |   ->05.18% (883,640B) 0x50AF02C: g_variant_valist_new (gvariant.c:5233)
| | | |     ->05.18% (883,640B) 0x50AF621: g_variant_new_va (gvariant.c:5409)
| | | |       ->05.18% (883,640B) 0x50AF773: g_variant_new (gvariant.c:5344)
| | | |         ->05.18% (883,640B) 0x487ABFC: list_loose_objects_at (ostree-repo.c:3942)
| | | |           ->05.18% (883,640B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| | | |             ->05.18% (883,640B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| | | |               ->05.18% (883,640B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| | | |                 ->05.18% (883,640B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | | |                   ->05.18% (883,640B) 0x41A6E4: ostree_run (ot-main.c:226)
| | | |                     ->05.18% (883,640B) 0x40D40A: main (main.c:143)
| | | |
| | | ->00.00% (200B) in 1+ places, all below ms_print's threshold (01.00%)
| | |
| | ->06.22% (1,060,608B) 0x50B1731: UnknownInlinedFun (gvariant-core.c:486)
| | | ->06.22% (1,060,608B) 0x50B1731: g_variant_new_from_bytes (gvariant-core.c:529)
| | |   ->06.21% (1,060,368B) 0x50A5095: UnknownInlinedFun (gvariant.c:326)
| | |   | ->06.21% (1,060,368B) 0x50A5095: g_variant_new_boolean (gvariant.c:347)
| | |   |   ->06.21% (1,060,368B) 0x50AF02C: g_variant_valist_new (gvariant.c:5233)
| | |   |     ->06.21% (1,060,368B) 0x50AF621: g_variant_new_va (gvariant.c:5409)
| | |   |       ->06.21% (1,060,368B) 0x50AF773: g_variant_new (gvariant.c:5344)
| | |   |         ->06.21% (1,060,368B) 0x487ABFC: list_loose_objects_at (ostree-repo.c:3942)
| | |   |           ->06.21% (1,060,368B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| | |   |             ->06.21% (1,060,368B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| | |   |               ->06.21% (1,060,368B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| | |   |                 ->06.21% (1,060,368B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | |   |                   ->06.21% (1,060,368B) 0x41A6E4: ostree_run (ot-main.c:226)
| | |   |                     ->06.21% (1,060,368B) 0x40D40A: main (main.c:143)
| | |   |
| | |   ->00.00% (240B) in 1+ places, all below ms_print's threshold (01.00%)
| | |
| | ->06.21% (1,060,368B) 0x50A5658: UnknownInlinedFun (gvariant-core.c:486)
| | | ->06.21% (1,060,368B) 0x50A5658: UnknownInlinedFun (gvariant-core.c:624)
| | |   ->06.21% (1,060,368B) 0x50A5658: g_variant_new_strv (gvariant.c:1577)
| | |     ->06.21% (1,060,368B) 0x487ABE3: list_loose_objects_at (ostree-repo.c:3942)
| | |       ->06.21% (1,060,368B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| | |         ->06.21% (1,060,368B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| | |           ->06.21% (1,060,368B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| | |             ->06.21% (1,060,368B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| | |               ->06.21% (1,060,368B) 0x41A6E4: ostree_run (ot-main.c:226)
| | |                 ->06.21% (1,060,368B) 0x40D40A: main (main.c:143)
| | |
| | ->00.04% (6,904B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->18.85% (3,215,723B) 0x50B20D8: UnknownInlinedFun (gvariant-core.c:455)
| | ->18.85% (3,215,723B) 0x50B20D8: g_variant_ensure_serialised (gvariant-core.c:445)
| |   ->18.85% (3,215,723B) 0x50B2142: g_variant_get_data (gvariant-core.c:933)
| |     ->18.85% (3,215,723B) 0x50B3879: g_variant_get (gvariant.c:5454)
| |       ->18.85% (3,215,723B) 0x487166E: ostree_object_name_deserialize (ostree-core.c:1365)
| |         ->18.85% (3,215,723B) 0x487168E: ostree_hash_object_name (ostree-core.c:1315)
| |           ->09.45% (1,612,643B) 0x5056B9D: UnknownInlinedFun (ghash.c:472)
| |           | ->09.45% (1,612,643B) 0x5056B9D: UnknownInlinedFun (ghash.c:1598)
| |           |   ->09.45% (1,612,643B) 0x5056B9D: g_hash_table_replace (ghash.c:1657)
| |           |     ->09.45% (1,612,643B) 0x487AC1F: list_loose_objects_at (ostree-repo.c:3945)
| |           |       ->09.45% (1,612,643B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| |           |         ->09.45% (1,612,643B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| |           |           ->09.45% (1,612,643B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| |           |             ->09.45% (1,612,643B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |           |               ->09.45% (1,612,643B) 0x41A6E4: ostree_run (ot-main.c:226)
| |           |                 ->09.45% (1,612,643B) 0x40D40A: main (main.c:143)
| |           |
| |           ->08.10% (1,382,328B) 0x5057197: UnknownInlinedFun (ghash.c:472)
| |           | ->08.10% (1,382,328B) 0x5057197: UnknownInlinedFun (ghash.c:1598)
| |           |   ->08.10% (1,382,328B) 0x5057197: g_hash_table_add (ghash.c:1689)
| |           |     ->08.09% (1,379,919B) 0x48AA870: traverse_iter (ostree-repo-traverse.c:470)
| |           |     | ->08.09% (1,379,919B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | | ->08.09% (1,379,919B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   ->08.08% (1,379,116B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   | ->08.08% (1,379,116B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   ->08.08% (1,378,970B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   | ->08.08% (1,378,970B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   ->07.30% (1,246,475B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   |   | ->07.30% (1,246,475B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   |   ->06.77% (1,155,444B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   |   |   | ->06.77% (1,155,444B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   |   |   ->03.62% (618,310B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   |   |   |   | ->03.62% (618,310B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   |   |   |   ->02.15% (366,241B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   |   |   |   |   | ->02.15% (366,241B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   |   |   |   |   ->01.95% (333,099B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   |   |   |   |   |   | ->01.95% (333,099B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   |   |   |   |   |   ->01.40% (238,856B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | |   |   |   |   |   |   |   |   | ->01.40% (238,856B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     | |   |   |   |   |   |   |   |   |   ->01.05% (179,215B) 0x48AAC7C: ostree_repo_traverse_commit_union_with_parents (ostree-repo-traverse.c:616)
| |           |     | |   |   |   |   |   |   |   |   |   | ->01.05% (179,215B) 0x48AAD5A: ostree_repo_traverse_commit_union (ostree-repo-traverse.c:660)
| |           |     | |   |   |   |   |   |   |   |   |   |   ->01.05% (179,215B) 0x48A5DC9: ostree_repo_prune (ostree-repo-prune.c:440)
| |           |     | |   |   |   |   |   |   |   |   |   |     ->01.05% (179,215B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |           |     | |   |   |   |   |   |   |   |   |   |
| |           |     | |   |   |   |   |   |   |   |   |   ->00.35% (59,641B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |   |   |   |   |   |   |   |   |
| |           |     | |   |   |   |   |   |   |   |   ->00.55% (94,243B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |   |   |   |   |   |   |   |
| |           |     | |   |   |   |   |   |   |   ->00.19% (33,142B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |   |   |   |   |   |   |
| |           |     | |   |   |   |   |   |   ->01.48% (252,069B) 0x48AAC7C: ostree_repo_traverse_commit_union_with_parents (ostree-repo-traverse.c:616)
| |           |     | |   |   |   |   |   |     ->01.48% (252,069B) 0x48AAD5A: ostree_repo_traverse_commit_union (ostree-repo-traverse.c:660)
| |           |     | |   |   |   |   |   |       ->01.48% (252,069B) 0x48A5DC9: ostree_repo_prune (ostree-repo-prune.c:440)
| |           |     | |   |   |   |   |   |         ->01.48% (252,069B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |           |     | |   |   |   |   |   |           ->01.48% (252,069B) 0x41A6E4: ostree_run (ot-main.c:226)
| |           |     | |   |   |   |   |   |             ->01.48% (252,069B) 0x40D40A: main (main.c:143)
| |           |     | |   |   |   |   |   |
| |           |     | |   |   |   |   |   ->03.15% (537,134B) 0x48AAC7C: ostree_repo_traverse_commit_union_with_parents (ostree-repo-traverse.c:616)
| |           |     | |   |   |   |   |     ->03.15% (537,134B) 0x48AAD5A: ostree_repo_traverse_commit_union (ostree-repo-traverse.c:660)
| |           |     | |   |   |   |   |       ->03.15% (537,134B) 0x48A5DC9: ostree_repo_prune (ostree-repo-prune.c:440)
| |           |     | |   |   |   |   |         ->03.15% (537,134B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |           |     | |   |   |   |   |           ->03.15% (537,134B) 0x41A6E4: ostree_run (ot-main.c:226)
| |           |     | |   |   |   |   |             ->03.15% (537,134B) 0x40D40A: main (main.c:143)
| |           |     | |   |   |   |   |
| |           |     | |   |   |   |   ->00.53% (91,031B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |   |   |   |
| |           |     | |   |   |   ->00.78% (132,495B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |   |   |
| |           |     | |   |   ->00.00% (146B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |   |
| |           |     | |   ->00.00% (803B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     | |
| |           |     | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     |
| |           |     ->00.01% (2,409B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |
| |           ->01.29% (220,022B) 0x504C2D9: UnknownInlinedFun (ghash.c:472)
| |           | ->01.29% (220,022B) 0x504C2D9: g_hash_table_lookup (ghash.c:1511)
| |           |   ->01.29% (220,022B) 0x48AA9E5: traverse_iter (ostree-repo-traverse.c:489)
| |           |     ->01.29% (219,292B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     | ->01.29% (219,292B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     |   ->01.28% (218,270B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     |   | ->01.28% (218,270B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     |   |   ->01.27% (216,080B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     |   |   | ->01.27% (216,080B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     |   |   |   ->01.17% (199,655B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |           |     |   |   |   | ->01.17% (199,655B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |           |     |   |   |   |   ->01.17% (199,655B) in 2 places, all below massif's threshold (1.00%)
| |           |     |   |   |   |
| |           |     |   |   |   ->00.10% (16,425B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     |   |   |
| |           |     |   |   ->00.01% (2,190B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     |   |
| |           |     |   ->00.01% (1,022B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |     |
| |           |     ->00.00% (730B) in 1+ places, all below ms_print's threshold (01.00%)
| |           |
| |           ->00.00% (730B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->00.27% (46,644B) in 1+ places, all below ms_print's threshold (01.00%)
|
->08.44% (1,440,904B) 0x5070DEF: g_realloc (gmem.c:171)
| ->03.09% (526,592B) 0x5052CDC: UnknownInlinedFun (ghash.c:380)
| | ->03.09% (526,592B) 0x5052CDC: realloc_arrays (ghash.c:722)
| |   ->03.09% (526,592B) 0x50536C5: g_hash_table_resize (ghash.c:875)
| |   | ->03.09% (526,592B) 0x50569BB: UnknownInlinedFun (ghash.c:915)
| |   |   ->03.09% (526,592B) 0x50569BB: g_hash_table_insert_node (ghash.c:1341)
| |   |     ->01.54% (262,144B) 0x5056C82: UnknownInlinedFun (ghash.c:1600)
| |   |     | ->01.54% (262,144B) 0x5056C82: g_hash_table_replace (ghash.c:1657)
| |   |     |   ->01.54% (262,144B) 0x487AC1F: list_loose_objects_at (ostree-repo.c:3945)
| |   |     |     ->01.54% (262,144B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| |   |     |       ->01.54% (262,144B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| |   |     |         ->01.54% (262,144B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| |   |     |           ->01.54% (262,144B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |   |     |             ->01.54% (262,144B) 0x41A6E4: ostree_run (ot-main.c:226)
| |   |     |               ->01.54% (262,144B) 0x40D40A: main (main.c:143)
| |   |     |
| |   |     ->01.54% (262,144B) 0x505726F: UnknownInlinedFun (ghash.c:1600)
| |   |     | ->01.54% (262,144B) 0x505726F: g_hash_table_add (ghash.c:1689)
| |   |     |   ->01.54% (262,144B) 0x48AA870: traverse_iter (ostree-repo-traverse.c:470)
| |   |     |   | ->01.54% (262,144B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |   |     |   |   ->01.54% (262,144B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |   |     |   |     ->01.54% (262,144B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |   |     |   |     | ->01.54% (262,144B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |   |     |   |     |   ->01.54% (262,144B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |   |     |   |     |     ->01.54% (262,144B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |   |     |   |     |       ->01.54% (262,144B) 0x48AA73C: traverse_dirtree (ostree-repo-traverse.c:540)
| |   |     |   |     |       | ->01.54% (262,144B) 0x48AA89F: traverse_iter (ostree-repo-traverse.c:491)
| |   |     |   |     |       |   ->01.54% (262,144B) 0x48AAC7C: ostree_repo_traverse_commit_union_with_parents (ostree-repo-traverse.c:616)
| |   |     |   |     |       |   | ->01.54% (262,144B) 0x48AAD5A: ostree_repo_traverse_commit_union (ostree-repo-traverse.c:660)
| |   |     |   |     |       |   |   ->01.54% (262,144B) 0x48A5DC9: ostree_repo_prune (ostree-repo-prune.c:440)
| |   |     |   |     |       |   |     ->01.54% (262,144B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |   |     |   |     |       |   |       ->01.54% (262,144B) 0x41A6E4: ostree_run (ot-main.c:226)
| |   |     |   |     |       |   |         ->01.54% (262,144B) 0x40D40A: main (main.c:143)
| |   |     |   |     |       |   |
| |   |     |   |     |       |   ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |   |     |       |
| |   |     |   |     |       ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |   |     |
| |   |     |   |     ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |   |
| |   |     |   ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |
| |   |     ->00.01% (2,304B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |
| |   ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->02.07% (353,456B) 0x50AE96F: g_variant_builder_end (gvariant.c:3726)
| | ->02.07% (353,456B) 0x50AF053: g_variant_valist_new (gvariant.c:5236)
| |   ->02.07% (353,456B) 0x50AF621: g_variant_new_va (gvariant.c:5409)
| |     ->02.07% (353,456B) 0x50AF773: g_variant_new (gvariant.c:5344)
| |       ->02.07% (353,456B) 0x487ABFC: list_loose_objects_at (ostree-repo.c:3942)
| |       | ->02.07% (353,456B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| |       |   ->02.07% (353,456B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| |       |     ->02.07% (353,456B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| |       |       ->02.07% (353,456B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |       |         ->02.07% (353,456B) 0x41A6E4: ostree_run (ot-main.c:226)
| |       |           ->02.07% (353,456B) 0x40D40A: main (main.c:143)
| |       |
| |       ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->01.55% (264,448B) 0x5052D03: UnknownInlinedFun (ghash.c:380)
| | ->01.55% (264,448B) 0x5052D03: realloc_arrays (ghash.c:727)
| |   ->01.55% (264,448B) 0x50536C5: g_hash_table_resize (ghash.c:875)
| |   | ->01.55% (264,448B) 0x50569BB: UnknownInlinedFun (ghash.c:915)
| |   |   ->01.55% (264,448B) 0x50569BB: g_hash_table_insert_node (ghash.c:1341)
| |   |     ->01.54% (262,144B) 0x5056C82: UnknownInlinedFun (ghash.c:1600)
| |   |     | ->01.54% (262,144B) 0x5056C82: g_hash_table_replace (ghash.c:1657)
| |   |     |   ->01.54% (262,144B) 0x487AC1F: list_loose_objects_at (ostree-repo.c:3945)
| |   |     |     ->01.54% (262,144B) 0x487AC9F: list_loose_objects (ostree-repo.c:3967)
| |   |     |       ->01.54% (262,144B) 0x488327E: ostree_repo_list_objects (ostree-repo.c:4867)
| |   |     |         ->01.54% (262,144B) 0x48A5C9F: ostree_repo_prune (ostree-repo-prune.c:423)
| |   |     |           ->01.54% (262,144B) 0x415B2B: ostree_builtin_prune (ot-builtin-prune.c:198)
| |   |     |             ->01.54% (262,144B) 0x41A6E4: ostree_run (ot-main.c:226)
| |   |     |               ->01.54% (262,144B) 0x40D40A: main (main.c:143)
| |   |     |
| |   |     ->00.01% (2,304B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |
| |   ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->01.54% (263,296B) 0x5052CBB: realloc_arrays (ghash.c:721)
| | ->01.54% (263,296B) 0x50536C5: g_hash_table_resize (ghash.c:875)
| | | ->01.54% (263,296B) 0x50569BB: UnknownInlinedFun (ghash.c:915)
| | |   ->01.54% (263,296B) 0x50569BB: g_hash_table_insert_node (ghash.c:1341)
| | |     ->01.54% (263,296B) in 3 places, all below massif's threshold (1.00%)
| | |
| | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->00.19% (33,112B) in 1+ places, all below ms_print's threshold (01.00%)
|
->00.26% (43,834B) in 1+ places, all below ms_print's threshold (01.00%)

So the main chunk of memory usage comes from ostree_object_name_serialize (I think the callback with ostree_object_name_deserialize is just deserializing GVariants that were created with ostree_object_name_serialize during hash table sorting).

jlebon commented 2 years ago

Yeah, was playing with that yesterday on a (much smaller) local repo. Valgrind found no memory leaks.

That said, it's still possible that there are leaks in the repo state and workflow used in the pruner. I definitely could try to replicate things more closely, even if with much less data.

dbnicholson commented 2 years ago

It's been a bit since I looked at the prune code, but this doesn't really seem like a leak. The prune code works by creating 2 hash tables storing the serialized objects found during traversal in order to keep track of which ones are in use and whether they're reachable from commits you want to keep. The serialized object is a GVariant of form (su) where the s is the hex encoded checksum and the u is the object type. That's inefficient, as the hex encoded sha256sum is twice the size of the raw digest (64 bytes vs 32 bytes). Similarly, we don't need 32 bits to store the object type. A single byte would be more than enough. But that ship has probably sailed as the serialized object format is part of the API.
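To make the two-table bookkeeping concrete, here is a minimal toy model in Python. It is not ostree's actual code: the object names, the `children` map and the helper names are all hypothetical, and the checksums are shortened placeholders rather than real 64-character hex strings. It only illustrates why every object in the repo ends up as a key in memory at the same time.

```python
# Toy model of reachability-based pruning; NOT ostree's implementation.
# Object names are (checksum, objtype) pairs, loosely mirroring the (su) form.
# All checksums below are shortened, made-up placeholders.

def reachable_from(roots, children):
    """Return the set of object names reachable from the given root commits."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(children.get(name, ()))
    return seen


def plan_prune(all_objects, keep_commits, children):
    """Objects present in the repo but unreachable from the commits to keep."""
    return all_objects - reachable_from(keep_commits, children)


# Hypothetical repo: two commits that share one file object.
old_commit = ("c0ffee", "commit")
new_commit = ("decade", "commit")
old_tree = ("aaaa", "dirtree")
new_tree = ("bbbb", "dirtree")
shared_file = ("1111", "file")
old_only_file = ("2222", "file")

children = {
    old_commit: [old_tree],
    new_commit: [new_tree],
    old_tree: [shared_file, old_only_file],
    new_tree: [shared_file],
}
all_objects = {old_commit, new_commit, old_tree, new_tree,
               shared_file, old_only_file}

to_delete = plan_prune(all_objects, keep_commits=[new_commit], children=children)
print(sorted(to_delete))
```

Both sets hold one entry per object name, so the per-entry size of the key is multiplied by the full object count of the repo.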

Oh, I see one dumb thing in the pruning logic where it goes through an unnecessary deserialize + serialize step, and serializing allocates memory again. But that's not a real memory hit as it's doing this one object at a time.

Unless there's a leak, I think the biggest gain would be from introducing a new more compact serialized format since you could make one that's at least half of the current format. Then the hash table sets would be much smaller. The downside of that is that deserializing (where a hex encoded checksum is expected) would require allocation, but you'd presumably only be doing that one object at a time in the prune case.
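As a rough sketch of what a more compact key could look like (an assumption for illustration, not an existing ostree format), the snippet below compares the payload of a hex checksum plus a 32-bit type against a raw 32-byte digest plus a 1-byte type. It counts payload bytes only and ignores GVariant framing, alignment and hash table overhead. The function names and the type value are hypothetical.

```python
import hashlib
import struct
from typing import Tuple

OBJTYPE_EXAMPLE = 1  # illustrative value only


def payload_hex(checksum_hex: str, objtype: int) -> bytes:
    """Payload resembling the current form: hex checksum text + 32-bit type."""
    return checksum_hex.encode("ascii") + struct.pack("<I", objtype)


def payload_compact(checksum_hex: str, objtype: int) -> bytes:
    """Hypothetical compact form: raw 32-byte digest + 1-byte type."""
    return bytes.fromhex(checksum_hex) + struct.pack("<B", objtype)


def unpack_compact(payload: bytes) -> Tuple[str, int]:
    """Recover (hex checksum, type); .hex() allocates a new string each call."""
    return payload[:32].hex(), payload[32]


checksum = hashlib.sha256(b"example object").hexdigest()
current = payload_hex(checksum, OBJTYPE_EXAMPLE)
compact = payload_compact(checksum, OBJTYPE_EXAMPLE)

print(len(current), "bytes vs", len(compact), "bytes")
print(unpack_compact(compact) == (checksum, OBJTYPE_EXAMPLE))
```

The round trip through `unpack_compact` is where the extra allocation mentioned above would show up: the hex string has to be rebuilt whenever a caller expects the existing checksum representation.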

jlebon commented 2 years ago

It's been a bit since I looked at the prune code, but this doesn't really seem like a leak.

Yeah, I don't think there's any leak either (and Valgrind doesn't either). It'd be nice still to also run Valgrind against the problematic repo, but that would probably take way too long I fear. Unless we can run on an offline backup. @dustymabe Is that possible?

Unless there's a leak, I think the biggest gain would be from introducing a new more compact serialized format since you could make one that's at least half of the current format. Then the hash table sets would be much smaller.

Even if we halve the memory used by those GVariants in the smaller repo tests (i.e. ~11M to 5.5M), it still seems like we're using more memory than expected for that number of objects. I need to add more instrumentation in my tests to sanity-check how many of those GVariants are live at the max snapshot. I suspect we might have multiple copies hanging around.
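A back-of-the-envelope check using the numbers from the original report (RSS from `ps`, which is in KiB, and the "Total objects" count) gives a feel for how far the observed usage is from the key payload alone. The 68-byte payload figure is a simplifying assumption (64 hex characters plus a 32-bit type) and ignores GVariant framing and hash table overhead.

```python
rss_kib = 15_482_068        # RSS of the ostree prune process, from the report (KiB)
total_objects = 18_602_497  # "Total objects" from the same --no-prune run

bytes_per_object = rss_kib * 1024 / total_objects
payload_bytes = 64 + 4      # assumed: hex checksum + 32-bit object type

print(f"{bytes_per_object:.0f} bytes of RSS per object")
print(f"{bytes_per_object / payload_bytes:.1f}x the assumed key payload")
```

That works out to roughly 850 bytes of RSS per object, which is consistent with the suspicion that more than one copy of each name, plus per-allocation overhead, is live at the peak.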

dustymabe commented 2 years ago

Unless we can run on an offline backup. @dustymabe Is that possible?

should be - we can probably just mount up one of the snapshots somewhere