ElektraInitiative / libelektra

Elektra serves as a universal and secure framework to access configuration settings in a global, hierarchical key database.
https://www.libelektra.org

KeySet close memory leak #3158

Closed raphi011 closed 5 years ago

raphi011 commented 5 years ago

While benchmarking the new elektrad I noticed symptoms of a memory leak. Memory usage went up after creating new handles and calling kdbGet(), but never came back down after ksClose() and kdbClose().

My first suspicion was the mmap caching, which probably also causes memory leaks because the mmapped memory is never freed, but disabling caching by building with -DPLUGINS="ALL;-cache" did not solve my problem.

Steps to Reproduce the Problem

Go >= 1.13

I've created two tests in the go-elektra repo. Both create a test keyset with 100000 keys.

  1. TestKeySetMemory creates a handle and kdbGets a keyset in a loop; after waiting for 1 second it immediately closes the keyset and handle again before starting over.
  2. TestKeySetMemoryWithDelayedClose also creates handles and fills keysets with kdbGet, but delays closing the handles and keysets until all 20 keysets have been loaded. This imitates the behavior of the elektrad webserver.

Both tests wait for 20 seconds after finishing to allow the tester to see the memory consumption of the test via htop or similar tools.

The first test, which immediately closes the handle and keyset, retains the same memory footprint over the length of the test.

The second test, which starts closing the handles and keysets only after every keyset is 'loaded', never frees any memory, even after forcing garbage collection and waiting for 20 seconds.

At the moment I'm clueless as to why the behavior of these tests differs.
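For illustration, a rough C equivalent of the delayed-close pattern in the second test (a minimal sketch, not the actual Go tests or benchmarks/memoryleak.c; it assumes the kdbOpen (Key *) signature in use at the time and uses 20 handles to mirror the test):

```c
#include <kdb.h>
#include <unistd.h>

#define NUM_HANDLES 20 /* mirrors the 20 keysets loaded by the Go test */

int main (void)
{
	KDB * handles[NUM_HANDLES];
	KeySet * keysets[NUM_HANDLES];

	/* phase 1: open all handles and load all keysets, closing nothing yet */
	for (int i = 0; i < NUM_HANDLES; ++i)
	{
		Key * parentKey = keyNew ("user", KEY_END);
		handles[i] = kdbOpen (parentKey);
		keysets[i] = ksNew (0, KS_END);
		kdbGet (handles[i], keysets[i], parentKey);
		keyDel (parentKey);
		sleep (1);
	}

	/* phase 2: only now release all keysets and handles */
	for (int i = 0; i < NUM_HANDLES; ++i)
	{
		Key * parentKey = keyNew ("user", KEY_END);
		ksDel (keysets[i]);
		kdbClose (handles[i], parentKey);
		keyDel (parentKey);
	}

	sleep (20); /* time to inspect the resident memory with htop */
	return 0;
}
```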

You can run the tests by cloning the go-elektra repo and running these two commands in the ./kdb subdirectory:

PKG_CONFIG_PATH=<PATH TO elektra.pc FILE> go test  -run "^(TestKeySetMemory)\$"

PKG_CONFIG_PATH=<PATH TO elektra.pc FILE> go test  -run "^(TestKeySetMemoryWithDelayedClose)\$"

Expected Result

Memory gets freed after kdbClose and ksClose

Actual Result

Memory does not get freed

System Information

raphi011 commented 5 years ago

Relates to https://github.com/ElektraInitiative/libelektra/issues/3140

mpranj commented 5 years ago

It seems unlikely to me that something like that would slip by our valgrind and ASAN tests. I have been running similar benchmarks in C and I could not observe the same. Even with mmap I can't observe anything similar. I'm not ruling it out though.

Are you sure that the bindings aren't leaking somewhere?

raphi011 commented 5 years ago

Yes, I'm sure. It is possible to inspect the memory usage of a Go application at runtime; these stats do NOT include allocations made by C code, and they are nowhere near the total memory usage of the application/test, which means that the bulk of the memory is allocated in C.

My suspicion is that keys are not freed because the ref counter is != 0 even though it should be 0.
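For context, a minimal sketch of how the reference counter keeps keys alive (illustrative only, not the binding code): ksDel () only frees keys whose reference counter drops to 0, so a key still referenced elsewhere survives the delete.

```c
#include <kdb.h>
#include <stdio.h>

int main (void)
{
	KeySet * ks = ksNew (0, KS_END);
	Key * k = keyNew ("user/refcount/demo", KEY_VALUE, "x", KEY_END);

	keyIncRef (k);       /* simulate an external owner still holding the key */
	ksAppendKey (ks, k); /* the keyset takes its own reference */

	printf ("references before ksDel: %zd\n", keyGetRef (k));
	ksDel (ks); /* k is NOT freed here, its reference counter is still > 0 */

	keyDecRef (k);
	keyDel (k); /* counter is now 0, the key is actually freed */
	return 0;
}
```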

mpranj commented 5 years ago

That means that the bindings call the C API in a way that triggers this memleak. Unfortunately I do not know Go, so I don't know why this happens.

Can you maybe extract a small C example which triggers the leak? I do something very similar in benchmarks/kdb.c and there is no leak there, so it needs to do exactly what your benchmark does to trigger the leak.

markus2330 commented 5 years ago

Thank you for reporting the issue!

I fully agree with @mpranj here: we need to reduce the problem to a minimal one; with multiple languages involved it is hard to find anything.

Can you maybe write a C program that reproduces the call sequence as issued from the go bindings? Once we have that, we can minimize the C program for a test case.

Btw. when trying to run the commands above I failed quite early: #3159

raphi011 commented 5 years ago

I will try to reproduce this in C by tomorrow.

raphi011 commented 5 years ago

I successfully reproduced this issue here. Run the benchmarks/memoryleak.c benchmark and watch the memory go up while getting keysets - and NOT go down when freeing.

Don't forget to add a noticeable number of keys to the DB; I tested it with 100k keys.

mpranj commented 5 years ago

Thank you for creating the example! I'll take a look.

At first glance, there is exactly one direct leak; you need to free the parentKey: https://github.com/raphi011/libelektra/blob/afcbcf5c8138be8de6ba9b1a9e559bc673ff887f/benchmarks/memoryleak.c#L22

This small leak is probably not the reason for your observations.
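For completeness, a minimal sketch of the intended lifecycle with the missing keyDel (parentKey) at the end (again assuming the kdbOpen (Key *) signature of that era):

```c
#include <kdb.h>

int main (void)
{
	Key * parentKey = keyNew ("user", KEY_END);
	KDB * handle = kdbOpen (parentKey);
	KeySet * ks = ksNew (0, KS_END);

	kdbGet (handle, ks, parentKey);

	ksDel (ks);
	kdbClose (handle, parentKey);
	keyDel (parentKey); /* the parent key itself must be deleted too */
	return 0;
}
```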

As a side note: using the mmap cache, I notice even LESS memory consumption than without the cache. Maybe I need to add this to my thesis :smile:. (cache:~200M vs. no cache:~600M)

It seems that we are "leaking" memory via dlopen() / dlclose(). The following can be observed when using kdb but NOT when using kdb-static (just an excerpt):

valgrind --leak-check=full --show-leak-kinds=all ./bin/kdb ls user
[...]
==48451== 1,192 bytes in 1 blocks are still reachable in loss record 6 of 6
==48451==    at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==48451==    by 0x400BB11: _dl_new_object (in /usr/lib64/ld-2.30.so)
==48451==    by 0x400642F: _dl_map_object_from_fd (in /usr/lib64/ld-2.30.so)
==48451==    by 0x4009315: _dl_map_object (in /usr/lib64/ld-2.30.so)
==48451==    by 0x400DB24: openaux (in /usr/lib64/ld-2.30.so)
==48451==    by 0x4DFE8C8: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
==48451==    by 0x400DF6A: _dl_map_object_deps (in /usr/lib64/ld-2.30.so)
==48451==    by 0x4013AC3: dl_open_worker (in /usr/lib64/ld-2.30.so)
==48451==    by 0x4DFE8C8: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
==48451==    by 0x401363D: _dl_open (in /usr/lib64/ld-2.30.so)
==48451==    by 0x496139B: dlopen_doit (in /usr/lib64/libdl-2.30.so)
==48451==    by 0x4DFE8C8: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
[...]

Your example opens many more handles, so it looks like this:

==48735== 72,704 bytes in 1 blocks are still reachable in loss record 8 of 8
==48735==    at 0x483880B: malloc (vg_replace_malloc.c:309)
==48735==    by 0x4F860A9: ??? (in /usr/lib64/libstdc++.so.6.0.27)
==48735==    by 0x400FD59: call_init.part.0 (in /usr/lib64/ld-2.30.so)
==48735==    by 0x400FE60: _dl_init (in /usr/lib64/ld-2.30.so)
==48735==    by 0x4013DBD: dl_open_worker (in /usr/lib64/ld-2.30.so)
==48735==    by 0x4A088C8: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
==48735==    by 0x401363D: _dl_open (in /usr/lib64/ld-2.30.so)
==48735==    by 0x48C739B: dlopen_doit (in /usr/lib64/libdl-2.30.so)
==48735==    by 0x4A088C8: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
==48735==    by 0x4A08962: _dl_catch_error (in /usr/lib64/libc-2.30.so)
==48735==    by 0x48C7B08: _dlerror_run (in /usr/lib64/libdl-2.30.so)
==48735==    by 0x48C7429: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.30.so)

In the example you're opening lots of KDB handles, which causes this effect to add up to significant memory consumption. If you open only one handle, it should be less. @raphi011 I don't know if you can use kdb-static for your benchmarks. When linking the benchmarks against the elektra-static library I can no longer observe this:

==54836== HEAP SUMMARY:
==54836==     in use at exit: 0 bytes in 0 blocks
==54836==   total heap usage: 6,456,631 allocs, 6,456,631 frees, 272,753,180 bytes allocated
==54836== 
==54836== All heap blocks were freed -- no leaks are possible
==54836== 
==54836== For lists of detected and suppressed errors, rerun with: -s
==54836== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I'm currently not sure what we can do about the problems with dlopen().
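The effect can be reproduced without Elektra at all; the following minimal sketch (the library name is an assumption, any plugin pulling in libstdc++ behaves similarly) shows "still reachable" blocks under valgrind even though dlclose () is called:

```c
#include <dlfcn.h>
#include <stdio.h>

int main (void)
{
	void * handle = dlopen ("libstdc++.so.6", RTLD_NOW);
	if (!handle)
	{
		fprintf (stderr, "dlopen failed: %s\n", dlerror ());
		return 1;
	}
	/* allocations made by the library's initializers are not returned on dlclose */
	dlclose (handle);
	return 0;
}
```

Compiled with -ldl and run under valgrind --leak-check=full --show-leak-kinds=all, this should show blocks still reachable from _dl_init / call_init, matching the traces above.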

raphi011 commented 5 years ago

This is a simplified simulation of the new elektrad server. Since we want to support multiple users configuring the system at the same time, we NEED to support multiple handles (conflict handling etc.).

I still think this is not the root of the problem. The problem is that Keys / KeySets are not freed. If you change the parent key to "system", which has far fewer keys, you will see that memory consumption is much, much lower.

raphi011 commented 5 years ago

@markus2330 what do you think?

mpranj commented 5 years ago

EDIT: forget this.

I don't think that we are leaking memory there. Let's reduce it to a smaller example, without elektra: https://gist.github.com/mpranj/bbdf00af308ed3f5b3f0f35bc832756f

Using the code from the gist above, I can observe the same memory usage behavior you are describing with htop. I don't know, maybe the OS is not preemptively cleaning it up, but I think we're basically doing everything we can.

raphi011 commented 5 years ago

So you're telling me it's absolutely normal that elektra's memory consumption never shrinks?

mpranj commented 5 years ago

So you're telling me it's absolutely normal that elektra's memory consumption never shrinks?

I'm not saying that. I'm saying I observed the same regardless of elektra. I just noticed that my observation is only true when running with valgrind, so maybe valgrind is not free()ing until the end for other reasons.

raphi011 commented 5 years ago

https://stackoverflow.com/questions/40917024/memory-leak-after-using-calloc-and-free

Seems like it's because you're changing the pointer before freeing it

mpranj commented 5 years ago

Seems like it's because you're changing the pointer before freeing it

I am not doing that, and valgrind shows no leak for the code in the gist. But forget the example; it is irrelevant, since it only fails to free immediately when run inside valgrind. Otherwise the memory is free()d pretty much immediately.

mpranj commented 5 years ago

What's more relevant: I observed what you're reporting by simply using benchmark_createkeys, which does not use the KDB at all. There the resources are also not free()d immediately, but valgrind shows absolutely 0 leaks. I'm baffled.

markus2330 commented 5 years ago

@markus2330 what do you think?

With KeySets alone there should definitely be no leak whatsoever. (With KDB we cannot fully control everything, as plugins might load other libraries that leak.)

@raphi011 can you create a PR or point me to the minimal example reproducing the problem?

raphi011 commented 5 years ago

Here you go: https://github.com/raphi011/libelektra/tree/memoryleak
I've extended my previous example by printing how many keys were NOT freed by ksDel because their key reference counters are > 0.

If you run the benchmark without valgrind you can see that the keys are not freed.

markus2330 commented 5 years ago

I cannot compile this repo; I get the error:

CMake Error: Error processing file: /home/markus/Projekte/Elektra/repos/libelektra/scripts/cmake/ElektraManpage.cmake
make[2]: *** [src/bindings/intercept/env/CMakeFiles/man-kdb-elektrify-getenv.dir/build.make:61: ../doc/man/man1/kdb-elektrify-getenv.1] Error 1
make[1]: *** [CMakeFiles/Makefile2:17236: src/bindings/intercept/env/CMakeFiles/man-kdb-elektrify-getenv.dir/all] Error 2

Can you please rebase onto the current master?

And please make a PR; it is much easier to see the changes then.

markus2330 commented 5 years ago

If you run the benchmark without valgrind you can see that the keys are not freed.

Can you copy the output of a run here?

markus2330 commented 5 years ago

I observed what you're reporting by simply using benchmark_createkeys which does not use the KDB at all. There the resources are also not free()d immediately, but valgrind shows absolutely 0 leaks. I'm baffled.

I think we should first follow this trace, as it is much easier to understand if we only have KeySets.

Maybe we added a wrong suppression to valgrind?

markus2330 commented 5 years ago

As a side note: using the mmap cache, I notice even LESS memory consumption than without the cache. Maybe I need to add this to my thesis smile. (cache:~200M vs. no cache:~600M)

This is certainly great news.

raphi011 commented 5 years ago

Here you go: https://github.com/ElektraInitiative/libelektra/pull/3172

mpranj commented 5 years ago

Maybe we added a wrong suppression to valgrind?

Valgrind reported using no suppressions when I ran the tests, but your mileage may vary.

I think I found the problem during my benchmarks. For me the problem is the eager allocation of the meta KeySet for each key. After a dirty test on my branch, removing the eager meta KeySet allocation, the memory consumption quickly goes down after ksDel(). Maybe this is fixed by #3142, as it changes the code I'm talking about.
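To illustrate the difference (a generic sketch with hypothetical names, not Elektra's actual internals): with eager allocation every key pays for a metadata container up front, while with lazy allocation the container is only created on first use, so millions of plain keys stay cheap.

```c
#include <stdlib.h>
#include <string.h>

/* hypothetical, simplified key struct for illustration only */
typedef struct
{
	char * name;
	void ** meta; /* stands in for the per-key meta KeySet */
} DemoKey;

/* eager: the meta container is allocated even if it is never used */
DemoKey * demoKeyNewEager (const char * name)
{
	DemoKey * k = calloc (1, sizeof (DemoKey));
	k->name = strdup (name);
	k->meta = calloc (16, sizeof (void *));
	return k;
}

/* lazy: the meta container is only allocated when metadata is first touched */
void ** demoKeyMetaLazy (DemoKey * k)
{
	if (!k->meta) k->meta = calloc (16, sizeof (void *));
	return k->meta;
}
```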

markus2330 commented 5 years ago

Ok, then let us see if #3142 fixes the problem!

@raphi011 did the problem also occur before #3081?

markus2330 commented 5 years ago

#3142 is now merged. @raphi011 can you check if the problem still occurs?

raphi011 commented 5 years ago

Unfortunately yes

mpranj commented 5 years ago

For me it seems to at least solve/mitigate some problem. I can now run the benchmark with millions of keys using ~2GB memory, whereas it crashed before because of using >20GB memory for the same benchmark.

raphi011 commented 5 years ago

For me it seems to at least solve/mitigate some problem. I can now run the benchmark with millions of keys using ~2GB memory, whereas it crashed before because of using >20GB memory for the same benchmark.

I haven't compared it to the previous version yet, but I'm pretty sure that it uses less memory than before. Will check today!

Memory is still not freed though.

manuelm commented 5 years ago

You need to call malloc_trim(0) after every free (read: ksDel). This forces glibc to immediately return the memory to the OS. This should fix the "weird behavior" you guys are seeing. Oh and have fun reading and digging into glibc :-)
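A sketch of that suggestion (malloc_trim () is glibc-specific and declared in <malloc.h>; the wrapper name is made up for illustration):

```c
#include <kdb.h>
#include <malloc.h>

/* after dropping a large keyset, ask glibc to return freed heap memory to the OS */
static void ksDelAndTrim (KeySet * ks)
{
	ksDel (ks);
	malloc_trim (0); /* 0 = keep no extra padding at the top of the heap */
}
```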

markus2330 commented 5 years ago

I created #3183 for further testing/fixing.

I found one memleak (in kdbGet, returned keys were not freed).

But the increasing number in "ksClose did NOT free 532 keys" is probably only the warnings collected in the parentKey. If you make NUM_RUNS higher, e.g. 100, it stagnates at some point, as the number of warnings is limited to 100. For me it tops out at "ksClose did NOT free 901 keys". Having a parentKey per handle would fix this problem.

raphi011 commented 5 years ago

Closed by #3183.