out of memory with many KDB handles

raphi011 commented 4 years ago

While running my benchmarks I somehow broke elektra.

When I run a kdb command like kdb ls /

I get the following message (also when calling the elektra library directly)

Sorry, I crashed by the signal SIGSEGV
This should not have happened!

Please report the issue at https://issues.libelektra.org/
fish: 'kdb ls /' terminated by signal SIGABRT (Abort)

I've already run kdb reset and kdb reset-elektra and re-compiled and uninstalled + installed kdb. Any ideas how I could fix / debug this?

mpranj commented 4 years ago

Thank you for reporting. Can you attach a trace, maybe run valgrind kdb ls /? Some information on how to reproduce it would also be helpful.

raphi011 commented 4 years ago

==110109== Memcheck, a memory error detector
==110109== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==110109== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==110109== Command: kdb ls /
==110109== 
==110109== Invalid read of size 8
==110109==    at 0x52778E8: updatePointers (mmapstorage.c:948)
==110109==    by 0x52778E8: libelektra_mmapstorage_LTX_elektraPluginget (mmapstorage.c:1148)
==110109==    by 0x5270711: elektraCacheGet (cache.c:321)
==110109==    by 0x48E6849: elektraCacheLoad (kdb.c:825)
==110109==    by 0x48E6849: kdbGet (kdb.c:1038)
==110109==    by 0x174EA0: get (kdb.hpp:185)
==110109==    by 0x174EA0: get (kdb.hpp:168)
==110109==    by 0x174EA0: Cmdline::Cmdline(int, char**, Command*) (cmdline.cpp:310)
==110109==    by 0x1696F6: main (main.cpp:170)
==110109==  Address 0x5285d10 is not stack'd, malloc'd or (recently) free'd
==110109== 

Sorry, I crashed by the signal SIGSEGV
This should not have happened!

Please report the issue at https://issues.libelektra.org/
==110109== 
==110109== Process terminating with default action of signal 6 (SIGABRT): dumping core
==110109==    at 0x4CDEF25: raise (in /usr/lib/libc-2.30.so)
==110109==    by 0x4CC8896: abort (in /usr/lib/libc-2.30.so)
==110109==    by 0x1DF04B: catchSignal(int) (main.cpp:110)
==110109==    by 0x4CDEFAF: ??? (in /usr/lib/libc-2.30.so)
==110109==    by 0x52778E7: updatePointers (mmapstorage.c:946)
==110109==    by 0x52778E7: libelektra_mmapstorage_LTX_elektraPluginget (mmapstorage.c:1148)
==110109==    by 0x5270711: elektraCacheGet (cache.c:321)
==110109==    by 0x48E6849: elektraCacheLoad (kdb.c:825)
==110109==    by 0x48E6849: kdbGet (kdb.c:1038)
==110109==    by 0x174EA0: get (kdb.hpp:185)
==110109==    by 0x174EA0: get (kdb.hpp:168)
==110109==    by 0x174EA0: Cmdline::Cmdline(int, char**, Command*) (cmdline.cpp:310)
==110109==    by 0x1696F6: main (main.cpp:170)
==110109== 
==110109== HEAP SUMMARY:
==110109==     in use at exit: 173,723 bytes in 1,005 blocks
==110109==   total heap usage: 2,915 allocs, 1,910 frees, 437,331 bytes allocated
==110109== 
==110109== LEAK SUMMARY:
==110109==    definitely lost: 0 bytes in 0 blocks
==110109==    indirectly lost: 0 bytes in 0 blocks
==110109==      possibly lost: 0 bytes in 0 blocks
==110109==    still reachable: 173,723 bytes in 1,005 blocks
==110109==         suppressed: 0 bytes in 0 blocks
==110109== Rerun with --leak-check=full to see details of leaked memory
==110109== 
==110109== For lists of detected and suppressed errors, rerun with: -s
==110109== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
fish: 'valgrind kdb ls /' terminated by signal SIGABRT (Abort)

raphi011 commented 4 years ago

Seems to be an issue with mmap. After running rm -rf ~/.cache/elektra it works again

mpranj commented 4 years ago

Yes, the crash definitely happened because of cache / mmap. Can you explain what you did to trigger this?

raphi011 commented 4 years ago

I ran my elektrad benchmarks, after clearing the cache and re-running it didn't break again. If this occurs again I will mention it in this issue.

markus2330 commented 4 years ago

@mpranj a big thanks that you take care of this issue.

Why are there casts to char* in the segfaulting code?

mpranj commented 4 years ago

Why are there casts to char* in the segfaulting code?

Pretty sure the cast is not the problem. It's used for byte-wise pointer arithmetic. Before storage we subtract the old base address, making all pointers relative to the base address of the mmap region. After mmaping again, we update the pointers to the new base mmap address by adding the new address to all pointers.

markus2330 commented 4 years ago

Ok. Do you have an idea how this problem was caused?

mpranj commented 4 years ago

Do you have an idea how this problem was caused?

Sorry but currently no. This code is basically executed on each kdbGet now, so it's not like a seldom taken code path.

Corrupting the cache file in exactly the right spot might cause something like that (like a wrong ks->size or similar).

markus2330 commented 4 years ago

Are the operations on the cache files atomic? Did you test aborting the process while writing the cache?

mpranj commented 4 years ago

There are no testcases for such a scenario, but the cache file is written to a temp file and only rename()d on success.

EDIT: Additionally, the magic footer of the mmap file woud prevent us from reading an unfinished file. The footer should be the last write. I could enforce it via an additional msync(), but it might slow things down.

markus2330 commented 4 years ago

Additionally, the magic footer of the mmap file woud prevent us from reading an unfinished file.

So most probably it was something else.

I could enforce it via an additional msync()

The sync/msync would be the correct thing to do (also on the folder where you rename, see sync plugin).

but it might slow things down.

As it only impacts writing, you do not need to worry.

mpranj commented 4 years ago

Just to clarify: msync is done already. This would just mean another msync to enforce order of the writes. Let's do it and see what it does in the benchmarks.

Still: It's unclear that this is the reason for the segfault here.

markus2330 commented 4 years ago

@raphi011 you do not have the cache file anymore to reproduce this issue? Did you do some upgrade of Elektra or similar?

raphi011 commented 4 years ago

@raphi011 you do not have the cache file anymore to reproduce this issue? Did you do some upgrade of Elektra or similar?

Unfortunately not ..

mpranj commented 4 years ago

The commits referencing this issue are a shot in the dark. The assumption was that the OS reordered some writes from mmap and mmapstorage did not correctly detect incompletely written data, which was possible before those commits.

Even having the cache files would probably not help as there is no way to know whether some data was not completely written.

raphi011 commented 4 years ago

What is the actual lifetime of the keys in mmap storage? While benchmarking the memory of my server is growing and growing even though I'm ksClearing all keysets after it's usage - until it get's killed by the OS because it uses too much memory.

When is memory that is allocated with mmap freed?

mpranj commented 4 years ago

When is memory that is allocated with mmap freed?

The answer is never. This was a design decision discussed with @markus2330. The memory pages are shared and the OS implements COW via mmap. This means depending on the usage it is possible to have a KeySet open which is larger than the available main memory.

The alternative back then was to implement reference counting to unmap the mmaped files which needed some tricky implementation in the core. (Think of another KeySet referencing the keys from a mmaped KeySet, it is tricky to know when it can be really unmapped and basically out of scope for my thesis)

Maybe this is the reason why you had incomplete mmap cache files. 🤔

raphi011 commented 4 years ago

@markus2330 this is probably not the right place to discuss this .. but how is this supposed to work with elektrad that has long lived handles? After a session expires a new handle + keyset is created and the old one is closed - but the memory is not freed which results in the OS killing the process after a while (which happens pretty fast in the benchmarks).

This probably means I have to disable mmap somehow for elektrad .. is this possible?

mpranj commented 4 years ago

The OS can free up memory from mmap that has not been changed, so we "only" leak the pages that are changed COW. But you're right, we might need to re-evaluate this decision.

This probably means I have to disable mmap somehow for elektrad .. is this possible?

Easy: kdb cache disable. The cache is stored on a per-system-user basis, so this needs to be also done on a per-user basis. If you need to completely disable it then compile without the cache plugin (e.g. -DPLUGINS="ALL;-cache").

mpranj commented 4 years ago

The alternative back then was to implement reference counting to unmap the mmaped files which needed some tricky implementation in the core. (Think of another KeySet referencing the keys from a mmaped KeySet, it is tricky to know when it can be really unmapped and basically out of scope for my thesis)

@markus2330 One problem was/is that unmapping would have to be implemented in the core, and mmap() is not part of the C standards, but let's assume this is no problem. Then unmapping can be done somewhat easily in the core if ksAppend() really copies the keys and does not simply reference them. Referencing mmaped Keys was afair the biggest problem with this unmapping approach. I'm not sure if we can/may change the semantics of ksAppend() though.

raphi011 commented 4 years ago

I have disabled the cache plugin completely - as described above via -DPLUGINS="ALL;-cache" and somehow memory is STILL not freed after calling ksDel

mpranj commented 4 years ago

If you compiled Elektra cleanly with the cmake option -DPLUGINS="ALL;-cache" (verify by cmake output "Exclude Plugin cache because explicitly excluded"), then I see no way how mmap could affect your results, except if you explicitly enabled mmapstorage as default storage (again, can be excluded by -DPLUGINS="ALL;-cache;-mmapstorage" ).

If you're sure we're leaking somewhere, then you can always run valgrind --leak-check=full /your/kdb/tests and see which source file has leaked memory (make sure you have debug symbols, which is default anyway: CMAKE_BUILD_TYPE: RelWithDebInfo).

markus2330 commented 4 years ago

@mpranj wrote:

and basically out of scope for my thesis

I fully agree. For now we only find ways of how to disable the cache when it does not work. I created #3155 and #3156

@raphi011 wrote:

this is probably not the right place to discuss this

The issue tracker is the right place. But please always make sure to either rename the title (if the previous discussion is finished) or open new issues.

which happens pretty fast in the benchmarks

Can you tell us about how many KDB handles (sessions) we are speaking here? Maybe this is only a synthetic (benchmark) problem anyway. In real-world scenarios not more than a few handles are expected to be alive at the same time. (elektrad is there to configure a single system.)

even though I'm ksClearing all keysets after it's usage I have disabled the cache plugin completely - as described above via -DPLUGINS="ALL;-cache" and somehow memory is STILL not freed after calling ksDel

ksClear is internal and afaik there are no tests if it is safe to reuse the keyset afterwards. So maybe you already lost pointers to allocated memory within ksClear. If this is the case, please open a new issue. (Lets keep this issue to discuss the problems with mmap and too many handles.)

raphi011 commented 4 years ago

I will close this issue since a fix has been added. If this happens again please feel free to reopen it.

ElektraInitiative / libelektra

out of memory with many KDB handles #3140