Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Evaluate memory overhead of icinga::Locked<...> in object attributes #10113

Open · julianbrost opened 1 month ago

julianbrost commented 1 month ago

As a countermeasure for race conditions, #9364 added a mutex for every object attribute whose type is incompatible with std::atomic. At the moment, that's implemented using a dedicated std::mutex for every attribute of every single object. On my machine, sizeof(std::mutex) = 40, and comparing sizeof(icinga::Host) with and without these mutexes shows a 70% increase. However, that won't translate into a 70% increase in the memory usage of Icinga 2 as a whole (for example, all strings like object names are dynamically allocated, so they are not part of icinga::Host itself and aren't affected by this increase).
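
To illustrate where the overhead comes from, here is a minimal sketch of the per-attribute lock idea, roughly the shape of a `Locked<T>` wrapper (a simplified sketch, not a verbatim quote of the Icinga 2 implementation):

```cpp
#include <mutex>
#include <utility>

// Guards a single attribute value with its own mutex so that concurrent
// reads and writes of non-std::atomic-compatible types cannot race.
template<typename T>
class Locked
{
public:
	T Load() const
	{
		std::unique_lock<std::mutex> lock(m_Mutex);
		return m_Value;
	}

	void Store(T value)
	{
		std::unique_lock<std::mutex> lock(m_Mutex);
		m_Value = std::move(value);
	}

private:
	mutable std::mutex m_Mutex; // one mutex per attribute per object, 40-64 bytes each (see below)
	T m_Value;
};
```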

Tasks

  1. Figure out how much of an effect this has on the total memory use of Icinga 2.
  2. Improve this. One idea would be to take some inspiration from how something like atomic_load(const std::shared_ptr<T>*) is/can be implemented (see the sketch after this list):

    These functions are typically implemented using mutexes, stored in a global hash table where the pointer value is used as the key.

    Note that using only part of the address as the key, i.e. sharing a mutex between multiple objects, would reduce the memory requirements.
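
A hedged sketch of that idea: instead of embedding a std::mutex in every attribute, pick one from a fixed global pool based on the attribute's address. The function name MutexFor and the pool size are illustrative assumptions, not taken from any existing implementation:

```cpp
#include <cstdint>
#include <mutex>

// Returns a mutex from a fixed pool, selected by hashing the attribute's
// address. Unrelated attributes may share a mutex (more contention), but the
// total cost is 1024 mutexes per process instead of one per attribute.
static std::mutex& MutexFor(const void* addr)
{
	static std::mutex pool[1024];

	auto key = reinterpret_cast<std::uintptr_t>(addr);
	return pool[(key >> 4) % 1024]; // >> 4 drops alignment bits so neighboring attributes spread across slots
}

// Usage sketch: a Locked<T> without its own mutex would lock MutexFor(this)
// in its Load()/Store() instead of a member m_Mutex.
```

With a pool like this, the per-process overhead is constant (1024 × 40 bytes ≈ 40 KB with the sizes measured below) regardless of the number of objects; the trade-off is occasional contention between unrelated attributes that hash to the same slot.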

Al2Klimov commented 3 weeks ago
| Size in bytes (pointers) | M3 Mac | NixOS x64 laptop |
|---|---|---|
| sizeof(std::mutex) | 64 (8) | 40 (5) |
| alignof(std::mutex) | 8 (1) | 8 (1) |
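
These numbers can be reproduced on any platform with a small check like:

```cpp
#include <iostream>
#include <mutex>

int main()
{
	std::cout << "sizeof(std::mutex)  = " << sizeof(std::mutex) << "\n"
	          << "alignof(std::mutex) = " << alignof(std::mutex) << "\n";
}
```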

.ti attributes of a Service: about 90; excluding numbers and bools: about 30. (Counted via icinga2 console > Service(), then grep -v | wc -l.)

So every Service consumes an extra ~1.2 KB (30 × 40 bytes) with Locked<>.

julianbrost commented 3 weeks ago

That still gives no estimate of the big picture, i.e. how this affects the overall memory usage of Icinga 2. More object types than just Host and Service are affected by this.

yhabteab commented 2 weeks ago
  • Figure out how much of an effect this has on the total memory use of Icinga 2.

I've been testing this for the entire week now and couldn't find a way to determine exactly the difference with and without this mutex. Attaching GDB to the running icinga2 process and calling malloc_stats() looked promising, but then we found out that the output is next to useless: it only shows the virtual memory usage, not what is actually physically allocated. So I just measured with the plain htop command; the setup and results are below.
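
For reference, the GDB approach looked roughly like this; the exact invocation is a reconstruction of the usual glibc malloc_stats() pattern, not a quote from the test:

```sh
# Attach to the running daemon and dump glibc malloc statistics to its stderr.
# malloc_stats() reports arena (virtual) sizes, not resident memory, which is
# why the output was next to useless here.
gdb -p "$(pidof icinga2)" -batch -ex 'call (void) malloc_stats()'
```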

Setup (Debian 12, Icinga 2 linked against jemalloc):