valkey-io / valkey

A flexible distributed key-value datastore that is optimized for caching and other realtime workloads.
https://valkey.io
Other
17.19k stars 646 forks source link

Investigate removing boolean values throughout the global structure #590

Open madolson opened 5 months ago

madolson commented 5 months ago

We frequently use the 4 byte int type for our boolean values, which is 32x more memory than is really needed. There is no technical issues with using extra memory, however, it would likely generally reduce L1 cache thrashing if these booleans were compacted together in a single block of memory.

These seem to mostly be used for config values, like this one.. Some configs are also set in debug, like this one..

We should investigate a way to back all of these boolean configurations into a couple of blocks of memory. One method discussed was to use bitfields. e.g. int config_1 : 1. However, this does not work with the current structure of configs, which requires a pointer to the actual value. One solution might be to have a block of bits allocated in server.h, and we can access it with an enum + helper methods, like:

enum bitFieldIDs {
    CONFIG_AOF_ENABLED,
    ....
}

struct server {
    uint64_t server_bit_field[3]; / *supports 196 values */
}

#define getBitValue(bit_id) \
    server.server_bit_field[(bit_id)/64] & 1 << (bit_id  % 64) /* I didn't check this is right, don't quote me on it. */

#define setBitValueOn(bit_id) \
    server.server_bit_field[(bit_id)/64] |= 1 << (bit_id % 64)

#define setBitValueOff
    server.server_bit_field[(bit_id)/64] &= ~(1 << (bit_id % 64))

The config system for bools could be re-designed around the idea that for configs, you pass in a bitField + bitField ID.

Maybe someone else has a better idea.

PingXie commented 5 months ago

+1 on bitfields (for all flags). I was just thinking the same https://github.com/valkey-io/valkey/pull/592/files#r1623831868

zuiderkwast commented 5 months ago

The bit juggling looks a bit complex for the configs where we currently need a pointer.

I like bit fields, but if we currently need a pointer to each of them, can't we just use bool for the bool configs and keep the rest of the config logic as is?

madolson commented 5 months ago

I like bit fields, but if we currently need a pointer to each of them, can't we just use bool for the bool configs and keep the rest of the config logic as is?

Yes. It saves about 3 bytes. Actually, it takes us from 188 bytes -> 47 bytes (Which is inside a cache line!). Doing anything more is probably overly engineering. Also maybe worth noting that the main server struct doesn't use that much memory:

Size of main struct: 5736
Size of cluster struct: 395592

The main cluster struct is comparatively massive.

zuiderkwast commented 5 months ago

Of course we need to focus more on other structs that are more frequently used (not singletons).

I'd say the client struct can be compacted a lot.

madolson commented 5 months ago

I'd say the client struct can be compacted a lot.

I took a quick look at it when I opened the issue. The only long hanging fruit was the authenticated field that was already addressed. There are a lot of 8 byte fields those that are uncommonly used but hard to optimize out.