teslamotors / fixed-containers

C++ Fixed Containers
MIT License
388 stars 35 forks source link

Suggestion: Add bit_set container (and considering using bit_set in enum containers) #156

Open abeimler opened 1 month ago

abeimler commented 1 month ago

Hello, I was experimenting with bit_set in my own fork, replaced the internal std::array<bool> members with bit_set.

_(Right know I'm using bit_set, I needed a more "modern" bit_set with fixed length and constexpr)_

1. Using bit_set as array_set_ in my EnumSet:

    using Block = std::size_t;
    using ValueArrayType = xstd::bit_set<ENUM_COUNT, Block>;
ValueArrayType IMPLEMENTATION_DETAIL_DO_NOT_USE_array_set_;
    std::size_t IMPLEMENTATION_DETAIL_DO_NOT_USE_size_;

2. Using bit_set as array_set_ in my EnumMap

    using ArraySetType = xstd::bit_set<ENUM_COUNT, std::size_t>; // size is included
    ArraySetType IMPLEMENTATION_DETAIL_DO_NOT_USE_array_set_;

My main goal was to reduce the memory-foot-print, but the best thing I can do was sizeof and benchmark for speed :)

EnumMap (random access)

Without USE_BIT_SET_IN_ENUM_MAP (std::array<bool>)

BM_fixed_enum_map_random_access/8              4000 ns         3993 ns       177449
BM_fixed_enum_map_random_access/16             7960 ns         7914 ns        89354
BM_fixed_enum_map_random_access/32            16342 ns        16213 ns        43791
BM_fixed_enum_map_random_access/64            32084 ns        32046 ns        21704
BM_fixed_enum_map_random_access/128           64522 ns        64266 ns        11166

With USE_BIT_SET_IN_ENUM_MAP (bit_set)

BM_fixed_enum_map_random_access/8              4013 ns         3996 ns       177443
BM_fixed_enum_map_random_access/16             7851 ns         7864 ns        86487
BM_fixed_enum_map_random_access/32            15878 ns        15782 ns        44156
BM_fixed_enum_map_random_access/64            32220 ns        32028 ns        22074
BM_fixed_enum_map_random_access/128           63555 ns        63114 ns        11180

EnumSet (random access)

Without USE_BIT_SET_FOR_ENUM_SET (std::array<bool>)

BM_fixed_enum_set_random_access/8                   4026 ns         4009 ns       176985
BM_fixed_enum_set_random_access/16                  8033 ns         7972 ns        87107
BM_fixed_enum_set_random_access/32                 16279 ns        16204 ns        42889
BM_fixed_enum_set_random_access/64                 31971 ns        31930 ns        21821
BM_fixed_enum_set_random_access/128                63456 ns        63296 ns        11112

With USE_BIT_SET_FOR_ENUM_SET (bit_set)

BM_fixed_enum_set_random_access/8                   3875 ns         3875 ns       180472
BM_fixed_enum_set_random_access/16                  7735 ns         7726 ns        90359
BM_fixed_enum_set_random_access/32                 15442 ns        15451 ns        45212
BM_fixed_enum_set_random_access/64                 30907 ns        30987 ns        22595
BM_fixed_enum_set_random_access/128                61792 ns        61913 ns        11308

EnumMap (construct)

Without USE_BIT_SET_IN_ENUM_MAP (std::array<bool>)

BM_fixed_enum_map_construct_IntEnumMap/8                60.5 ns         60.2 ns     11655430 sizeof=88
BM_fixed_enum_map_construct_IntEnumMap/16                121 ns          120 ns      5825911 sizeof=88
BM_fixed_enum_map_construct_IntEnumMap/32                252 ns          251 ns      2824840 sizeof=88
BM_fixed_enum_map_construct_IntEnumMap/64                496 ns          493 ns      1414966 sizeof=88
BM_fixed_enum_map_construct_IntEnumMap/128               992 ns          985 ns       704682 sizeof=88
BM_fixed_enum_map_construct_SmallIntEnumMap/8           4.26 ns         4.23 ns    161605338 size=0 sizeof=32
BM_fixed_enum_map_construct_SmallIntEnumMap/16          8.44 ns         8.39 ns     82856699 size=0 sizeof=32
BM_fixed_enum_map_construct_SmallIntEnumMap/32          25.8 ns         25.7 ns     27258328 size=0 sizeof=32
BM_fixed_enum_map_construct_SmallIntEnumMap/64          42.9 ns         42.6 ns     16904863 size=0 sizeof=32
BM_fixed_enum_map_construct_SmallIntEnumMap/128         75.3 ns         74.8 ns      9054899 size=0 sizeof=32

With USE_BIT_SET_IN_ENUM_MAP (bit_set)

BM_fixed_enum_map_construct_IntEnumMap/8                10.7 ns         10.7 ns     65530202 sizeof=72
BM_fixed_enum_map_construct_IntEnumMap/16               21.3 ns         21.3 ns     32515552 sizeof=72
BM_fixed_enum_map_construct_IntEnumMap/32               46.0 ns         45.9 ns     15207205 sizeof=72
BM_fixed_enum_map_construct_IntEnumMap/64               88.0 ns         87.8 ns      7924091 sizeof=72
BM_fixed_enum_map_construct_IntEnumMap/128               174 ns          174 ns      4047062 sizeof=72
BM_fixed_enum_map_construct_SmallIntEnumMap/8           4.29 ns         4.26 ns    166029405 size=0 sizeof=24
BM_fixed_enum_map_construct_SmallIntEnumMap/16          8.43 ns         8.39 ns     83284210 size=0 sizeof=24
BM_fixed_enum_map_construct_SmallIntEnumMap/32          24.9 ns         24.8 ns     28038435 size=0 sizeof=24
BM_fixed_enum_map_construct_SmallIntEnumMap/64          41.3 ns         41.1 ns     16950446 size=0 sizeof=24
BM_fixed_enum_map_construct_SmallIntEnumMap/128         75.4 ns         75.1 ns      9395719 size=0 sizeof=24

_I guess the more keys an Enum has, the larger the std::array gets ... with bit_set it's more limited._

alexkaratarakis commented 1 month ago

The reason std::array<bool, N> was originally used was due to constexpr-ness that std::bitset lacked/lacks. Later the requirement of StructuralType was also added.

Overall it would be better to use bits instead of bytes if it meets all the "fixed_container" requirements. Your fork seems to have the necessary requirements on bit_set. It seems quite a bit different from the original version though, perhaps forked long ago?

Performance seems to be an on-par for access, and construction favors the smaller one. I wouldn't be surprised if some things are slower given the access patterns, but the footprint is for sure much smaller (at least for EnumSet). I was wondering, is space the limiting factor in your application? EnumSet is cheap given that it doesn't even store the enum keys, and EnumMap is the same and is probably dominated by the values.

Are there any cons in always using bit_set in EnumSet?

abeimler commented 1 month ago

The reason std::array<bool, N> was originally used was due to constexpr-ness that std::bitset lacked/lacks.

I figure that's why I start using xstd::bit_set, but switched later to EnumSet (more speaking indexes and stronger typing etc.)

It seems quite a bit different from the original version though, perhaps forked long ago?

Yea I forked it years ago and update it from time to time, but some C++23 features (or other things) broke my build, so I rolled back. (older versions works fine for me)

Are there any cons in always using bit_set in EnumSet?

Right now it works for me, but the most operations I do are insert, erase and contains.
The other thing is serialization, it can break, depending on how the bits/bools are stored. _(It also can have a smaller footprint storing uint32_t bitset vs. array of bools)_ But for backwards-compatibility I just store the size, for-each the EnumSet and store the values.


I was experimenting with sizes and alignment to keep my structs as small as possible and feel like EnumSet (and EnumMap ?) gets a bit bloated.

For example... Without bit_set:

struct MyGameComponent {
    EnumSet<Status> statuses;            // Size: 48, Align: 8 (38 enums)
    EnumSet<CanDoAction> can_do_actions; // Size: 32, Align: 8 (18 enums)
}

struct BattleStats {
  EnumMap<Attributes, int32_t> base_stats; // Size: 48 (7 enums)
};

With bit_set

struct MyGameComponent {
    EnumSet<Status> statuses;            // Size: 16, Align: 8 (38 enums)
    EnumSet<CanDoAction> can_do_actions; // Size: 16, Align: 8 (18 enums)
}

struct BattleStats {
  EnumMap<Attributes, int32_t> base_stats; // Size: 40 (7 enums)
};

(size is also always the same, which can be nice.)

abeimler commented 1 month ago

In my fork, I used the MACRO USE_BIT_SET_FOR_ENUM_SET to enable the bit_set array_sets, maybe split the different impl. and types into it's own classes EnumSet, BitsetEnumSet, EnumMap, EnumMapWithBitsetKeys, ... (or something like that). or use template arguments to setup the option template<..., EnumSetOption::UseBitsetForArraySet> ...

abeimler commented 1 month ago

IMHO adding a (constexpr-able & fixed) bit_set in general can be a nice addition for this library and CAN be re-used in EnumSet.