getml / reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection. Supports JSON, BSON, CBOR, flexbuffers, msgpack, TOML, XML, YAML / msgpack.org[C++20]
https://getml.github.io/reflect-cpp/
MIT License

Support for Enum Flags #27

Closed liuzicheng1987 closed 8 months ago

liuzicheng1987 commented 8 months ago

The basic requirement is as follows. Suppose you had an enum like this:

enum class WindowsState: std::uint32_t
{
  maximised = std::uint32_t(1)<<0,
  borderless = std::uint32_t(1)<<1,
  topmost = std::uint32_t(1)<<2
};

People then expect to be able to do this:

rfl::json::write(WindowsState::borderless | WindowsState::topmost);

And get the following string:

"borderless|topmost"

Likewise, we should be able to read the string using rfl::json::read and reproduce the original value.
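
To make the expected round trip concrete, here is a minimal standalone sketch that does not use reflect-cpp at all. The helper names flags_to_string and flags_from_string and the hand-written table kFlags are assumptions made for illustration; the library would derive the names at compile time, and rfl::json::write would additionally wrap the result in JSON quotes. Note that a scoped enum has no built-in operator|, so the flag type has to provide one.

```cpp
#include <array>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

enum class WindowsState : std::uint32_t {
  maximised = std::uint32_t{1} << 0,
  borderless = std::uint32_t{1} << 1,
  topmost = std::uint32_t{1} << 2
};

// Scoped enums do not come with operator|, so we define it ourselves.
constexpr WindowsState operator|(WindowsState a, WindowsState b) {
  return static_cast<WindowsState>(static_cast<std::uint32_t>(a) |
                                   static_cast<std::uint32_t>(b));
}

// Hand-written name table (hypothetical; the library derives this itself).
constexpr std::array<std::pair<std::string_view, WindowsState>, 3> kFlags{{
    {"maximised", WindowsState::maximised},
    {"borderless", WindowsState::borderless},
    {"topmost", WindowsState::topmost}}};

// Joins the names of all set flags with '|'.
std::string flags_to_string(WindowsState value) {
  const auto bits = static_cast<std::uint32_t>(value);
  std::string out;
  for (const auto& [name, flag] : kFlags) {
    if (bits & static_cast<std::uint32_t>(flag)) {
      if (!out.empty()) out += '|';
      out += name;
    }
  }
  return out;
}

// Parses "a|b|c" back into a value; returns std::nullopt on an unknown name.
std::optional<WindowsState> flags_from_string(std::string_view text) {
  std::uint32_t bits = 0;
  while (!text.empty()) {
    const auto pos = text.find('|');
    const auto token = text.substr(0, pos);
    bool found = false;
    for (const auto& [name, flag] : kFlags) {
      if (name == token) {
        bits |= static_cast<std::uint32_t>(flag);
        found = true;
      }
    }
    if (!found) return std::nullopt;
    text = (pos == std::string_view::npos) ? std::string_view{}
                                           : text.substr(pos + 1);
  }
  return static_cast<WindowsState>(bits);
}
```

Writing WindowsState::borderless | WindowsState::topmost with this sketch yields borderless|topmost, and parsing that string gives back the same value.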

There is agreement that it is reasonable to restrict flag enums to values that are powers of 2, i.e. 1, 2, 4, 8, ... (some restriction is necessary for technical reasons that are explained below). Enumerators that combine several bits are not treated as flags of their own.

For example:

enum class WindowsState: std::uint32_t
{
  maximised = std::uint32_t(1)<<0,
  borderless = std::uint32_t(1)<<1,
  topmost = std::uint32_t(1)<<2,
  borderless_and_topmost = (std::uint32_t(1)<<1) | (std::uint32_t(1)<<2)
};

rfl::json::write(WindowsState::borderless_and_topmost);

This would also result in the following string:

"borderless|topmost"

And not:

"borderless_and_topmost"

People are OK with that.
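
The reason the combined enumerator cannot keep its own name is that, at the value level, it is indistinguishable from the OR of its parts. The sketch below illustrates this; the helpers bits and is_single_flag are hypothetical names introduced here, not part of the library.

```cpp
#include <cstdint>

enum class WindowsState : std::uint32_t {
  maximised = std::uint32_t{1} << 0,
  borderless = std::uint32_t{1} << 1,
  topmost = std::uint32_t{1} << 2,
  borderless_and_topmost = (std::uint32_t{1} << 1) | (std::uint32_t{1} << 2)
};

// Underlying bit pattern of a value.
constexpr std::uint32_t bits(WindowsState s) {
  return static_cast<std::uint32_t>(s);
}

// True if exactly one bit is set, i.e. the value is a power of 2.
constexpr bool is_single_flag(WindowsState s) {
  const auto v = bits(s);
  return v != 0 && (v & (v - 1)) == 0;
}

// The named combination and the OR of its parts are the same value,
// so a serializer that only sees the value cannot tell them apart.
static_assert(bits(WindowsState::borderless_and_topmost) ==
              (bits(WindowsState::borderless) | bits(WindowsState::topmost)));
```

Under the power-of-2 restriction, only enumerators for which is_single_flag holds would get a name of their own.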

Open question

The remaining question is how to communicate to the compiler that something needs to be treated as a flag.

Possibility 1 is an explicit opt-in. The user would have to place something like this somewhere in their code:

template <> struct rfl::is_flag_enum<WindowsState>: std::true_type{};

Possibility 2 is not to force the user to do anything and just have the library try both options.

magic_enum appears to be using possibility 1:

https://github.com/Neargye/magic_enum/blob/master/include/magic_enum/magic_enum.hpp

Much like the ideas proposed here, they restrict themselves to powers of 2.
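
For illustration, an opt-in trait along the lines of possibility 1 could look like the sketch below. Everything here lives in a made-up namespace sketch and is an assumption about the shape of such a trait, not the interface reflect-cpp ended up with; Color and treated_as_flags are likewise hypothetical.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not rfl

// Primary template: an enum is not a flag enum unless it opts in.
template <class E>
struct is_flag_enum : std::false_type {};

template <class E>
inline constexpr bool is_flag_enum_v = is_flag_enum<E>::value;

}  // namespace sketch

enum class WindowsState : std::uint32_t {
  maximised = std::uint32_t{1} << 0,
  borderless = std::uint32_t{1} << 1,
  topmost = std::uint32_t{1} << 2
};

enum class Color { red, green, blue };

// The opt-in: one explicit specialisation per flag enum.
template <>
struct sketch::is_flag_enum<WindowsState> : std::true_type {};

// A serializer could branch on the trait at compile time.
template <class E>
constexpr bool treated_as_flags() {
  if constexpr (sketch::is_flag_enum_v<E>) {
    return true;   // iterate 1, 2, 4, ... and join names with '|'
  } else {
    return false;  // iterate 0, 1, 2, ... and write a single name
  }
}
```

The cost of this design is one extra line per flag enum; the benefit is that the library never has to guess which iteration scheme to use.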

Implementation details

In order to implement this, you would have to understand how serializing enums works in the first place.

The most important file is this one:

https://github.com/getml/reflect-cpp/blob/main/include/rfl/internal/enums/get_enum_names.hpp

You would also have to understand what rfl::Literal is:

https://github.com/getml/reflect-cpp/blob/main/docs/literals.md

There are two problems we have to solve:

1) Given an enum MyEnum{ option1, option2, ...}, how do we figure out how many and which options there are?

2) Given an enum value MyEnum::option1, how do we get the name "option1" as a string?

Problem 1 is solved by brute-force iteration, which is what happens in get_enum_names. If the underlying type of the enum is fixed (as it is for all scoped enums), then static_cast<MyEnum>(some_integer) is well-defined for any some_integer that fits into the underlying type. If some_integer matches option1 in the enum, then static_cast<MyEnum>(some_integer) is equivalent to MyEnum::option1. This brute-force iteration takes place at compile time, which is the main reason there needs to be some kind of limit on what the enum values can be.

Basically it works like this: We iterate through the integers at compile time and get the string representation of static_cast<MyEnum>(i). Based on that string representation, we can decide whether this is a proper enum option or not. If it is, it is added to the rfl::Literal and to the std::array that contains the enum values.

This is what get_enum_names does.

Problem 2 is solved in get_enum_name_str_view, which returns a std::string_view of the enum name. This works by passing the enum value as a template parameter to a function and calling std::source_location::current().function_name() inside it. The enum value then shows up in the returned function name, and all we have to do is extract it from there.

get_enum_name_str_lit just transforms the string view into our rfl::internal::StringLiteral, which we need to pass through rfl::Literal.

All we have to do is rewrite get_enum_names() so that it doesn't iterate through 0, 1, 2, 3, ... but through 1, 2, 4, 8, 16, ... instead. The range should be determined by the bit size of the underlying fixed type.
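
As a rough sketch of what "determined by the bit size" could mean, the candidates for a flag enum can be generated from the width of the underlying type. The function flag_candidates is a hypothetical helper, not part of the library; it assumes the underlying type is an integer type other than bool.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

enum class WindowsState : std::uint32_t {
  maximised = std::uint32_t{1} << 0,
  borderless = std::uint32_t{1} << 1,
  topmost = std::uint32_t{1} << 2
};

// Produces 1, 2, 4, ..., one candidate per bit of the underlying type.
template <class E>
constexpr auto flag_candidates() {
  using U = std::make_unsigned_t<std::underlying_type_t<E>>;
  constexpr std::size_t n = std::numeric_limits<U>::digits;
  std::array<U, n> out{};
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<U>(U{1} << i);
  }
  return out;
}
```

For a 32-bit underlying type this gives 32 candidates instead of an open-ended range of integers, so the compile-time search stays small no matter how large the flag values are.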

How ChatGPT explains the code

Here is how ChatGPT explains the code in https://github.com/getml/reflect-cpp/blob/main/include/rfl/internal/enums/get_enum_names.hpp:

This code appears to be a part of a C++ library (RFL) focused on handling reflection for enums. Let's break it down:

The key points in the code involve compile-time reflection for enums. It iterates through possible enum values (_i) up to a maximum value. For each value, it tries to get the name using get_enum_name<static_cast<EnumType>(_i)>(). If the name starts with '(', it's considered an empty string and iteration continues; otherwise, it creates a new array of enum values and names.

The conditions in get_enum_names() are ensuring that the enum type is a scoped enum (using class or struct) and that its underlying type is integral. If these conditions aren't met, it will trigger static assertions.

The code essentially aims to create a compile-time list of enum values and their corresponding names, assuming the enums follow certain conventions (scoped, integral underlying type). This can be useful for various purposes like serialization, logging, or any situation where you'd need enum names alongside their values.
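
The static assertions mentioned above can be approximated as follows. This is a sketch under the assumption that "scoped" is detected by the absence of an implicit conversion to the underlying type; std::is_scoped_enum only arrives with C++23, so a C++20 code base has to emulate it. The name is_scoped_enum_sketch and the example enums are made up.

```cpp
#include <type_traits>

// True for enum class / enum struct types, false for everything else.
template <class E>
constexpr bool is_scoped_enum_sketch() {
  if constexpr (std::is_enum_v<E>) {
    // Unscoped enums convert implicitly to their underlying type,
    // scoped enums do not.
    return !std::is_convertible_v<E, std::underlying_type_t<E>>;
  } else {
    return false;
  }
}

enum class Scoped { a, b };
enum Unscoped { c, d };

// The kind of check a function like get_enum_names could perform up front.
template <class E>
constexpr bool check_enum_requirements() {
  static_assert(is_scoped_enum_sketch<E>(), "must be a scoped enum");
  static_assert(std::is_integral_v<std::underlying_type_t<E>>,
                "underlying type must be integral");
  return true;
}
```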

liuzicheng1987 commented 8 months ago

This issue has been resolved with v0.2.0.