veselink1 / refl-cpp

Static reflection for C++17 (compile-time enumeration, attributes, proxies, overloads, template functions, metaprogramming).
https://veselink1.github.io/refl-cpp/md__introduction.html
MIT License

Faster properties #60

Closed veselink1 closed 2 years ago

veselink1 commented 2 years ago

This PR aims to fix the slow compilation uncovered in #58.

An optimisation is added to the property-related utilities. It relies on the assumption that getters and setters are reflected (via the REFL_AUTO or REFL_FUNC macros) one after the other. When looking up the counterpart of a property, the neighbouring members are checked first, and a linear search is used only if neither neighbour matches.

Applies to:

REFL_AUTO(
    type(Point),
    func(get_x, property()),
    func(set_x, property()),
    func(get_y, property()),
    func(set_y, property())
)

But does NOT apply to:

REFL_AUTO(
    type(Point),
    func(get_x, property()),
    func(get_y, property()),
    func(set_x, property()),
    func(set_y, property())
)
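
For illustration, the neighbour-first lookup can be sketched roughly as follows. This is a simplified, self-contained analogue and not the code in this PR: `Member`, `is_counterpart`, `find_counterpart` and `npos` are hypothetical names, and a plain `std::array` stands in for refl-cpp's compile-time member list.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a reflected member (not refl-cpp's descriptor type).
struct Member {
    std::string_view property; // property name, e.g. "x"
    bool is_setter;            // false = getter, true = setter
};

// Sentinel returned when no counterpart exists.
constexpr std::size_t npos = static_cast<std::size_t>(-1);

// True when a and b are the getter and the setter of the same property.
constexpr bool is_counterpart(const Member& a, const Member& b) {
    return a.property == b.property && a.is_setter != b.is_setter;
}

// Returns the index of the getter/setter matching members[i], or npos.
template <std::size_t N>
constexpr std::size_t find_counterpart(const std::array<Member, N>& members,
                                       std::size_t i) {
    assert(i < N); // precondition: i must be a valid index
    // Fast path: the counterpart is usually declared right next to members[i].
    if (i + 1 < N && is_counterpart(members[i], members[i + 1])) return i + 1;
    if (i > 0 && is_counterpart(members[i], members[i - 1])) return i - 1;
    // Slow path: fall back to a linear search over all members.
    for (std::size_t j = 0; j < N; ++j) {
        if (j != i && is_counterpart(members[i], members[j])) return j;
    }
    return npos;
}

// get_x, set_x, get_y, set_y: every lookup is resolved by the fast path.
constexpr std::array<Member, 4> adjacent{{
    {"x", false}, {"x", true}, {"y", false}, {"y", true}}};

// get_x, get_y, set_x, set_y: every lookup falls through to the linear search.
constexpr std::array<Member, 4> grouped{{
    {"x", false}, {"y", false}, {"x", true}, {"y", true}}};

static_assert(find_counterpart(adjacent, 0) == 1);
static_assert(find_counterpart(grouped, 0) == 2);
```

Both layouts give the correct answer. The ordering only decides whether the cheap neighbour check succeeds or the search is needed, which is why the second REFL_AUTO layout above does not benefit from the optimisation.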

I have added a bench/ tree, which will be used for benchmarks. Only one benchmark exists at the moment: bench-large-pod.cpp, which iterates over the members of a large POD with getters and setters and matches property getters to setters via get_reader/get_writer.

Results of compilation of bench-large-pod.cpp:

Without the optimisation:

Command being timed: "make large-pod"
User time (seconds): 41.69
System time (seconds): 1.01
Percent of CPU this job got: 99%
Maximum resident set size (kbytes): 3440424

With the optimisation:

Command being timed: "make large-pod"
User time (seconds): 10.38
System time (seconds): 0.42
Percent of CPU this job got: 99%
Maximum resident set size (kbytes): 1319484

veselink1 commented 2 years ago

With the latest changes, compilation times for bench-large-pod.cpp (100 properties) have improved as follows:

| Compiler (-O2) | Time (v0.12.1) | Time (faster-properties) | Memory (v0.12.1) | Memory (faster-properties) |
|---|---|---|---|---|
| gcc 9 | 39.44s | 4.51s | 3620956 KB | 567980 KB |
| clang 10 | 49.65s | 3.56s | 2489668 KB | 741472 KB |

Time to compile was reduced by at least 88% (gcc: 39.44s to 4.51s; clang improved by about 93%). Peak memory usage was reduced by about 84% for gcc and about 70% for clang.

bench-large-pod-search.cpp saw even more improvement. This variant of the above benchmark exercises the slower code path in get_reader/writer. With v0.12.1 it could not be compiled in my VM because the compiler ran out of memory (8 GiB RAM available). It now compiles, taking roughly two to three times as long as the regular version, which has getters and setters defined next to each other (the heuristic for that case was added in 3547e6e).

| Compiler (-O2) | Time (v0.12.1) | Time (faster-properties) | Memory (v0.12.1) | Memory (faster-properties) |
|---|---|---|---|---|
| gcc 9 | N/A | 8.51s | N/A | 848516 KB |
| clang 10 | N/A | 10.20s | N/A | 1293552 KB |

veselink1 commented 2 years ago

Closes #58.