Open mlauss2 opened 1 month ago
Seems useful. This mostly touches the reverse-engineered code (including the Jedi allocator). I went through the changes, but might need to patch and test it for a bit to be sure. :)
I did build and run test it on windows (together with the other PRs), and it does run fine there as well. I don't know whether MSVC supports lto though :)
I did build and run test it on windows (together with the other PRs), and it does run fine there as well. I don't know whether MSVC supports lto though :)
I'm not too worried about that, as long as the code compiles and runs correctly (and with the same performance). After some additional testing, I will probably merge it in after the next release (sometime this week).
While building with -flto and gcc on Linux does succeed, the resulting binary crashes at seemingly random places. This set of 3 patches removes all warnings emitted by GCC during the LTO run, and the resulting binary actually works fine as well (played a few levels, mods, saved/loaded savegames from both the lto-patched and vanilla binary, no issues whatsoever).
Comparison of binary size (-O3 -march=znver4):
Size reduction of 1.54MiB or ~17,5%
The thing that LTO was most confused by is apparently the wraparound-to-nullptr design in the TFE_Jedi Allocator, I changed it a bit to use regular nullptrs. The rest is namespace cleanups to get rid of C++ ODR violations which only appear with lto enabled, and nullpointer checks to call allocator_newItem() callsites (which fixes an lto-only allocator pointer corruption encountered with dark tide 1 mod).
Clang LTO also works, but generates a 10% bigger binary (compared to non-lto build).
What do you think?