CesiumGS / cesium-unreal

Bringing the 3D geospatial ecosystem to Unreal Engine
https://cesium.com/platform/cesium-for-unreal/
Apache License 2.0
921 stars 295 forks source link

Use C++20 and swl::variant instead of std::variant. #1455

Closed kring closed 3 months ago

kring commented 3 months ago

In my limited tests this drastically improves compile times.

So far cesium-native is still using C++17 and std::variant. I've only moved cesium-unreal to C++20 (it was already there for UE5.4, but now 5.2 and 5.3 build with C++20 too). And because swl::variant requires C++20, I've only moved the variants in the Unreal-specific code to swl::variant. But even with that limited change, Linux compile times on my system were cut by 35%.

This is a PR into #1447 which I'll probably merge myself if CI is happy on all platforms.

kring commented 3 months ago

Welp.... the performance on Linux is excellent. The complete plugin build on Linux went from 28 minutes 19 seconds down to 20 minutes 8 seconds. So it cut off a third of the compile time, and I'm pretty confident that would improve further if the large variants in cesium-native moved to swl, too.

Unfortunately, it doesn't really work on any of our other platforms. 😆 I'm not sure how fixable that is. It's a bit strange because every platform except Windows uses Clang. Apple and Android use v14 while Linux uses v15, so maybe that's the problem.

kring commented 3 months ago

The size of the Linux binaries in this branch are also significantly smaller. Total ZIP size is down from 188 MB to 167 MB. Uncompressed size goes down from 903 MB to 633 MB.

csciguy8 commented 3 months ago

The complete plugin build on Linux went from 28 minutes 19 seconds down to 20 minutes 8 seconds. The size of the Linux binaries in this branch are also significantly smaller. Total ZIP size is down from 188 MB to 167 MB. Uncompressed size goes down from 903 MB to 633 MB.

Nice. Worth defining variant differently per platform to work around the problems?

kring commented 3 months ago

Worth defining variant differently per platform to work around the problems?

If we can get it working on macOS, then yes, I think it's worth it. MSVC's std::variant implementation seems to be less terrible (or perhaps the compiler is better at dealing with the terrible), so the compile times and binary bloat aren't as bad there. But if we can get gains on macOS like we did on Linux, it'll shave close to 40 minutes off the CI time for that platform.

I looked into it a little more, and the immediate problem seems to be bugs the Clang 14's C++20 implementation such that the swl::variant can't hold instances with non-trivial destructors. The only thing in a variant with a non-trivial destructor currently (AFAICT) is PropertyArrayView, because it has the dual owning (std::vector) / non-owning (gsl::span) representation. std::vector of course has a non-trivial destructor. It might be possible to make the owning variant of PropertyArrayView a separate type that is not stored in a swl::variant. Maybe. That's what I'm trying, anyway.

jherico commented 3 months ago

Possibly caused by this: https://stackoverflow.com/a/70997041/85306 and also you might be able to work around it with a define of SWL_VARIANT_NO_CONSTEXPR_EMPLACE

kring commented 3 months ago

With some hackery I got it building on Android and macOS.

The plugin build for Android / UE 5.2 went from 24 min 58 s down to 11 min 3 s. And the zipped package went from 168 MB down to 108 MB.

The plugin build for macOS / UE 5.2 went from 1 hour 6 min 57 s down to 17 min 22s! I think on macOS we were previously hitting swap pretty hard, that's why the improvement is so much more drastic. Also the zipped macOS package went from 213 MB down to 148 MB.

My hacks are too dodgy to ship, but I think this proves that spending some more time on this will pay off big time in terms of reducing CI time and therefore improving our productivity.

kring commented 3 months ago

Looks like it makes a huge difference on Windows, too. It goes from 39 min 17 s down to 12 min 23 s.

kring commented 3 months ago

I'm really happy with this and am going to merge this into the vcpkg. Here are the overall numbers:

Build Main Time This Branch Time Speedup Main Size (MB) This Branch Size Size Change
Windows 5.2 39m 34s 13m 6s 202% 194 284 +46%
Windows 5.3 44m 14s 14m 33s 204% 191 291 +52%
Windows 5.4 50m 11s 18m 36s 170% 205 304 +48%
Linux 5.2 26m 10s 17m 39s 48% 182 120 -34%
Linux 5.3 27m 24s 18m 15s 50% 181 120 -34%
Linux 5.4 33m 9s 20m 53s 59% 181 122 -34%
Android 5.2 23m 28s 12m 33s 87% 127 111 -13%
Android 5.3 22m 31s 11m 38s 94% 127 112 -12%
Android 5.4 44m 11s 23m 33s 88% 235 170 -28%
macOS 5.2 56m 41s N/A N/A 204 N/A N /A
macOS 5.3 1h 32m 42s N/A N/A 397 N/A N /A
macOS 5.4 1h 43m 55s N/A N/A 413 N/A N /A
iOS 5.2 37m 55s N/A N/A 185 N/A N /A
iOS 5.3 38m 6s N/A N/A 191 N/A N /A
iOS 5.4 48m 9s N/A N/A 199 N/A N /A
macOS+iOS 5.2 N/A 43m 33s 117% N/A 383 -2%
macOS+iOS 5.3 N/A 51m 6s 156% N/A 387 -34%
macOS+iOS 5.4 N/A 58m 36s 159% N/A 430 -30%

What exactly is this table comparing?

The numbers for this branch are from this build: https://github.com/CesiumGS/cesium-unreal/actions/runs/9637521085

The numbers for main are from this build: https://github.com/CesiumGS/cesium-unreal/actions/runs/9599979122

In all cases, the times are to build the plugin only. Sizes are the size of the distribution ZIP produced by CI for that platform.

Why did the Windows distributions get bigger in this branch?

Two reasons:

  1. This branch is using vcpkg, which means we're including more libraries for cesium-native and its dependencies than we were in main. For example, we include all of the Abseil libraires instead of just the tiny bit of it that s2geometry actually needs.
  2. Binary size doesn't seem to explode with our previous variant implementation, the way it does with Clang.

So while this branch is an improvement even for Windows when compared against vcpkg without the improvements in this branch, the binaries are still larger than they used to be overall.

What's going on with the macOS and iOS builds?

In main, we have separate CI jobs for macOS and iOS. In this branch, they've been combined together. Combining them is a pretty big win for us because GitHub limits us to 5 simulteneous Apple builds. If you use 6 (three UE versions times macOS + iOS), then they are necessarily serialized. And when they're serialized, we end up paying the substantial "install UE" cost twice for zero benefit.

The percentage improvement columns for the macOS+iOS rows have been computed based on the sum of the separate plugin builds in main. These improvements do not count the speedup we get from eliminating the extra UE install (so the actually improvement is higher).

Why is the Android 5.4 build bigger than the other two?

It includes both ARM64 and x86-64 binaries for the plugin, because Unreal changed a default. This is true both in main and in this branch. However, it currently does not include cesium-native binaries, so attempting to package for Android/x86-64 will fail.

Why are the improvements for UE 5.2 on macOS smaller than for the other two UE versions?

By default, UE 5.2 builds only for the host's processor on macOS, which in our case was x86-64. In 5.3 and 5.4, Epic changed the default to build for both platforms regardless of host architecture. I've changed this branch to explicitly build for both x86-64 and ARM64 even in UE 5.2, so it's actually running faster and producing smaller binaries than main while also supporting an additional processor architecture that it wasn't before.