Add support for double precision floats

aaronfranke commented 4 years ago

This proposal is a summary and formalization of various past discussions about double support, especially issue #288 which this proposal directly supersedes.

Describe the project you are working on:

This proposal affects any game working with large scale environments in 3D, meaning, any environment larger than a few kilometers. This proposal is especially important for games taking place in the vastness of space. The problem technically also exists in 2D, but it is far less of an issue.

Describe the problem or limitation you are having in your project:

Any 3D game in Godot with large scale environments will begin to experience jitter once the player moves more than a few kilometers away from the world origin. The problem is most noticeable in FPS games, since objects tend to be close to the camera, and jitter is more clearly visible. This is caused by the limitations of single-precision floats. There are some workarounds for some use cases, but the only proper fix is one that is done on the engine level.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:

The core issue is that single-precision floating point numbers have a limited amount of precision, which is unsuitable for games that use large scales. Single-precision floats have 23 significant binary digits (they are 32-bit: 8 of the bits are used for the exponent and 1 bit is used for the sign). First-person shooter games depend on the world having better than about half a millimeter of precision. The formula 0.0005 * (2^23) ≈ 4194 shows us that errors big enough to notice appear roughly 4 kilometers away from the world origin.
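As a rough illustration of that rule of thumb, here is a minimal C++ sketch. The function name `max_usable_distance` is made up for this example and is not part of Godot.

```cpp
#include <cmath>

// Rule of thumb from the paragraph above: the distance (in meters) at which
// a float with `mantissa_bits` stored significand bits can no longer resolve
// steps of `precision_m` meters is roughly precision_m * 2^mantissa_bits.
// Hypothetical helper, for illustration only.
double max_usable_distance(double precision_m, int mantissa_bits) {
    // std::ldexp(x, n) computes x * 2^n.
    return std::ldexp(precision_m, mantissa_bits);
}
```

Calling `max_usable_distance(0.0005, 23)` gives about 4194.3, i.e. a little over 4 km for single precision.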

The solution, simply put, requires us to add more significant digits. Double-precision floats are 64-bit, with 52 of those bits being significant binary digits. This is 29 more significant binary digits than single-precision floats, which increases the maximum usable distance by a factor of 2^29, or about half a billion, to about 2 Tm (2 billion km). We go from a fifth the length of Manhattan to a distance greater than the orbital radius of Saturn, more than enough for 99.99% of games. (Of course, you don't have to use all of that space to see a benefit; any game larger than a few kilometers will benefit from doubles.)
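To make the difference concrete, the sketch below measures the gap between neighbouring representable values at a given distance from the origin. `spacing_at` is a hypothetical helper name; it only relies on the standard `std::nextafter`.

```cpp
#include <cmath>
#include <limits>

// Gap between `x` and the next representable value above it, i.e. the
// smallest step a coordinate stored in type T can take at that distance.
// Hypothetical helper, for illustration only.
template <typename T>
T spacing_at(T x) {
    return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}
```

At 4096 m, `spacing_at(4096.0f)` is 2^-11 m (about 0.49 mm), while `spacing_at(4096.0)` is 2^-40 m (about 9.1e-13 m). The ratio between the two is the 2^29 factor mentioned above.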

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:

For now, the plan is for this to be a completely optional feature which is not enabled by default, to maintain high performance on older devices. Anyone who needs double support can compile their own version of the engine from source. The rest of this section describes the details of how this will work.

C++ has a keyword called typedef that allows aliasing of types. Godot already uses this for the real_t type used for vectors and many parts of the engine. Eventually, users will be able to compile the engine with real_t being aliased to double, which means that all vector math is done with doubles, including transformations and all physics code. Pull request #21922 is a stepping stone towards double support, fixing many of the issues that currently exist when trying to compile with doubles.
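A minimal sketch of the aliasing idea follows. The define name `REAL_T_IS_DOUBLE` is the one mentioned later in this thread; the `Vec3` struct and `dot` function are simplified stand-ins for illustration, not Godot's actual vector types.

```cpp
// One alias selects the precision used by all math types. The define is
// assumed to be supplied by the build system when doubles are requested.
#ifdef REAL_T_IS_DOUBLE
typedef double real_t;
#else
typedef float real_t;
#endif

// Hypothetical vector type for illustration; not Godot's Vector3.
struct Vec3 {
    real_t x, y, z;
};

// Code written against real_t picks up the wider type automatically when
// the alias changes, without touching the function body.
inline real_t dot(const Vec3 &a, const Vec3 &b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

The point of the design is that math code never names `float` or `double` directly, so switching precision is a compile-time decision made in one place.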

On the CPU, for the most part, doubles are as fast as single-precision floats. The legacy x87 FPU on x86 does not have circuits for single-precision floats; it elevates all floating-point types to an 80-bit extended precision format internally and narrows the result afterwards. Doubles take up twice the amount of memory, which can be an issue for architectures not optimized for moving around pieces of 64-bit data (such as 32-bit architectures), but otherwise the total memory usage of the engine does not change very much. There is also the matter of SIMD vector instructions, designed to perform math in parallel. Godot does not currently use these, but if it did, full acceleration would require AVX2 (256-bit for 4 * 64-bit), which means Intel CPUs from 2013 or later, and AMD CPUs from 2015 or later. A Windows 11 compatible x86 CPU will have AVX2.

It's important to note that doubles cannot be used on the graphics card. Due to Nvidia intentionally crippling support for doubles on non-Quadro graphics cards, rendering has to be done with single-precision floats. The approach used by all games that use doubles is to do all of the CPU-side math in doubles, then take all coordinates and convert them to be relative to the camera, then pass this information to the GPU. The exact details of this will be left to @reduz to deal with.
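The general idea of converting to camera-relative coordinates before narrowing can be sketched as follows. This is only an illustration under simple assumptions (positions only, no rotation); the struct and function names are hypothetical and say nothing about how Godot's renderer will actually do it.

```cpp
struct Vec3d { double x, y, z; };  // hypothetical CPU-side position
struct Vec3f { float x, y, z; };   // hypothetical GPU-side position

// Subtract in double first, then narrow to float. The result is small near
// the camera, so the float keeps its precision where it matters.
inline Vec3f to_camera_relative(const Vec3d &world, const Vec3d &camera) {
    return Vec3f{
        static_cast<float>(world.x - camera.x),
        static_cast<float>(world.y - camera.y),
        static_cast<float>(world.z - camera.z)};
}

// For contrast: narrowing first and subtracting afterwards throws away the
// fine detail before the subtraction can recover it.
inline Vec3f naive_camera_relative(const Vec3d &world, const Vec3d &camera) {
    return Vec3f{
        static_cast<float>(world.x) - static_cast<float>(camera.x),
        static_cast<float>(world.y) - static_cast<float>(camera.y),
        static_cast<float>(world.z) - static_cast<float>(camera.z)};
}
```

With an object 0.25 m in front of a camera sitting 1,000,000 km from the origin, the first function returns 0.25 while the naive version returns 0, because both positions round to the same float.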

If this enhancement will not be used often, can it be worked around with a few lines of script?:

No, it cannot be worked around in a few lines of script. However, let's explore what could be done.

A fair question to ask is how other games handle large scales.

Most games don't. It's true that this feature is only truly needed for a small number of games, as the majority of games take place on scales smaller than a few kilometers. For games that need somewhat large scales, maps are sometimes designed around this constraint, to be square and approximately 4 kilometers in radius, such as PlanetSide 2's Indar map.

Kerbal Space Program (KSP) is a game created in Unity, which (like most engines) uses single-precision floats. The developers of KSP had to implement their own math types, doing a huge amount of calculations in user code. Even with all their effort, KSP struggled with floating-point issues for many years, and these issues came to be known as The Kraken. The ideal solution is for the engine to have first-class support.

A commonly cited technique is origin shifting. This involves moving the world around the player such that the player is always near the world origin. This technique can work, but it comes with many of its own limitations. For example, it doesn't always work for multiplayer, where the server needs to have precision for all players at once. There are many tricks to make this work better, but this heavily complicates things to the point that it's both easier and more efficient to use doubles.

Some games that use doubles for large scales include Star Citizen, Arma 3, Space Engineers, and Minecraft. Star Citizen uses a custom version of Amazon Lumberyard with double support added. Arma 3 uses its developer's own in-house engine, called "Real Virtuality", which uses doubles. Neither Space Engineers nor Minecraft uses an engine. Minecraft in its early days also (incorrectly) truncated the coordinates, which led to issues such as the jittering seen in Far Lands or Bust (explained here).

Unreal added support for doubles with the release of Unreal 5 to support planetary-scale games. There's also Unigine, which is focused on being an engine for simulations, and Unigine can use doubles.
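For reference, the origin shifting workaround mentioned above can be sketched roughly like this. The `World` struct, the function name, and the threshold parameter are all hypothetical; a real game would also have to shift physics state, particles, and anything else that stores world positions, which is where the complexity comes from.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical world container: the player plus every other object.
struct World {
    Vec3 player;
    std::vector<Vec3> objects;
};

// If the player has strayed farther than `threshold` meters from the origin,
// subtract the player's position from everything so the player ends up at
// the origin again. Returns true if a shift happened.
bool shift_origin_if_needed(World &world, float threshold) {
    const Vec3 p = world.player;
    const float dist = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    if (dist <= threshold) {
        return false;
    }
    for (Vec3 &o : world.objects) {
        o.x -= p.x;
        o.y -= p.y;
        o.z -= p.z;
    }
    world.player = Vec3{0.0f, 0.0f, 0.0f};
    return true;
}
```

Note that this only helps a single observer: with several players far apart, there is no one origin that keeps all of them precise, which is the multiplayer limitation described above.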

Is there a reason why this should be core and not an add-on in the asset library?:

This is by nature a core engine feature, and it cannot be an add-on. However, if anyone wishes to take the limited KSP approach, I do have this repo with some math types for C#.

fire commented 2 years ago

We should add or link to a doc on how to build Godot with doubles.

Sounds like a junior but important job, and also something that lets us see things with a fresh eye.

Gnollrunner commented 2 years ago

I seem to have been following this thread and forgot about it. Since it just popped up, here is my two cents. First, I've never even used a game engine, so I probably don't know all the ins and outs of this. However, I have programmed in C++ with DirectX using double to do planet-sized stuff. I would rather use a game engine, but there is nothing free-ish that really supports this, which is a bit surprising to me.

I think it should be very doable. I do a few things in my code which seem to work pretty well. First, go straight to view space. Since you don't have double on the GPU (or not to any great degree), I think this works best. Once you go to world space with float, you lose all your precision. If you go directly to view space, you only lose it away from the camera, where it doesn't matter. I generate World-View matrices in double on the CPU and then truncate to float before sending them down to the GPU.
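One way to read the approach described above is sketched below: compose the world and view matrices in double, and only narrow the combined matrix to float at the end. This is an assumption about the general technique, not the commenter's actual code; the type aliases, the row-major layout, and the function names are chosen for this example.

```cpp
#include <array>
#include <cstddef>

using Mat4d = std::array<double, 16>;  // row-major, column-vector convention
using Mat4f = std::array<float, 16>;

// Plain 4x4 matrix product, computed entirely in double.
Mat4d multiply(const Mat4d &a, const Mat4d &b) {
    Mat4d r{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k) {
                sum += a[i * 4 + k] * b[k * 4 + j];
            }
            r[i * 4 + j] = sum;
        }
    }
    return r;
}

// Translation matrix with the offset in the last column.
Mat4d translation(double x, double y, double z) {
    return Mat4d{1.0, 0.0, 0.0, x,
                 0.0, 1.0, 0.0, y,
                 0.0, 0.0, 1.0, z,
                 0.0, 0.0, 0.0, 1.0};
}

// Build the world-view matrix in double, then narrow once for the GPU.
Mat4f world_view_for_gpu(const Mat4d &view, const Mat4d &world) {
    const Mat4d wv = multiply(view, world);
    Mat4f out{};
    for (std::size_t i = 0; i < wv.size(); ++i) {
        out[i] = static_cast<float>(wv[i]);
    }
    return out;
}
```

Because the large world and camera translations cancel inside the double product, the float matrix only ever contains the small camera-relative offset.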

Second you of course need a very good LOD system. However I suppose this might be considered part of the application instead of the engine. In my code I'm using voxels and marching cubes with LOD transitions, which happens at run time, so it takes care of that, at least for terrain, but this is kind of specific. In any case you need to do something to prevent Z fighting.

Finally I take the projection stuff out of the matrix and do that in a post step. I found not doing this causes major instability. I basically scale everything down by a large power of 2 to get Z inside the box required by DirectX. The idea here is that it will only change the exponent bits on Z and not the precision bits going from view space to projection space. This seemed to fix a few issues.
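The floating-point property this relies on can be checked with a small sketch. It does not reproduce the projection step itself; it only shows that scaling by a power of two leaves the significand untouched, as long as the result neither overflows nor becomes subnormal. The function name is hypothetical.

```cpp
#include <cmath>

// Returns true if scaling `z` by 2^exponent_shift leaves the significand
// unchanged and only moves the exponent by `exponent_shift`.
// Hypothetical helper, for illustration only.
bool scaling_preserves_significand(float z, int exponent_shift) {
    int exp_before = 0;
    int exp_after = 0;
    // std::frexp splits a value into a significand in [0.5, 1) and an exponent.
    const float sig_before = std::frexp(z, &exp_before);
    const float scaled = std::ldexp(z, exponent_shift);  // z * 2^exponent_shift
    const float sig_after = std::frexp(scaled, &exp_after);
    return sig_before == sig_after && exp_after == exp_before + exponent_shift;
}
```

Scaling by a factor that is not a power of two (for example 0.001) generally has to round, which is one plausible reason the power-of-two version behaves better.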

These are just some ideas. Maybe this is all obvious but I thought I'd throw it out there since it works OK. So anyway if you guys decide to do this, I'll probably switch to Godot and hopefully contribute something.

albinaask commented 2 years ago

@Gnollrunner I think your projection trick may also be solved by the use of logarithmic depth buffers

fire commented 2 years ago

For adjacent work on voxels and level streaming, Zylann's work is notable: https://github.com/Zylann/godot_voxel/tree/godot4 https://github.com/Zylann/solar_system_demo

Here's Zylann's Solar System Demo.

isral commented 2 years ago

@fire Does Godot v4.0 alpha1 already support double precision?

fire commented 2 years ago

Let me ask @aaronfranke if he has the exact syntax.

Calinou commented 2 years ago

Does Godot v4.0 alpha1 already support double precision?

There is some limited support but it's not complete or production-ready yet.

aaronfranke commented 2 years ago

Here's an update on the status of this proposal:

I have not spent much time working on double support recently, but Zylann has been doing some great work fixing the most critical bugs, such as binary resources not loading, the remote inspector not working, and the is_equal_approx tolerance being incorrect. You can view some of his double support PRs here.

Here's a video Zylann made that shows the current status (it includes merging the one PR linked above that hasn't been merged yet): https://www.youtube.com/watch?v=g5wwa5W5_Cw

It appears that mostly everything except rendering is working. From watching the cubes fall and interact we can see that they are moving at a smooth rate, so the physics system seems to be correctly using doubles.

The next step is that we need to have double support in rendering. Since using doubles on the GPU is not an option, this means we need a camera-relative rendering system to process positions relative to the camera on the CPU before we pass this information to the GPU. I'd like to invite @Gnollrunner @clayjohn @lawnjelly and @reduz to take a look at this as these people all have rendering experience and are interested in double support.

Aside from that, there are still some non-rendering bugs to fix that we know about, in particular Zylann's PR about fixing is_equal_approx tolerance breaks some of the unit tests. So progress is not completely blocked by rendering.

In its current state, the float=64 option is working well enough that users who want large world support may be better off using it compared with float=32, even though there are still issues and limitations.

HeadClot commented 2 years ago

Bit of a question for you, @aaronfranke: you mentioned a "camera-relative rendering system". Would this system work with VR and AR as well? Just curious.

aaronfranke commented 2 years ago

@HeadClot Yes. To be clear, the system just needs to adjust the camera's coordinates as seen by the GPU enough that they are within the limits of 32-bit floats there. Unless someone's eyes are 10km apart, it should 100% be doable for XR.

EDIT: To clarify further, this means that in games with extra cameras and viewports, it's not feasible to have two cameras on opposite sides of a planet sized object with precise rendering. AFAIK.

HeadClot commented 2 years ago

@HeadClot Yes. To be clear, the system just needs to adjust the camera's coordinates as seen by the GPU enough that they are within the limits of 32-bit floats there. Unless someone's eyes are 10km apart, it should 100% be doable for XR.

Good to hear :)

albinaask commented 2 years ago

Awesome work!! To be clear, @aaronfranke, is the float=64 something that is passed as a scons argument, a project setting, or a #define thing?

aaronfranke commented 2 years ago

@albinaask It's a scons argument for compiling the engine.

If you're interested in seeing how it works, this argument sets a define called REAL_T_IS_DOUBLE, which you can search for in the codebase.

fire commented 2 years ago

https://github.com/godotengine/godot/issues/58516 is related.

3top1a commented 2 years ago

Just a thought - is float=16, float=128 or even float=256 possible?

Calinou commented 2 years ago

Just a thought - is float=16, float=128 or even float=256 possible?

No, only 32 and 64 are allowed. C++ does not have primitive 16-bit, 128-bit or 256-bit float types.

aaronfranke commented 2 years ago

@3top1a C++ does have a few additional types available to us, float_t, double_t, and long double. None are fixed sizes across platforms. The first two just mean "at least as big as float/double but can be bigger if the hardware prefers that", which isn't very useful to us if we're looking for a specific size.

The interesting one is long double. The type long double must be at least as big as double, but it is usually bigger. On x86 systems, this is an 80-bit type representing the underlying extended precision type, on RISC-V this uses the Q extension for 128-bit, and on ARM it would be 64-bit. Most compilers will respect this except for MSVC which just always treats it the same as double (so you need MinGW to properly use long double on Windows). This could be implemented as float=80, but it would actually be 128-bit on RISC-V (with Q, else 64-bit) and 64-bit on ARM. Even so, 80-bit isn't a huge improvement over 64-bit (you go from a few trillion meters to a few quadrillion meters), so there's not much point.

There is also __float128, which isn't available in the C++ language, but it exists in GCC. It will always be 128-bit, but on platforms without 128-bit support it will use emulation, so it would be extremely slow.
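If you want to see what your own compiler and platform give you for these types, a small sketch using the standard `std::numeric_limits` is enough. The helper name `significand_bits` is made up for this example.

```cpp
#include <limits>

// Number of significand bits for type T, counting the implicit leading bit
// (so IEEE float reports 24 and IEEE double reports 53).
// Hypothetical helper, for illustration only.
template <typename T>
constexpr int significand_bits() {
    return std::numeric_limits<T>::digits;
}
```

On x86 with GCC or Clang, `significand_bits<long double>()` typically reports 64 (the 80-bit extended format), while MSVC reports 53 because it treats `long double` the same as `double`. The only thing the language guarantees is that `long double` is at least as precise as `double`.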

Gnollrunner commented 2 years ago

EDIT: To clarify further, this means that in games with extra cameras and viewports, it's not feasible to have two cameras on opposite sides of a planet sized object with precise rendering. AFAIK.

I think from a low-level graphics API standpoint there is no reason why this shouldn't work too. I guess it depends on how everything is set up, and I'm certainly not a game engine expert. The way I'm doing it, you set transformations for every object on every frame. I write matrices straight to an upload buffer sequentially and just kind of scroll through them as you render your objects. This may be an issue if you have a huge number of different objects, but with say 5K meshes it hasn't been so far, and I'm using a somewhat old computer and graphics card.

aaronfranke commented 2 years ago

For discussion about resolving the rendering issues with double precision, check out this issue: https://github.com/godotengine/godot/issues/58516

akien-mga commented 2 years ago

I think the bulk of this proposal has been pretty much implemented by now, though there are still issues left to resolve.

To keep a good overview of what's left to do, I would suggest opening bug reports, and if relevant more detailed feature proposals, which can both be targeted at 4.0. WDYT?

YuriSizov commented 2 years ago

I'll close this per @akien-mga's comment above. Feel free to open improvement proposals on top of the implemented feature and submit bug reports.

godotengine / godot-proposals

Add support for double precision floats #892