recatek opened this issue 3 years ago
This would probably need `glam` to support AVX instructions for `f64` types.
When downcasting to `f32` for rendering, you will need to have some kind of floating origin to keep the camera away from floating point errors.

I solved this problem by creating my own `Transform64` type, and giving entities both the `f32` and `f64` variants as components. I added a `RenderingOrigin` resource, and every frame there is a system to sync the `Transform` to `TransformF64 - RenderingOrigin`. I don't see why you need the complexity of gameplay entities and render entities, and syncing between them. I'm not worried about cache misses or having to iterate over all entities like this: it should be a pretty cheap operation in the grand scheme of things.
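The per-frame sync described above can be sketched in a few lines. This is a hypothetical, self-contained illustration (plain structs stand in for the commenter's `TransformF64` component and `RenderingOrigin` resource, and for Bevy's `Transform`; the real version would be a Bevy system querying these as components):

```rust
// Hypothetical stand-ins for the component/resource types described above.
#[derive(Clone, Copy)]
struct TransformF64 { x: f64, y: f64, z: f64 }

#[derive(Clone, Copy)]
struct RenderingOrigin { x: f64, y: f64, z: f64 }

#[derive(Clone, Copy, Debug, PartialEq)]
struct TransformF32 { x: f32, y: f32, z: f32 }

/// Per-frame sync: subtract the origin while still in f64, *then* downcast.
/// Entities near the rendering origin keep full precision even in a huge world.
fn sync(world_pos: TransformF64, origin: RenderingOrigin) -> TransformF32 {
    TransformF32 {
        x: (world_pos.x - origin.x) as f32,
        y: (world_pos.y - origin.y) as f32,
        z: (world_pos.z - origin.z) as f32,
    }
}
```

The order of operations matters: subtracting in f64 first means the downcast only has to represent the (small) camera-relative offset, not the (large) absolute coordinate.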
Oh, right, you can just have both position representations on the same entity. That is simple and clever, good point!
As part of this, we'd want `f64` and fixed point variants of physics (and related systems), to allow for their seamless use in games that need these transform types. Probably toggle this by using a `State` as part of the default plugins, even though we'll always need to cast to `f32` for rendering.
Relevant to #1678.
I completely agree with doing this, lol. Godot is currently trying to do the same thing, so the earlier we do this, the less work we have to do in the long run. I know rapier already does something like this as well, so that might be something to look towards.
This is extremely important for deterministic simulations. You can certainly work around it by having other types on a component and syncing the transform, but that's not ideal.
Might be worth looking into how nalgebra / nphysics is handling this
https://www.rustsim.org/blog/2020/06/01/this-month-in-rustsim/
I'd like to chime in support for this.
The recent rendering rework seems to adopt the separate game/rendering entities concept and should help with the rendering issues. We just have to transform the game world locations relative to some floating `RenderingOrigin` transform when generating the render world transforms, somewhere around here, though note that my understanding of the code base and rework is minimal.
Edit: with rendering done in a completely separate context, I think making the game world default to f64 should be considered.
This should definitely be benchmarked if we ever decide to do so. Losing widespread SSE2 support for `f32` in both software and hardware, lower cache coherency due to the larger data types, and limited or no support for AVX on `f64` in both `glam` and hardware may have serious performance implications.
I'd also like to note my support for this change and my willingness to help with work on this.
I have grand plans to use Bevy and I've been meaning to get into helping with development while simultaneously building using the engine.
I am also interested in 64-bit Transforms! Currently experimenting with a Minecraft server using Bevy, and the math puts `f32` just barely out of range to support an entire 60,000,000 x 60,000,000 map. For now I'll roll my own `Transform`, but I am interested in helping push it into standard Bevy.
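The precision limit behind this is easy to demonstrate: `f32` has a 24-bit mantissa, so above 2^24 (about 16.7 million) not every integer coordinate is representable, and at ±30,000,000 (the edge of a 60,000,000-wide map centred on the origin) the spacing between adjacent `f32` values is already 2 units. A small probe function, purely illustrative:

```rust
/// Returns true if moving by `step` at magnitude `pos` actually changes the
/// stored f32 value (i.e. the step is not lost to rounding).
fn f32_resolves(pos: f32, step: f32) -> bool {
    pos + step != pos
}
```

The equivalent check in `f64` (52-bit mantissa) resolves sub-millimetre steps at the same magnitudes, which is why 64-bit transforms are attractive for worlds this size.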
Any news on this? How would I practically switch my game to f64 coordinates, transforms, etc.?
@LucCADORET I wouldn't hold my breath for something to be done about this soon, so here's what worked for me:
I have a very large world and one main actor that the camera is focused on at all times. As you would expect, moving too far away from the physics origin leads to precision loss and undesired behavior. So what I did was introduce a floating physics frame origin within which all physics calculations happen. The "global" position is always `physics_frame_origin_position + actor_position_in_physics_frame`. Once the actor is too far from the physics origin, the floating origin is moved to the current actor coordinate and all entities in the physics frame are moved accordingly. Of course, this does not help if you want to calculate physics far away from such an actor, but I don't have to, and maybe you can work around that as well.
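The rebasing step described here can be sketched as follows. This is a hypothetical illustration (the threshold and the `[f64; 3]` positions are made up for the example); the key invariant is that `origin + local` is unchanged for every entity after a rebase:

```rust
/// Maximum distance the tracked actor may drift from the floating origin
/// before we rebase. Chosen arbitrarily for this sketch.
const REBASE_THRESHOLD: f64 = 10_000.0;

/// If the actor is too far from the floating origin, move the origin to the
/// actor's position and shift every entity in the frame so that each global
/// position (origin + local) is preserved.
fn maybe_rebase(origin: &mut [f64; 3], actor_local: &mut [f64; 3], others: &mut [[f64; 3]]) {
    let dist2: f64 = actor_local.iter().map(|c| c * c).sum();
    if dist2 > REBASE_THRESHOLD * REBASE_THRESHOLD {
        let shift = *actor_local;
        // Move the origin to the actor's current global position...
        for i in 0..3 {
            origin[i] += shift[i];
        }
        // ...shift every other entity by the same amount in the opposite
        // direction, so global positions are unchanged...
        for p in others.iter_mut() {
            for i in 0..3 {
                p[i] -= shift[i];
            }
        }
        // ...and the actor is now exactly at the new origin.
        *actor_local = [0.0; 3];
    }
}
```

Because all local coordinates stay small after a rebase, the physics itself can keep running in `f32` without visible precision loss near the actor.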
Next step in this space is likely to be #4379, but further work around transform types will be a slow-and-steady design conversation :)
This is extremely important for deterministic simulations.
Actually, my understanding is that `f64` support is orthogonal to deterministic simulation. Non-determinism on a given machine is due to the use of undefined behaviors and other RNGs, or some patterns (e.g. sorting by pointer value), all of which can be fixed without switching away from `f32`. And non-determinism across platforms/CPUs doesn't come from floating point imprecision; it comes from specific instructions (in SIMD notably) being implemented differently on different CPUs, which switching to `f64` will do nothing to help, because the 64-bit versions of those instructions also cause non-determinism across CPUs.
As for fixed point, this is the same story, and only helps because non-deterministic instructions are generally floating point ones, which fixed-point calculations don't use. But you can easily make a fixed-point simulation non-deterministic if you're not careful. The statement "we switched to fixed point so we're deterministic" is likely wrong. At best it's rather "we switched to fixed point and the source of non-determinism we had identified is gone".
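To make the fixed-point argument concrete, here is a minimal sketch of a Q32.32 fixed-point number (an assumed format, not anything Bevy ships): every operation is plain integer arithmetic, which is bit-identical across conforming platforms, and that is the entire source of its determinism:

```rust
/// Minimal Q32.32 fixed-point value: raw / 2^32. All arithmetic is integer,
/// so results are reproducible across CPUs, unlike some float SIMD paths.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Fixed(i64);

impl Fixed {
    const FRAC_BITS: u32 = 32;

    fn from_int(v: i32) -> Self {
        Fixed((v as i64) << Self::FRAC_BITS)
    }

    fn add(self, rhs: Self) -> Self {
        // Wrapping add: overflow behavior is itself deterministic.
        Fixed(self.0.wrapping_add(rhs.0))
    }

    fn mul(self, rhs: Self) -> Self {
        // Widen to i128 so the intermediate product cannot overflow,
        // then shift back down to Q32.32.
        Fixed(((self.0 as i128 * rhs.0 as i128) >> Self::FRAC_BITS) as i64)
    }
}
```

Note that, exactly as the comment above warns, this only removes float instructions as a non-determinism source; any remaining UB, iteration-order dependence, or RNG misuse still breaks determinism.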
That being said, switching to `f64` is a very valid design decision in 2022, with much less concern for performance than a decade or two ago. But from my experience, I don't think the determinism argument applies.
Just to clarify, the ability to better control the underlying type of a transform here would serve multiple discrete goals. Using `f64` would be useful for large worlds that `f32` precision bounds may struggle with (without requiring something like, say, a hierarchical coordinate system). Using some form of fixed point type would allow for easier-to-control cross-platform determinism. These are indeed orthogonal goals, but both could be made more accessible by adding controls over the fundamental `Transform` type.
That said, I don't think `f64` is a good default over `f32`: `f32` is still the most space- and SIMD-efficient choice for most games. I think it only makes sense when your world scale requires going larger.
A possible direction for double precision translations (wrt rendering): https://godotengine.org/article/emulating-double-precision-gpu-render-large-worlds/
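The core trick in that Godot article is representing one `f64` as a pair of `f32`s (a "float-float" split) so GPUs that only do single precision can still recover the low-order bits. A minimal CPU-side sketch of the split, assuming nothing beyond standard Rust:

```rust
/// Split an f64 into (hi, lo) f32 parts such that hi + lo reconstructs the
/// original value to (near) full f64 precision. The hi part carries the
/// magnitude; the lo part carries the bits the f32 cast rounded away.
fn split_f64(v: f64) -> (f32, f32) {
    let hi = v as f32;
    let lo = (v - hi as f64) as f32;
    (hi, lo)
}
```

On the GPU, the shader then does camera-relative subtraction on the (hi, lo) pairs before ever collapsing to a single `f32`, which is what keeps large-world positions stable.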
Would love to see this. I always seem to end up in projects hitting this issue for one reason or another; mostly it's trying to make large multiplayer maps, which just ends up making multiple other aspects extra complicated.
Note that for many applications, using a floating origin can be a very viable alternative, and be both faster and capable of representing much larger differences.
https://github.com/aevyrie/big_space is well-maintained and has impressive showcase results: perhaps consider that as an alternative?
Some prior art context (since it hadn't already been mentioned)
As of Unreal 5.0, most parts of the engine have moved from 32- to 64-bit floats for positions, and as of 5.1 the default max world extents were upgraded from 22 km to 88 million km.
Their naming for this feature is large world coordinates (LWC), the linked docs have more information about how LWC interacts with things like shaders and particles which couldn't be fully transitioned.
One potential way forward on this is to open the space up to allow people to slot in their own `Transform` types, so long as the final output is mappable back to `GlobalTransform`, which may be doable given options like big_space. This seems like the most feasible way forward for custom transform formats or non-euclidean spaces.

This would let you define your own transform type (i.e. with f64s, fixed point values, spherical geometry, or hyperbolic geometry) and so long as the final result is mappable back to a `glam::Affine3A`, the rest of the engine should[^1] still work as expected.
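The "mappable back to `GlobalTransform`" idea could look something like the trait below. This is purely a hypothetical sketch, not a proposed Bevy API: the trait name, the `[[f32; 4]; 3]` matrix (standing in for `glam::Affine3A`), and the `render_origin` parameter are all invented for illustration:

```rust
/// Hypothetical: any custom transform type only has to produce an affine
/// 3x4 matrix (rotation/scale columns + translation) for the renderer.
trait IntoRenderTransform {
    fn to_affine(&self, render_origin: [f64; 3]) -> [[f32; 4]; 3];
}

/// Example custom transform: f64 translation only, for a large world.
struct TransformF64 {
    translation: [f64; 3],
}

impl IntoRenderTransform for TransformF64 {
    fn to_affine(&self, render_origin: [f64; 3]) -> [[f32; 4]; 3] {
        // Subtract the rendering origin in f64, then downcast: the affine
        // output only ever sees small, camera-relative offsets.
        let t: Vec<f32> = self
            .translation
            .iter()
            .zip(render_origin.iter())
            .map(|(p, o)| (p - o) as f32)
            .collect();
        [
            [1.0, 0.0, 0.0, t[0]],
            [0.0, 1.0, 0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]],
        ]
    }
}
```

A fixed-point or f16 transform would implement the same trait; only types whose output is genuinely affine fit this scheme, which is exactly the limitation the next comment raises for hyperbolic and spherical geometry.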
The primary thing needed here is a way to do generic property propagation down the hierarchy that operates on a trait. Like seen in #5673 and related PRs.
[^1]: This assumes that the component in question implements `Reflect`, that we have a form of "animate anything", and that we have a way for physics plugins to interface with these new transform types.
Regarding hyperbolic or spherical geometry, the most natural way to represent them in 3D games requires all 16 entries of the transformation matrix to be manipulable, so the final output would not be mappable back to a `GlobalTransform`, because it's not affine.
That would be great and might allow other cases besides f64. I was looking for the opposite: a way to use f16 or other small types (with a dynamic origin or signed distance function). I guess it's possible to use a custom function and set the `Transform` each frame, but it would be nice if there were a simple way to enhance and re-use the existing Transform/PBR bundles.
FYI, that mapping from many different transform types back to `GlobalTransform` is something that @NthTensor and I have been eyeing to better support UI and 2D transform types in Bevy. Support for f64, f16, and more is a nice side benefit.
Yes, hello! We do want to make it easier to use custom transform representations of all sorts, subject to two constraints:

1. The math backend is `glam`, and there is no chance we will switch away from `glam`.
2. The final output (`GlobalTransform`) has to be 32-bit. I don't think we have the same level of flexibility as UE5 here.

We have a bit of a vague plan, and I'm open to any specific proposals or designs that can operate in that constraint space.
@alice-i-cecile are those mappings at all related to interpolation, e.g. for a lower physics tick rate while objects still move smoothly at higher render rates?
Not inherently, but splitting apart gameplay transforms and render transforms is definitely a possible outcome. We should form a working group for this in the 0.15 cycle. See #1259 for the main thread on interpolation in Bevy.
What problem does this solve or what need does it fill?
For very large game worlds, or for multiplayer games, it can be difficult to work within the floating point confines of f32. Giving users control over the native floating- or fixed-point type that Bevy uses for rendering and other position-driven logic would enable a wider range of potential games and simulations without having to hack through Bevy's internal systems.
What solution would you like?
Expose a compile-time configuration option that controls the inherent type that Bevy uses for its position and transform components in rendering and other related (e.g. audio) systems. In the case of a custom fixed-point solution, this would require a conversion method on the user's part. In the case of f64, in a perfect world, this would use 64-bit SIMD intrinsics wherever 32-bit intrinsics are used for f32. For rendering, at some point in the pipeline a conversion to f32 is almost inevitable, but objects far enough away from the camera would likely be culled to the point where precision loss would be irrelevant.
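One common shape for such a compile-time switch is a cargo feature selecting a scalar type alias, similar to how rapier ships separate f32/f64 builds. A hypothetical sketch (the feature name `f64-transforms` and the `Translation` type are invented for illustration):

```rust
// Hypothetical: pick the transform scalar at compile time via a cargo feature.
#[cfg(feature = "f64-transforms")]
pub type Scalar = f64;
#[cfg(not(feature = "f64-transforms"))]
pub type Scalar = f32;

/// Every position-carrying component would be written against `Scalar`,
/// so one codebase compiles to either precision.
pub struct Translation {
    pub x: Scalar,
    pub y: Scalar,
    pub z: Scalar,
}
```

The downside of the feature-flag approach, worth noting, is that it splits the ecosystem: every plugin touching transforms has to compile against the same feature set.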
What alternative(s) have you considered?
1) Separating gameplay entities from rendering entities. Gameplay entities are given a homemade position component using f64/fixed-point values in world space, while rendering entities use f32 in a transformed camera-relative space. This is undesirable because it necessitates a large-scale copy of data from gameplay to rendering entities each frame. On top of that, this copy process is rife with cache misses: the decision of which entities are close enough to the camera to bother transforming creates an effectively random set of gameplay/render entity links, and each lookup in this process is then a random access.
2) Not using bevy's transform components at all and rendering manually in batches. This avoids the copy and entity-linking process, but requires a lot of manual work on the user's part to pass over each gameplay entity, determine if it's worth rendering, convert its coordinates into camera-relative space, bin it into proper instances, and send it out for rendering. This essentially recreates, or requires forks of, entire crates like bevy_sprite and such in the process.
Additional context
Not sure how much it matters, but regarding SIMD instructions, SSE up to 4.2 has an adoption rate of at least 98% on the Steam hardware survey, and AVX is just shy of 95%. This covers the bulk of support for f64-sized SIMD intrinsics as far as I know.