PistonDevelopers / graphics

A library for 2D graphics, written in Rust, that works with multiple back-ends
MIT License

Rationale behind Scalar being f64? #967

Open theia-ajax opened 9 years ago

theia-ajax commented 9 years ago

I'm just wondering what the reasoning behind using f64 instead of f32 for all the math stuff is. In most cases the extra precision is not necessary and this really hurts cache friendliness.

bvssvni commented 9 years ago

See https://github.com/PistonDevelopers/graphics/issues/5

The extra precision does matter for a larger invariant space of I = M^-1 M.
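To make the invariant concrete, here is a minimal sketch that sends a point through M and then M^-1 and measures how far it lands from where it started, once in f32 and once in f64. The 2x3 row-major layout, the test point, the 0.3 rad rotation, and every name below are assumptions made for illustration, not code from this library.

```rust
// Round-trip a point through M and then M^-1 and report the drift.
// All names here are illustrative; this is not the library's own math code.
macro_rules! round_trip_error {
    ($name:ident, $t:ty) => {
        fn $name(offset: $t) -> f64 {
            let (sin, cos) = (0.3 as $t).sin_cos();
            // M: rotate by 0.3 rad, then translate by (offset, offset).
            let m: [[$t; 3]; 2] = [[cos, -sin, offset], [sin, cos, offset]];

            // Closed-form inverse of a 2x3 affine matrix.
            let det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
            let (ia, ib) = (m[1][1] / det, -m[0][1] / det);
            let (ic, id) = (-m[1][0] / det, m[0][0] / det);
            let inv: [[$t; 3]; 2] = [
                [ia, ib, -(ia * m[0][2] + ib * m[1][2])],
                [ic, id, -(ic * m[0][2] + id * m[1][2])],
            ];

            // Apply an affine transform to a point.
            let apply = |t: &[[$t; 3]; 2], v: [$t; 2]| -> [$t; 2] {
                [
                    t[0][0] * v[0] + t[0][1] * v[1] + t[0][2],
                    t[1][0] * v[0] + t[1][1] * v[1] + t[1][2],
                ]
            };

            let p: [$t; 2] = [123.456, 789.012];
            let back = apply(&inv, apply(&m, p));
            let dx = (back[0] - p[0]).abs() as f64;
            let dy = (back[1] - p[1]).abs() as f64;
            dx.max(dy)
        }
    };
}

round_trip_error!(round_trip_f32, f32);
round_trip_error!(round_trip_f64, f64);

fn main() {
    // Larger offsets leave fewer mantissa bits for the fractional part.
    println!("f32 drift: {:e}", round_trip_f32(1.0e6));
    println!("f64 drift: {:e}", round_trip_f64(1.0e6));
}
```

With a small offset both types round-trip closely; the gap opens up as the translation grows, which is one way to read "a larger invariant space".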

Modern CPUs use the same hardware registers for computing f64 and f32, which sometimes leads to f32 being slower.

It would be interesting if you have a use case where f32 is faster, and also interesting if there are use cases where f64 is faster. To test this you need to run the game loop in benchmark mode, see http://blog.piston.rs/2015/05/09/benchmark-mode/

All scalars are tied to the Scalar type alias, so you can recompile the library for f32 if you need it.
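As a rough sketch of what a single type alias buys you: if every function is written against the alias, changing one line switches the precision of the whole module. The `math` module, `Vec2d`, and `add` below are stand-ins written for this example and are not copied from the library source.

```rust
mod math {
    /// Single switch point: change this to `f32` and rebuild for single precision.
    pub type Scalar = f64;
    /// A 2D vector expressed in terms of the alias.
    pub type Vec2d = [Scalar; 2];

    /// Written against the alias, so it never names f64 directly.
    pub fn add(a: Vec2d, b: Vec2d) -> Vec2d {
        [a[0] + b[0], a[1] + b[1]]
    }
}

fn main() {
    let v = math::add([1.0, 2.0], [0.5, 0.25]);
    println!("{:?}, {} bytes per scalar", v, std::mem::size_of::<math::Scalar>());
}
```

This only works if downstream code also uses the alias; any signature that spells out `f64` stops compiling once the alias changes.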

theia-ajax commented 9 years ago

The extra precision is only necessary when game worlds are quite large, which I imagine would not be the general case but would be important for some games. Perhaps the math lib could be broken out into a single-precision and a double-precision version, much like the graphics backends are broken out into their own modules?

While it's true that the cycle counts for single- and double-precision operations are the same, the biggest bottleneck is always going to be cache. As such, I'd be willing to bet money that in any benchmark of actual game code f32 will be faster than f64, if only because you'll use less cache space.
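The footprint side of this argument is easy to check. The sketch below counts how many scalars and how many 2x3 matrices fit in one cache line, assuming a 64-byte line (this varies by CPU) and a `[[T; 3]; 2]` matrix layout chosen for illustration. It says nothing about actual frame times, which only a benchmark can settle.

```rust
use std::mem::size_of;

/// How many values of type `T` fit in one cache line of `line_bytes` bytes.
fn per_cache_line<T>(line_bytes: usize) -> usize {
    line_bytes / size_of::<T>()
}

fn main() {
    const LINE: usize = 64; // assumed cache-line size; check your target CPU

    println!("f32 scalars per line: {}", per_cache_line::<f32>(LINE));
    println!("f64 scalars per line: {}", per_cache_line::<f64>(LINE));

    // A 2x3 affine matrix is 24 bytes in f32 and 48 bytes in f64.
    println!("f32 matrices per line: {}", per_cache_line::<[[f32; 3]; 2]>(LINE));
    println!("f64 matrices per line: {}", per_cache_line::<[[f64; 3]; 2]>(LINE));
}
```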

I tried setting up my own fork that used f32 but I'm a cargo noob and had trouble actually getting the cargo libs to use it over their own dependencies.

LaylBongers commented 9 years ago

64-bit floating point values are overkill for the average game. Might it be an option to add a feature "double_precision" that can be set in cargo and defines Scalar as 64-bit, while otherwise it's 32-bit?
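A feature-gated alias along these lines might look like the sketch below. The feature name comes from the suggestion above; nothing here exists in the library today, and the crate would also need to declare the feature in its Cargo.toml.

```rust
// Hypothetical: assumes the crate's Cargo.toml declares
//
//     [features]
//     double_precision = []
//
// Without the feature, Scalar falls back to f32.
#[cfg(feature = "double_precision")]
pub type Scalar = f64;

#[cfg(not(feature = "double_precision"))]
pub type Scalar = f32;

fn main() {
    println!("Scalar is {} bits wide", 8 * std::mem::size_of::<Scalar>());
}
```

One caveat with this design: Cargo unifies features across the dependency graph, so if any crate in the build enables `double_precision`, every user of the library gets f64.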

bvssvni commented 9 years ago

There is a saying: "shut up and calculate".

If you are more than 50% sure f32 is faster, then you should expect more than 50% of the benchmarks to show that f32 is faster. People have done benchmarks with other applications that show f64 is faster in some cases, so, with no other prior and starting from the assumption that f32 is always faster, I would guess 70%.

If the difference is on the order of a nanosecond and the driver overhead is on the order of 1/10 of a microsecond, then the performance gain of using f32 is 1%.

This leaves us with 0.7% expected performance improvement.

Assuming no prior knowledge of benchmarks, I prefer f64 because it has better numeric stability.
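Spelled out, the estimate above is the probability that f32 wins times the relative gain when it does. The sketch below reads "nanoseconds" as 1 ns saved and "1/10 of a microsecond" as 100 ns of overhead; both concrete values are assumptions, and the case where f64 is faster is counted as zero gain, as in the comment.

```rust
/// Expected relative gain: P(f32 is faster) times (time saved / overhead).
fn expected_gain(p_faster: f64, saved_ns: f64, overhead_ns: f64) -> f64 {
    p_faster * (saved_ns / overhead_ns)
}

fn main() {
    // 1 ns saved against 100 ns of driver overhead is a 1% gain when f32 wins.
    let gain = expected_gain(0.7, 1.0, 100.0);
    println!("expected improvement: {:.1}%", gain * 100.0);
}
```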

theia-ajax commented 9 years ago

In any reasonable benchmark, I'm 99% certain f32 will be faster simply due to not having to thrash the cache as much. Huge speed gains are possible if the vec_math lib uses SIMD, since you can effectively process twice as many f32s as f64s.

I can think of only 4 games off the top of my head which are limited by the precision of floats (Minecraft, Star Citizen, KSP, and Space Engineers), and in the case of Space Engineers all the physics calculations are broken into discrete chunks such that they can still use single-precision math within those spaces and only the rendering is done with 64-bit.

It's also worth noting that, while there are limitations imposed by using float, both Minecraft and KSP still use float instead of double and just deal with the consequences of that creatively.

Regardless, it would be cool if Piston allowed you to grab single- or double-precision versions of the math lib and every lib used Scalar, so that it would be easy to switch between the two.

bvssvni commented 9 years ago

@tedajax The vecmath lib is generic over f32 and f64, and every lib should use Scalar, so you should be able to switch between them. If not, please open an issue. LLVM does autovectorization, so there might not be that much speed gain.
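For readers unfamiliar with the pattern, "generic over f32 and f64" means one function body serves both precisions. The function below is a hand-written sketch of that idea using only standard-library traits; it is not copied from vecmath, and its name and bounds are assumptions.

```rust
use std::ops::Add;

/// Component-wise addition for any copyable type that supports `+`.
fn vec2_add<T: Copy + Add<Output = T>>(a: [T; 2], b: [T; 2]) -> [T; 2] {
    [a[0] + b[0], a[1] + b[1]]
}

fn main() {
    // The same function body is instantiated once per scalar type.
    let single: [f32; 2] = vec2_add([1.0f32, 2.0], [3.0, 4.0]);
    let double: [f64; 2] = vec2_add([1.0f64, 2.0], [3.0, 4.0]);
    println!("{:?} {:?}", single, double);
}
```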

bvssvni commented 9 years ago

Hmm... I am wondering if we can use a default generic. Opening https://github.com/PistonDevelopers/graphics/issues/968
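One possible reading of "a default generic" is a default type parameter: the type defaults to f64 when written without arguments, while callers can still opt into f32. The `Transform` type below is hypothetical and only illustrates the language feature, not whatever design the linked issue settles on.

```rust
/// Hypothetical 2x3 transform whose scalar type defaults to f64.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Transform<T = f64> {
    pub m: [[T; 3]; 2],
}

impl<T: Copy> Transform<T> {
    /// The translation column of the matrix.
    pub fn translation(&self) -> [T; 2] {
        [self.m[0][2], self.m[1][2]]
    }
}

fn main() {
    // Writing `Transform` with no argument means `Transform<f64>`.
    let wide: Transform = Transform { m: [[1.0, 0.0, 5.0], [0.0, 1.0, 7.0]] };
    // Callers who want single precision opt in explicitly.
    let narrow: Transform<f32> = Transform { m: [[1.0, 0.0, 5.0], [0.0, 1.0, 7.0]] };
    println!("{:?} {:?}", wide.translation(), narrow.translation());
}
```

Note that the default applies where the type is named, such as in annotations and signatures; it does not steer inference for an unannotated expression.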

theia-ajax commented 9 years ago

I had tried simply forking graphics, setting Scalar to f32, and then setting up my project to reference my fork instead, and there were issues. In several other piston libs it seems people are not using Scalar and are instead using f64 directly, so that's probably an issue to be addressed elsewhere.