Text representation of float literals is not-minimal

baryluk commented 3 years ago

Example:

#version 460

layout (location = 0) out vec4 fragColor;

void main()
{
    fragColor = vec4(0.4, 0.4, 0.8, 1.0);
}

-> glslang from trunk into Vulkan 1.0 SPIR-V. -> SPIRV-Cross from trunk into HLSL SM 50

static float4 fragColor;

struct SPIRV_Cross_Output
{
    float4 fragColor : SV_Target0;
};

void frag_main()
{
    fragColor = float4(0.4000000059604644775390625f, 0.4000000059604644775390625f, 0.800000011920928955078125f, 1.0f);
}

SPIRV_Cross_Output main()
{
    frag_main();
    SPIRV_Cross_Output stage_output;
    stage_output.fragColor = fragColor;
    return stage_output;
}

This is ugly, annoying, and makes debugging a bit harder.

0.4000000059604644775390625 - 0.4 (in double precision) = 5.960464455334602e-09

relative error: 5.960464455334602e-09 / 0.4 (in double precision) = 1.4901161138336505e-08

That is below the relative precision of a 32-bit single-precision float.

A single-precision number has 23 bits of mantissa, so the last bit represents 2^{-23} ≈ 1.1920928955078125e-07, which is more than the relative error between these two literals.

I believe 0.4000000059604644775390625f and 0.4f parse to exactly the same single-precision number.
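One way to check this claim is to parse both spellings and compare the raw bit patterns. The sketch below is only an illustration: the helper names `float_bits` and `same_float` are made up for this example, and it assumes a correctly rounding `strtof` and a 32-bit IEEE 754 `float`.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: parse a decimal literal (without the HLSL "f" suffix)
// to the nearest float and return its raw bit pattern.
std::uint32_t float_bits(const char *literal)
{
    const float value = std::strtof(literal, nullptr);
    std::uint32_t bits = 0;
    static_assert(sizeof(bits) == sizeof(value), "float is expected to be 32 bits wide");
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Hypothetical helper: true when both literals denote the same float.
bool same_float(const char *a, const char *b)
{
    return float_bits(a) == float_bits(b);
}
```

With these helpers, `same_float("0.4000000059604644775390625", "0.4")` compares the long literal emitted by SPIRV-Cross against the short one from the GLSL source.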

It looks like there is a fixed precision somewhere in SPIRV-Cross, maybe something like "%.25g", whereas for floats "%.9g" is enough to round-trip any value. A smarter approach is to use one of the better algorithms that are well known (but usually not exposed in C or C++ standard libraries).

HansKristian-Work commented 3 years ago

I would prefer a better algorithm since it is indeed ugly, but I don't know how to write one and it's not available in the C++11 standard library. The fixed precision required for printf was quite massive FWIW in some edge cases, but ideally we'd have a smarter version that finds the minimal representation that roundtrips 1:1.
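For illustration, a brute-force version of "find the minimal representation that roundtrips" can be written with nothing beyond C++11: try increasing precisions and stop at the first one that parses back to the same value. This is a sketch, not what SPIRV-Cross does and not Ryu or Grisu. The function name `shortest_float_string` is hypothetical, and the code assumes the "C" locale and a correctly rounding `strtof`. Its output is not guaranteed to match a dedicated shortest-digits algorithm in every case.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: return the "%.Ng" rendering with the smallest N (1..9)
// that parses back to exactly the same float. Nine significant digits are
// always enough for an IEEE 754 single-precision value.
std::string shortest_float_string(float value)
{
    char buf[64];
    for (int precision = 1; precision <= 9; ++precision)
    {
        std::snprintf(buf, sizeof(buf), "%.*g", precision, static_cast<double>(value));
        if (std::strtof(buf, nullptr) == value)
            break; // this precision already round-trips
    }
    // NaN never compares equal, so it simply falls out of the loop as "nan".
    return std::string(buf);
}
```

The loop costs up to nine format/parse pairs per literal, which is slow compared to Ryu but may be acceptable for a code generator that emits only a handful of constants.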

baryluk commented 3 years ago

The Ryū algorithm by Ulf Adams is usually used these days to do this. Sometimes the older Grisu2 algorithm is used.

Some Ryu references:

https://github.com/ulfjack/ryu
https://github.com/expnkx/fast_io
Ulf Adams. 2018. Ryū: fast float-to-string conversion. SIGPLAN Not. 53, 4 (April 2018), 270–282. DOI:https://doi.org/10.1145/3296979.3192369
- https://dl.acm.org/ft_gateway.cfm?id=3192369&type=pdf
Ulf Adams. 2019. Ryū revisited: printf floating point conversion. Proc. ACM Program. Lang. 3, OOPSLA, Article 169 (October 2019), 23 pages. DOI:https://doi.org/10.1145/3360595
https://doi.org/10.5281/zenodo.3366212
https://github.com/Alexhuszagh/minimal-lexical
https://github.com/Alexhuszagh/rust-lexical
https://github.com/python/cpython/blob/00d7abd7ef588fc4ff0571c8579ab4aba8ada1c0/Python/pystrtod.c#L796
https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Python/dtoa.c
https://github.com/night-shift/fpconv
https://github.com/UplinkCoder/fpconv/blob/master/src/fpconv_ctfe.d
https://github.com/libmir/mir-algorithm/blob/ac241850d3f7375220db9eeed3e8b4185d1f896d/source/mir/bignum/internal/ryu/generic_128.d
https://pldi18.sigplan.org/event/pldi-2018-papers-ry-fast-float-to-string-conversion
https://www.youtube.com/watch?v=kw-U6smcLzk "Ryū: Fast Float-to-String Conversion" by Ulf Adams at PLDI 2018, ACM
https://www.youtube.com/watch?v=4P_kbF0EbZM "Floating-Point <charconv>: Making Your Code 10x Faster With C++17's Final Boss" by Stephan T. Lavavej at CppCon 2019
http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf Printing Floating-Point Numbers Quickly and Accurately with Integers by Loitsch in 2010
https://github.com/google/double-conversion

Grisu family:

"Printing Floating-Point Numbers Quickly and Accurately with Integers" by Florian Loitsch, PLDI'10 http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf

Dragon family, and history / context (the algorithm was invented in 1970 I believe but it took about 20 years to fully publish):

"How to print floating-point numbers accurately" by Guy L. Steele Jr. and Jon L White, Proceedings of the ACM SIGPLAN'90 Conference on Programming Language Design and Implementation; http://www.kurtstephens.com/files/p372-steele.pdf (starting at page 4 of the PDF)
https://ampl.com/REFS/rounding.pdf, "Correctly Rounded Binary-Decimal and Decimal-Binary Conversions", David M. Gay

(Licensing and programming language varies.)

Ryu is used by default in some programming languages, virtual machines, and JSON serializer libraries, and there are discussions about making it standard in the C++ and D standard libraries. Other programming languages (like Python), I believe, are still using Dragon4.

Not only is it faster, it also produces output of optimal length.

One of the videos above suggests that C++17 <charconv> might be using Ryu, but I don't think this is actually in C++17.

In fact, Microsoft's STL does use it now:

https://github.com/microsoft/STL/blob/a6f285db8a48bd7ec58647cfc3f4725f2d766715/stl/inc/xcharconv_ryu.h

LLVM / clang libc++, I think, does not, judging from an inspection of https://github.com/llvm/llvm-project/blob/main/libcxx/include/charconv

Similarly, GCC libstdc++ doesn't.

Relevant ISO C++ docs:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0067r5.html
- Quote: "For floating-point numbers, there should be a facility to output a floating-point number with a minimum number of decimal digits where input from the digits is guaranteed to reproduce the original floating-point value. "
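
The facility described in that quote is `std::to_chars` from `<charconv>`. A minimal sketch of using it for a float literal follows. The wrapper name `to_shortest` is hypothetical, and the code needs a C++17 standard library recent enough to ship the floating-point overloads, which may not hold for every toolchain mentioned in this thread.

```cpp
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical wrapper: shortest decimal string that parses back to the same
// float, as produced by the plain (no format argument) std::to_chars overload.
std::string to_shortest(float value)
{
    char buf[64];
    const std::to_chars_result result = std::to_chars(buf, buf + sizeof(buf), value);
    if (result.ec != std::errc())
        return std::string(); // buffer too small; should not happen with 64 bytes
    return std::string(buf, result.ptr); // to_chars does not NUL-terminate
}
```

Unlike printf, `std::to_chars` ignores the current locale, so the radix point is always `.`.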

HansKristian-Work commented 3 years ago

Thanks, I'll consider looking into it.

alecazam commented 3 years ago

snprintf(string, sizeof(string), "%.6g", floatValue); will limit the digits. I'd asked the shaderc issues group for something like this too. The current constants are far too long, and imply more precision than exists. half/float/double could even have different lengths specified (5/6/12?).

baryluk commented 3 years ago

> snprintf(string, sizeof(string), "%.6g", floatValue); will limit the digits. I'd asked the shaderc issues group for something like this too. The current constants are far too long, and imply more precision than exists. half/float/double could even have different lengths specified (5/6/12?).

While "%.6g" for float and "%.16g" for double will work most of the time and produce short output, they can lose precision for some specific values, so that is not really a solution.
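
To make the precision loss concrete, here is a small check that formats a value with a given number of significant digits and parses it back. The function names are hypothetical, and the code assumes the "C" locale and correctly rounding `strtof`/`strtod`. Six digits are not enough for the float closest to 1.0000001, and sixteen are not enough for the double 0.30000000000000004. Nine and seventeen digits, respectively, always round-trip.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: does "%.<digits>g" preserve this float exactly?
bool float_round_trips(float value, int digits)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.*g", digits, static_cast<double>(value));
    return std::strtof(buf, nullptr) == value;
}

// Hypothetical helper: the same check for double.
bool double_round_trips(double value, int digits)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.*g", digits, value);
    return std::strtod(buf, nullptr) == value;
}
```

For example, `float_round_trips(1.0000001f, 6)` fails because the six-digit output collapses to "1", while `float_round_trips(0.4f, 6)` succeeds.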

alecazam commented 3 years ago

HexFloats are unreadable, but in this case I don't mind a loss of precision. And snprintf/vsnprintf exist on mac/win/linux. If the desire is to get even more precise, then there's stb's sprintf implementation that can be customized with special tokens to go to uber-precision. I mainly just want readable transpiled code. Android Studio had the same issue where values were printed out way too long, and it made debugging more difficult. They've since moved to %g.

chirsz-ever commented 2 years ago

I use a trick to solve this issue without modifying the SPIRV-Cross sources. Define the SPIRV_CROSS_FLT_FMT macro in the parent CMakeLists.txt which does add_subdirectory(/path/to/spirv-cross):

target_compile_definitions(spirv-cross-core PUBLIC "SPIRV_CROSS_FLT_FMT=\"\")\;\
(void)locale_radix_point\;\
void my_convert_to_string(char*, size_t, size_t, const void*)\;\
my_convert_to_string(buf, 64, sizeof(t), &t)\;\
(void)(0")

Then define the spirv_cross::my_convert_to_string function in your own project, with a good algorithm like double-conversion or ryu.
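
A minimal sketch of such a function is shown below. The parameter meanings are an assumption inferred from the call in the macro above (`buf`, its capacity of 64, `sizeof(t)`, and a pointer to the value). Instead of double-conversion or ryu it uses C++17 `std::to_chars`, which is a substitute chosen here for brevity and needs a standard library that provides the floating-point overloads.

```cpp
#include <charconv>
#include <cstddef>
#include <cstring>
#include <system_error>

namespace spirv_cross
{
// Assumed contract: write a NUL-terminated shortest round-trip string for the
// float or double pointed to by `value` into `buf` (capacity `buf_size`).
void my_convert_to_string(char *buf, std::size_t buf_size, std::size_t value_size, const void *value)
{
    if (buf == nullptr || buf_size == 0)
        return;

    char *end = buf + buf_size - 1; // keep one byte for the terminator
    std::to_chars_result result{buf, std::errc::invalid_argument};

    if (value_size == sizeof(float))
    {
        float f;
        std::memcpy(&f, value, sizeof(f));
        result = std::to_chars(buf, end, f);
    }
    else if (value_size == sizeof(double))
    {
        double d;
        std::memcpy(&d, value, sizeof(d));
        result = std::to_chars(buf, end, d);
    }

    if (result.ec != std::errc())
        result.ptr = buf; // unsupported size or overflow: produce an empty string
    *result.ptr = '\0';
}
} // namespace spirv_cross
```

Dispatching on `value_size` is one way to let a single hook serve both the float and double overloads. Whether the surrounding SPIRV-Cross code needs further post-processing of the string is not covered by this sketch.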

KhronosGroup / SPIRV-Cross

Text representation of float literals is not-minimal #1599