KhronosGroup / SPIRV-Cross

SPIRV-Cross is a practical tool and library for performing reflection on SPIR-V and disassembling SPIR-V back to high level languages.
Apache License 2.0
1.98k stars 555 forks source link

Text representation of float literals is not-minimal #1599

Open baryluk opened 3 years ago

baryluk commented 3 years ago

Example:

#version 460

layout (location = 0) out vec4 fragColor;

void main()
{
    fragColor = vec4(0.4, 0.4, 0.8, 1.0);
}

-> glslang from trunk into Vulkan 1.0 SPIR-V. -> SPIRV-Cross from trunk into HLSL SM 50

static float4 fragColor;

struct SPIRV_Cross_Output
{
    float4 fragColor : SV_Target0;
};

void frag_main()
{
    fragColor = float4(0.4000000059604644775390625f, 0.4000000059604644775390625f, 0.800000011920928955078125f, 1.0f);
}

SPIRV_Cross_Output main()
{
    frag_main();
    SPIRV_Cross_Output stage_output;
    stage_output.fragColor = fragColor;
    return stage_output;
}

This is ugly, annoying, and makes debugging a bit harder.

0.4000000059604644775390625 - 0.4 (in double precision) = 5.960464455334602e-09

relative error: 5.960464455334602e-09 / 0.4 (in double precision) = 1.4901161138336505e-08

That is below what is representable in 32-bit single precision float.

The single precision number has 23 bit of mantissa, so the last bit represents 2^{-23} ≈ 1.1920928955078125e-07, which is more than the relative error of these two literals.

I believe 0.4000000059604644775390625f and 0.4f, parse to the same exact single precission number.

It looks like there is a fixed accuracy somewhere in SPIRV-Cross, maybe something like "%.25g", where for for floats, "%.8g" is probably enough for not denormalized numbers. A more smart approach is to use better algorithms, which are known around (but usually not exposed in C or C++ standard libraries).

HansKristian-Work commented 3 years ago

I would prefer a better algorithm since it is indeed ugly, but I don't know how to write one and it's not available in C++11 standard library. The fixed accuracy required for printf was quite massive FWIW in some edge cases, but ideally we'd have a smarter version that finds the minimal representation that roundtrips 1:1.

baryluk commented 3 years ago

The Ryū algorithm by Ulf Adams is usually used these days to do this. Sometimes an older grisu2 algorithm is used.

Some Ryu references:

Grisu family:

Dragon family, and history / context (the algorithm was invented in 1970 I believe but it took about 20 years to fully publish):

(Licensing and programming language varies.)

Ryu is used by default in some programming languages, virtual machines, and in some JSON serializer libraries, and there discussions to make it standard in C++, and D standard libraries. In other programming languages (like Python), I believe the is still using Dragon4.

Not only it is the faster, it produces optimal length.

One of the videos above suggest that C++17 <charconv> might be using Ryu, but I don't think this is in C++17 actually.

In fact micorosoft STL does use it now:

LLVM / clang libc++ I think does not from inspection of https://github.com/llvm/llvm-project/blob/main/libcxx/include/charconv

Similar GCC libstdc++ doesn't.

Relevant ISO C++ docs:

HansKristian-Work commented 3 years ago

Thanks, I'll consider looking into it.

alecazam commented 3 years ago

snprintf(string, sizeof(string), "%6g", floatValue); will limit the digits. I'd asked the shaderc issues group for something like this too. The current constants are far too long, and imply more precision than exists. half/float/double could even have different lengths specified (5/6/12?).

baryluk commented 3 years ago

snprintf(string, sizeof(string), "%6g", floatValue); will limit the digits. I'd asked the shaderc issues group for something like this too. The current constants are far too long, and imply more precision than exists. half/float/double could even have different lengths specified (5/6/12?).

While "%6g" for float, and "%16g" for double, often will work almost always and produce short and minimal output, they sometimes can loose precision for some specific values, so that is not really a solution.

alecazam commented 3 years ago

HexFloats are unreadable, but in this case I don't mind a loss of precision. And snprintf/vsnprintf exist on mac/win/linux. if the desire is to get even more precise, then there's stb's sprintf implementation that can be customized with special tokens to go to uber-precision. I mainly just want readable transpiled code. Android Studio had the same issue where values were printed out way too long, and it made debugging more difficult. They've since moved to %g.

chirsz-ever commented 2 years ago

I use a trick to solve this issue without modifying the SPIRV-Cross sources. Define the SPIRV_CROSS_FLT_FMT macro in the parent CMakeLists.txt which do add_subdirectory(/path/to/spirv-cross):

target_compile_definitions(spirv-cross-core PUBLIC "SPIRV_CROSS_FLT_FMT=\"\")\;\
(void)locale_radix_point\;\
void my_convert_to_string(char*, size_t, size_t, const void*)\;\
my_convert_to_string(buf, 64, sizeof(t), &t)\;\
(void)(0")

Than define the spirv_cross::my_convert_to_string function in your own project, with a good algorithm like double-conversion or ryu.