Closed ruihe774 closed 6 months ago
Note: We don't currently have a conceivable use case for half float vertex data. It may make sense to reduce the check to just color renderable half float textures (and drop the corresponding V capability), which will also make it a bit less redundant with the existing formats_float
.
Looks good either way.
Note: We don't currently have a conceivable use case for half float vertex data. It may make sense to reduce the check to just color renderable half float textures (and drop the corresponding V capability), which will also make it a bit less redundant with the existing formats_float.
PL_FMT_CAP_VERTEX
is compared in cmp_fmt
. Formats without this capacity will not take precedence.
Note: We don't currently have a conceivable use case for half float vertex data. It may make sense to reduce the check to just color renderable half float textures (and drop the corresponding V capability), which will also make it a bit less redundant with the existing formats_float.
PL_FMT_CAP_VERTEX
is compared incmp_fmt
. Formats without this capacity will not take precedence.
Oh, I see. That sounds like unintended behavior. Give me a moment to prepare a patch
It seems that NaNs do not survive an upload/download round-trip on my end. They get consistently written back as 0xFE00.
We could probably modify our test suite to not generate these NaNs in the first place, but given the fact that it works for "native" formats like 32-bit floats, as well as 16-bit floats on Vulkan, I rather suspect that the issue here is that the 16hf code path on my GL driver is actually emulated internally.
Should we mark them as emulated?
testing texture roundtrip for format rgba16hf
... 1D
upload time: 2760, download time: 2200
... 2D
upload time: 3000, download time: 2760
... 3D
=== FAILED: memcmp(src, dst, bytes) == 0 at ../src/tests/gpu_tests.h:201
at position 160: 0x0d != 0x00
416 bytes of 'src' at offset 0:
c6 16 21 4c ac b9 32 a8 c5 59 c8 e4 7d f5 12 8c
68 82 3f 33 8e 0f 5a b6 46 38 ae 2a 6c a5 c1 33
bc e3 7f 68 9c b2 10 62 0b d9 46 89 ce 59 15 37
dc 55 6a 6a 65 c4 21 ab fc cf d6 69 74 97 9c 31
7a 1b 9a 17 ce ab 79 d9 85 c0 63 54 1a 79 8b f6
ce f5 60 34 b9 81 df b6 51 b5 1f c6 4d bb f7 c8
d7 92 df a5 3d 59 7e c3 1a e2 17 34 5b a2 2b 2a
98 8c 5e 51 0e 3e 08 5f f4 27 25 41 e2 1d 0a b9
af e9 5e ed 43 dd b0 5e bf c8 92 1b 6b be 45 03
4a a4 55 58 e2 5d b7 d7 84 dc 18 67 fa 23 20 a9
0d 7f 97 50 5d 48 ae 1d 10 41 38 7b ff 7e 7e 49
^^^^^ ^^^^^
22 d3 a1 05 31 58 dc b5 35 f5 1d 2f 18 3e d9 25
bd 70 76 1b b8 24 38 c9 66 71 44 65 f0 c2 ae 13
96 50 18 c7 a9 f5 7d df ea 9a 0f 02 d8 e8 28 96
59 9e b2 12 c3 eb db 29 5d 1f 8e 4d e2 3d 60 79
8e 79 40 37 6e be 17 58 59 26 5b 32 0e 84 c9 68
22 7b 7a e5 66 55 0f c3 74 9d 10 56 db 71 d0 6a
ea 10 a1 58 cf b9 b1 28 df 0d 5b ee 91 24 56 b3
9f d0 99 05 25 a9 c9 9a 47 da f1 22 4c c1 8c 36
d2 2e 8f a1 e7 41 ca c6 4e 25 b4 df 49 0b 93 e9
db 2d ee 00 d6 b8 9a 1d 92 8c 40 df 4d cd 15 1f
fb a5 c0 e3 e6 8b aa 35 b0 5f 14 fa 6a a8 e3 45
d6 d2 46 ac 8b e1 ca 1e 6d 0a fd ba d7 12 da d3
b8 9a b7 9e 26 61 d3 d7 c0 e8 d1 2b 91 b5 71 67
88 b7 13 13 99 de 31 06 e8 2e c1 c0 41 9b 93 f9
36 4a 98 5d ac 6c 34 6d 54 06 98 e6 bb 09 4d 44
416 bytes of 'dst' at offset 0:
c6 16 21 4c ac b9 32 a8 c5 59 c8 e4 7d f5 12 8c
68 82 3f 33 8e 0f 5a b6 46 38 ae 2a 6c a5 c1 33
bc e3 7f 68 9c b2 10 62 0b d9 46 89 ce 59 15 37
dc 55 6a 6a 65 c4 21 ab fc cf d6 69 74 97 9c 31
7a 1b 9a 17 ce ab 79 d9 85 c0 63 54 1a 79 8b f6
ce f5 60 34 b9 81 df b6 51 b5 1f c6 4d bb f7 c8
d7 92 df a5 3d 59 7e c3 1a e2 17 34 5b a2 2b 2a
98 8c 5e 51 0e 3e 08 5f f4 27 25 41 e2 1d 0a b9
af e9 5e ed 43 dd b0 5e bf c8 92 1b 6b be 45 03
4a a4 55 58 e2 5d b7 d7 84 dc 18 67 fa 23 20 a9
00 fe 97 50 5d 48 ae 1d 10 41 38 7b 00 fe 7e 49
^^^^^ ^^^^^
22 d3 a1 05 31 58 dc b5 35 f5 1d 2f 18 3e d9 25
bd 70 76 1b b8 24 38 c9 66 71 44 65 f0 c2 ae 13
96 50 18 c7 a9 f5 7d df ea 9a 0f 02 d8 e8 28 96
59 9e b2 12 c3 eb db 29 5d 1f 8e 4d e2 3d 60 79
8e 79 40 37 6e be 17 58 59 26 5b 32 0e 84 c9 68
22 7b 7a e5 66 55 0f c3 74 9d 10 56 db 71 d0 6a
ea 10 a1 58 cf b9 b1 28 df 0d 5b ee 91 24 56 b3
9f d0 99 05 25 a9 c9 9a 47 da f1 22 4c c1 8c 36
d2 2e 8f a1 e7 41 ca c6 4e 25 b4 df 49 0b 93 e9
db 2d ee 00 d6 b8 9a 1d 92 8c 40 df 4d cd 15 1f
fb a5 c0 e3 e6 8b aa 35 b0 5f 14 fa 6a a8 e3 45
d6 d2 46 ac 8b e1 ca 1e 6d 0a fd ba d7 12 da d3
b8 9a b7 9e 26 61 d3 d7 c0 e8 d1 2b 91 b5 71 67
88 b7 13 13 99 de 31 06 e8 2e c1 c0 41 9b 93 f9
36 4a 98 5d ac 6c 34 6d 54 06 98 e6 bb 09 4d 44
Full log: testlog.txt
It seems that NaNs do not survive an upload/download round-trip on my end. They get consistently written back as 0xFE00.
It is implementation-dependent. e.g. iris fails the test while swrast passes it.
According to OpenGL spec (ver 4.6, sec 2.3.4), passing NaN to GL commands is an UB, despite 16hf or 32f:
Any representable floating-point value is legal as input to a GL command that requires floating-point data. The result of providing a value that is not a floating-point number to such a command is unspecified, but must not lead to GL interruption or termination. In IEEE arithmetic, for example, providing a negative zero or a denormalized number to a GL command yields predictable results, while providing a NaN or an infinity yields unspecified results.
IMO we should not assume the result of an UB in tests.
Okay, then we have no choice but to make our test framework smarter when generating/testing floating point values. I'll write a patch for it.
BTW it is also an UB in Vulkan.
https://registry.khronos.org/vulkan/specs/1.3/html/chap3.html#fundamentals-general
Rebased + merged this manually, thanks
Fixes #263.