Wow, after all these years, how could I have missed that? Thanks for pointing
it out. I wish I could take a look at the IHV specs to confirm this. For
example, according to the public documentation the color decompressor is also
supposed to round with a +1 bias:
color_2 = (2 * color_0 + color_1 + 1) / 3;
color_3 = (color_0 + 2 * color_1 + 1) / 3;
but I'm sure that the D3D10 spec says the sum is simply truncated. I'll see if
I can get someone at NVIDIA or Microsoft to confirm this.
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:14
Yeah, there is some confusion all right about this. Doing the bias according to
the D3D spec makes decompression work as expected on D3D9 on several GPUs I've
tried, and matches the REF rasterizer.
However, on OpenGL it might get hairy. The S3TC extension spec _does not_ have
those biases. At least on Mac (10.6.5), it looks like NVIDIA GPUs decompress
according to the D3D9 spec, while AMD GPUs follow the S3TC spec. Even though the
same AMD GPUs decompress according to the D3D9 spec when on Windows & D3D9. Fun!
Original comment by nearaz
on 12 Jan 2011 at 7:18
Actually, no need to bug them, here is the D3D10 documentation:
http://msdn.microsoft.com/en-us/library/bb694531(VS.85).aspx
That matches what NVTT does. However, the formula is not exact and does not
represent exactly what the hardware does, so it's very possible that the D3D9
formulas are closer to the actual hardware decompressor.
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:21
The DX10 docs do not say anything about how to actually do the decompression.
"alpha_2 = 6/7*alpha_0 + 1/7*alpha_1; // bit code 010" would almost certainly
not work with integer arithmetic!
Original comment by nearaz
on 12 Jan 2011 at 7:28
Well, I think G7x hardware implements the D3D9 formula exactly, but more recent
NVIDIA hardware does neither of the two; instead it implements the D3D10 formula
within the allowed error tolerance, which is a factor that depends on the
distance between the endpoints.
CB and I tried taking into account the exact hardware decompressor in our
compressors and the difference was not very significant; it only made a
difference in the single color compressor, because it tends to put the endpoints
very far from each other. See the article that I wrote:
http://www.ludicon.com/castano/blog/2009/03/gpu-dxt-decompression/
I don't think there's any hardware that supports different rounding modes
depending on the API; the texture samplers are highly optimized to minimize
area, and the errors due to the different rounding modes have usually been
considered largely irrelevant.
The multiplicative factor in RGBM might make them more significant, though...
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:34
OK, you are right, I'll ask for the exact code in the D3D10 spec then.
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:35
This issue was closed by revision r1232.
Original comment by cast...@gmail.com
on 25 Feb 2011 at 9:27
I've added support for both decoders. During compression the D3D10 reference
decoder is always used, but in the future I'd like to add this option to the
compression options. This is probably most useful when targeting consoles, for
example, when compressing DXT5 textures for PS3 you can take into account that
it implements the D3D9 decoder exactly, but that the color block supports both
the 4 and 3 color modes. I've opened issue 160 for that feature.
Original comment by cast...@gmail.com
on 25 Feb 2011 at 9:34
Original issue reported on code.google.com by
nearaz
on 12 Jan 2011 at 9:53