Wow, after all these years, how could I have missed that? Thanks for pointing
it out. I wish I could take a look at the IHV specs to confirm this. For
example, according to the public documentation the color decompressor is also
supposed to round with a +1 bias:
color_2 = (2 * color_0 + color_1 + 1) / 3;
color_3 = (color_0 + 2 * color_1 + 1) / 3;
but I'm sure that the D3D10 spec says the sum is simply truncated. I'll see if
I can get someone at NVIDIA or Microsoft to confirm this.
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:14
Yeah, there is some confusion all right about this. Doing the bias according to
the D3D spec makes decompression work as expected on D3D9 on several GPUs I've
tried, and matches the REF rasterizer.
However, on OpenGL it might get hairy. The S3TC extension spec _does not_ have
those biases. At least on Mac (10.6.5), it looks like NVIDIA GPUs decompress
according to the D3D9 spec, while AMD GPUs follow the S3TC spec. Even though the
same AMD GPUs decompress according to the D3D9 spec when on Windows & D3D9. Fun!
Original comment by nearaz
on 12 Jan 2011 at 7:18
Actually, no need to bug them, here is the D3D10 documentation:
http://msdn.microsoft.com/en-us/library/bb694531(VS.85).aspx
That matches what NVTT does. However, the formula is not exact and does not
represent exactly what the hardware does, so it's very possible that the D3D9
formulas are closer to the actual hardware decompressor.
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:21
The DX10 docs do not say anything about how to actually do the decompression.
"alpha_2 = 6/7*alpha_0 + 1/7*alpha_1; // bit code 010" would almost certainly
not work with integer arithmetic!
Original comment by nearaz
on 12 Jan 2011 at 7:28
Well, I think G7x hardware implements the D3D9 formula exactly, but more recent
NVIDIA hardware does neither of the two; instead it implements the D3D10 formula
within the allowed error tolerance, which is a factor that depends on the
distance between the endpoints.
CB and I tried taking into account the exact hardware decompressor in our
compressors and the difference was not very significant; it only made a
difference in the single color compressor, because it tends to put the endpoints
very far from each other. See the article that I wrote:
http://www.ludicon.com/castano/blog/2009/03/gpu-dxt-decompression/
I don't think there's any hardware that supports different rounding modes
depending on the API; the texture samplers are highly optimized to minimize
area, and the errors due to the different rounding modes have usually been
considered largely irrelevant.
The multiplicative factor in RGBM might make them more significant, though...
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:34
OK, you are right, I'll ask for the exact code in the D3D10 spec then.
Original comment by cast...@gmail.com
on 12 Jan 2011 at 7:35
This issue was closed by revision r1232.
Original comment by cast...@gmail.com
on 25 Feb 2011 at 9:27
I've added support for both decoders. During compression the D3D10 reference
decoder is always used, but in the future I'd like to add this option to the
compression options. This is probably most useful when targeting consoles, for
example, when compressing DXT5 textures for PS3 you can take into account that
it implements the D3D9 decoder exactly, but that the color block supports both
the 4 and 3 color modes. I've opened issue 160 for that feature.
Original comment by cast...@gmail.com
on 25 Feb 2011 at 9:34
Original issue reported on code.google.com by
nearaz
on 12 Jan 2011 at 9:53