Open nanley opened 2 years ago
Originally I used RG_32 because it is easy to reinterpret as other formats in UAVs (any format can be reinterpreted as 32-bit uint) and because I was unsure whether RGBA16 can be copied to BC4 in OpenGL ES. It certainly can be in DX11/12, but I was unsure about the other APIs.
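To illustrate why a two-channel 32-bit uint texel is a convenient stand-in for a BC4 block, here is a minimal CPU-side sketch. It assumes the standard BC4 block layout (two 8-bit endpoints followed by sixteen 3-bit selectors, 64 bits in total) and assumes the low 32 bits land in the R channel. `Rg32Uint` and `packBc4Block` are hypothetical names for illustration, not part of betsy.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One RG32_UINT texel: two 32-bit words, 64 bits total,
// which is the same size as one BC4 block (4x4 pixels).
struct Rg32Uint
{
    uint32_t r;  // assumed to hold bits 0..31 of the block
    uint32_t g;  // assumed to hold bits 32..63 of the block
};

// Packs two 8-bit endpoints and sixteen 3-bit selectors into one texel.
// Bit layout: bits 0-7 endpoint 0, bits 8-15 endpoint 1,
// bits 16-63 selectors, with pixel 0 in the lowest selector bits.
inline Rg32Uint packBc4Block( uint8_t endpoint0, uint8_t endpoint1,
                              const std::array<uint8_t, 16> &selectors )
{
    uint64_t bits = uint64_t( endpoint0 ) | ( uint64_t( endpoint1 ) << 8u );
    for( uint32_t i = 0u; i < 16u; ++i )
    {
        assert( selectors[i] < 8u && "BC4 selectors are 3 bits wide" );
        bits |= uint64_t( selectors[i] & 0x7u ) << ( 16u + 3u * i );
    }
    return Rg32Uint{ uint32_t( bits & 0xFFFFFFFFu ), uint32_t( bits >> 32u ) };
}
```

Since each texel of the RG_32 texture carries exactly one block, the compute shader can write one `uint2` per 4x4 block and the result can then be copied into the BC4 texture, provided the API allows that copy.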
Maybe I'm misunderstanding, but there shouldn't be a problematic interaction with the BC4/RG_32 destination texture. The swizzling would only impact the RGBA8 source texture.
Nope, I think I'm the one who misunderstood.
Could you post snippets of code of what/where you mean?
Sure:
diff --git a/bin/Data/bc4.glsl b/bin/Data/bc4.glsl
index 1c8cbe3..717ff82 100644
--- a/bin/Data/bc4.glsl
+++ b/bin/Data/bc4.glsl
@@ -8,10 +8,7 @@
shared float2 g_minMaxValues[4u * 4u * 4u];
shared uint2 g_mask[4u * 4u];
-layout( location = 0 ) uniform float2 params;
-
-#define p_channelIdx params.x
-#define p_useSNorm params.y
+layout( location = 0 ) uniform float p_useSNorm;
uniform sampler2D srcTex;
@@ -45,10 +42,7 @@ void main()
for( uint i = 0u; i < 4u; ++i )
{
const uint2 pixelsToLoad = pixelsToLoadBase + uint2( i, blockThreadId );
-
- const float4 value = OGRE_Load2D( srcTex, int2( pixelsToLoad ), 0 ).xyzw;
- srcPixel[i] = p_channelIdx == 0 ? value.x : ( p_channelIdx == 1 ? value.y : value.w );
- srcPixel[i] *= 255.0f;
+ srcPixel[i] = OGRE_Load2D( srcTex, int2( pixelsToLoad ), 0 ).x * 255.0f;
}
minVal = min3( srcPixel.x, srcPixel.y, srcPixel.z );
diff --git a/src/betsy/EncoderBC1.cpp b/src/betsy/EncoderBC1.cpp
index 4cdccd7..93b0f79 100644
--- a/src/betsy/EncoderBC1.cpp
+++ b/src/betsy/EncoderBC1.cpp
@@ -127,11 +127,12 @@ namespace betsy
if( m_bc4TargetRes )
{
// Compress Alpha too (using BC4)
+ glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_R, GL_ALPHA );
bindComputePso( m_bc4Pso );
bindUav( 0u, m_bc4TargetRes, PFG_RG32_UINT, ResourceAccess::Write );
- // p_channelIdx, p_useSNorm
- glUniform2f( 0, 3.0f, 0.0f );
+ // p_useSNorm
+ glUniform1f( 0, 0.0f );
glDispatchCompute( 1u, //
alignToNextMultiple( m_width, 16u ) / 16u,
Ahh!! Now I see what you mean.
I'm used to coding API-agnostically, and D3D11 does not have swizzling. D3D12 added it back with D3D12_SHADER_COMPONENT_MAPPING.
In Vulkan, swizzling means creating a different VkImageView.
@reduz what's easier for you? Creating a new VkImageView may yield slightly better performance, while keeping the uniform (p_channelIdx) means there is no need for a second VkImageView.
The BC4 shader takes in a uniform that acts as a channel selector. With a couple of ternaries, it's used to determine which channel the shader should compress. Instead of relying on this uniform, it's possible to set a texture swizzle on the source texture object so that the channel to be compressed is remapped to red/x. Then the ternaries can be dropped. On some drivers, this optimization can save a bit of math when executing the shader.
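The equivalence between the two approaches can be sketched on the CPU. This is only a model, not driver or betsy code: `Float4`, `Swizzle`, `selectWithUniform` and `loadSwizzled` are hypothetical names, and the swizzle is modelled as a per-channel source index applied at load time, which is how a texture swizzle behaves conceptually.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Float4 = std::array<float, 4>;  // x, y, z, w (r, g, b, a)

// Current approach: the shader picks a channel from a uniform.
// Mirrors the ternaries in bc4.glsl: 0 -> x, 1 -> y, anything else -> w.
inline float selectWithUniform( const Float4 &texel, float channelIdx )
{
    return channelIdx == 0.0f ? texel[0] : ( channelIdx == 1.0f ? texel[1] : texel[3] );
}

// Proposed approach: the texture object carries the remapping.
// source[i] is the stored channel that output channel i reads from.
struct Swizzle
{
    std::array<uint8_t, 4> source;
};

// Models a texel load with the swizzle applied before the shader sees it.
inline Float4 loadSwizzled( const Float4 &stored, const Swizzle &swizzle )
{
    Float4 result{};
    for( uint32_t i = 0u; i < 4u; ++i )
    {
        assert( swizzle.source[i] < 4u && "swizzle must name one of 4 channels" );
        result[i] = stored[swizzle.source[i]];
    }
    return result;
}
```

With a swizzle of `{ 3, 1, 2, 3 }` (red reads from alpha), `loadSwizzled( texel, swizzle )[0]` returns the same value as `selectWithUniform( texel, 3.0f )`, so the shader can always read `.x`. One thing to keep in mind with this approach is that the swizzle is state on the texture object, so it stays in effect for later uses of that texture unless it is reset.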