bkaradzic / bgfx

Cross-platform, graphics API agnostic, "Bring Your Own Engine/Framework" style rendering library.
https://bkaradzic.github.io/bgfx/overview.html
BSD 2-Clause "Simplified" License
14.94k stars 1.94k forks source link

Packed 2101010 Vertex Attribute Support #459

Open siavashserver opened 9 years ago

siavashserver commented 9 years ago

Hello, does bgfx also support packed 2101010 vertex attributes? It's quite handy for storing normals and tangents and is available on D3D10/OGL3 hardware.

bkaradzic commented 9 years ago

Nope, but I can add it.

siavashserver commented 9 years ago

It will be great, thanks! :) Oh another question, what will bgfx do if it isn't supported by the target hardware? Will it decode and convert them to raw floats silently, or does it require me to decide? (for example if I want to fallback to the uint16 for every normal/tangent component)

bkaradzic commented 9 years ago

It will be handled with caps, like this: https://github.com/bkaradzic/bgfx/blob/master/include/bgfx.h#L103

You'll be responsible to handle that case.

siavashserver commented 9 years ago

Sounds great, thanks!

bkaradzic commented 9 years ago

@siavashserver I added AttribType::Uint10 https://github.com/bkaradzic/bgfx/commit/8da579ff99a62318682f1e913f5e0d9f6b8f8b2c

Can you try it?

siavashserver commented 9 years ago

@bkaradzic I'll setup a quick test, I wasn't expecting to see it baked for this summer :-D

It's surprising that D3D doesn't support the signed variant like OGL, so an extra -0.5 is necessary on shader side.

Thanks again!

bkaradzic commented 9 years ago

I didn't update docs, but you probably already know, that limitation is that Uint10 makes sense only as 3 components. A2 should be ignored due to D3D9 limitation. And less than 3 components will still take 32-bits at which point other types are better.

siavashserver commented 9 years ago

Yes, the 4th component (A2) should be ignored for portability reason. OGL3.x doesn't use same normalization formula as OGL4.x and D3D10+, and for example 0 might endup as 0.3 (or 0.6?) in vertex shader if I remember correctly.

siavashserver commented 9 years ago

Sorry for being late, I tried to modify example-01-cubes and use the new packed format to store the vertex colors as a 4 component normalized Uint10, but cubes do appear as black. (no modifications performed on shaders)

Here is the modified file: https://gist.github.com/siavashserver/6566b3099b32bde3fa16

VladSerhiienko commented 7 years ago

Currently, I'm working on the option that allows packing normals into unsigned ints (both 8 8 8 8 or 10 10 10 2 formats). I ran into the issue. When bgfx is binding vertex attributes, it uses GL_UNSIGNED_INT_10_10_10_2. It produces OpenGL error "invalid enum" on glVertexAttribPointer call:

GL_CHECK(glVertexAttribPointer(loc
        , num
        // According to docs this function accepts GL_BGRA
        // works ok but the normal components must be reordered in the shader.
        // The value 4 returned by vertexDecl.decode(...) also works ok.
        //, 1 == type ? GL_BGRA : num 
        , s_attribType[type]
        , normalized
        , _vertexDecl.m_stride
        , (void*)(uintptr_t)baseVertex)
        );

This problem can be fixed by using GL_UNSIGNED_INT_2_10_10_10_REV instead of GL_UNSIGNED_INT_10_10_10_2:

static const GLenum s_attribType[] =
{
    GL_UNSIGNED_BYTE,            // Uint8
    GL_UNSIGNED_INT_2_10_10_10_REV,  // Uint10, works with this one just fine
    //GL_UNSIGNED_INT_10_10_10_2,  // Uint10
    GL_SHORT,                    // Int16
    GL_HALF_FLOAT,           // Half
    GL_FLOAT,                    // Float
};

Also, I noticed that geometryc uses Uint8 packing, while Uint10 can provide better precision. Unless the usage of Uint8 packing was chosen on purpose, maybe that is the reason why the Uint10 did not work at the moment of implementing the tool.

My version of packing works this way (https://github.com/VladSerhiienko/FbxPipeline/blob/master/FbxPipeline/FbxPipeline/fbxpmesh.cpp):

union UIntPack_aaa2 {
    uint32_t u;
    struct {
        uint32_t x : 10;
        uint32_t y : 10;
        uint32_t z : 10;
        uint32_t w : 2;
    } q;
};

template < bool tangent >
uint32_t PackNormal_aaa2( const float* v ) {
    UIntPack_aaa2 packed;
    packed.q.x = ( uint32_t )( mathfu::Clamp( v[ 0 ], -1.0f, 1.0f ) * 511.0f + 512.0f );
    packed.q.y = ( uint32_t )( mathfu::Clamp( v[ 1 ], -1.0f, 1.0f ) * 511.0f + 512.0f );
    packed.q.z = ( uint32_t )( mathfu::Clamp( v[ 2 ], -1.0f, 1.0f ) * 511.0f + 512.0f );
    packed.q.w = tangent ? ( uint32_t )( mathfu::Clamp( v[ 3 ], -1.0f, 1.0f ) * 1.0f + 2.0f ) : 0;
    return packed.u;
}
bkaradzic commented 7 years ago

@VladSerhiienko Cool! It wasn't implemented because once I added feature didn't go back to geometryc to add it there too.

GL_UNSIGNED_INT_10_10_10_2 vs GL_UNSIGNED_INT_2_10_10_10_REV, not sure, have to check if it matches D3D.

packed.q.w = tangent ? ( uint32_t )( mathfu::Clamp( v[ 3 ], -1.0f, 1.0f ) * 1.0f + 2.0f ) : 0;

Shouldn't you do some rounding here because you only have space to represent values 0-3.

bkaradzic commented 7 years ago

@VladSerhiienko I was just looking at your FbxPipeline project, and noticed you're using vcache_optimizer, see this instead: https://github.com/zeux/meshoptimizer.

VladSerhiienko commented 7 years ago

@bkaradzic Thanks for the answers! I'll take a look at zeux`s meshoptmizer soon, looks very promising at first glance. Yeah, you're right, the w component scaling looks incorrect, thanks for that! I think it is up to a developer to check if it matches. The shader has a macro, that defines its type, so he can do some reordering in the shader if needed. It looks like that for OpenGL there are not many options.

ES 2 requires some extensions to use this format, ES 3 (https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/glVertexAttribPointer.xhtml) expects only GL_UNSIGNED_INT_2_10_10_10_REV (or signed), there is no GL_UNSIGNED_INT_10_10_10_2 format in the docs. There is a page (http://www.informit.com/articles/article.aspx?p=2033340&seqNum=3) that has nice imgs of format layout.

GL_UNSIGNED_INT_2_10_10_10_REV

Looks like GL_UNSIGNED_INT_2_10_10_10_REV (without GL_BGRA) will decode the uint as expected + I can see correct results with that function in FbxPipeline.

attilaz commented 6 years ago

@bkaradzic I checked and A2 seems to be supported on GLES3, Metal, Vulkan, D3D11 and D3D12. So I would say on most platforms.

I would like to use it to store tangent+(handedness in w) vectors in geometry file. My assumption is that 0.0 and 1.0 is coded perfectly even on 2 bits. It would be higher precision for the same number of bits. If A2 isn't supported runtime, one can convert it to Uint8.