floooh / oryol

A small, portable and extensible C++ 3D coding framework
MIT License

Gfx module idea dump #117

Closed floooh closed 9 years ago

floooh commented 9 years ago

Ideas for getting rid of granular uniform updates (also: to support uniform buffers / constant buffers): in the shader, one or more uniform_blocks can be defined:

@uniform_block Test
@uniform mat4 mvp ModelViewProjection
@uniform mat4 world World
@uniform vec4 eyePos EyePos
@uniform vec4 bla Bla
@uniform vec3 blub Blub
@end

This would generate a C++ struct which is filled by the app:

struct TestValues {
  glm::mat4 ModelViewProjection;
  glm::mat4 World;
  glm::vec4 EyePos;
  glm::vec4 Bla;
  glm::vec3 Blub;
};
TestValues testValues;

This can be set via:

Gfx::ApplyVariable(Shaders::Main::Test, testValues);

There will also be generated code that describes the content of the structure to Gfx at shader creation time (the types and offsets of the uniform block members). The 3D API wrappers use this to implement optimal uniform updating for Gfx::ApplyVariable.

Basically, the 3D API wrapper has to support a generic uniform block mechanism: it takes a structure layout description at creation time (part of ShaderSetup / ProgramBundleSetup) and provides a generic apply method that takes a uniform slot index and a pointer to the struct. Together with the layout description, this method implements the optimal update mechanism for the specific 3D API.
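The layout-description idea could be sketched roughly as below. This is only an illustration, not Oryol's actual API: `UniformDesc`, `UniformBlockLayout`, `PerMemberBackend`, `BufferBackend`, and the `Vec3`/`Vec4`/`Mat4` stand-ins for the glm types are all hypothetical names, and the two "backends" merely record what a real 3D API wrapper would upload. The uniform slot index is omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-ins for glm::vec3 / glm::vec4 / glm::mat4 (assumption: tightly packed floats).
struct Vec3 { float v[3]; };
struct Vec4 { float v[4]; };
struct Mat4 { float m[16]; };

// The struct the code generator would emit for the Test uniform block.
struct TestValues {
    Mat4 ModelViewProjection;
    Mat4 World;
    Vec4 EyePos;
    Vec4 Bla;
    Vec3 Blub;
};

// Hypothetical layout description: type and byte offset of each member.
enum class UniformType { Vec3, Vec4, Mat4 };
struct UniformDesc { const char* name; UniformType type; std::size_t offset; };
struct UniformBlockLayout { std::vector<UniformDesc> members; std::size_t byteSize; };

inline std::size_t ByteSizeOf(UniformType t) {
    switch (t) {
        case UniformType::Vec3: return 3 * sizeof(float);
        case UniformType::Vec4: return 4 * sizeof(float);
        case UniformType::Mat4: return 16 * sizeof(float);
    }
    return 0;
}

// What the generated code might hand over at shader creation time.
inline UniformBlockLayout MakeTestLayout() {
    UniformBlockLayout layout;
    layout.members = {
        {"mvp",    UniformType::Mat4, offsetof(TestValues, ModelViewProjection)},
        {"world",  UniformType::Mat4, offsetof(TestValues, World)},
        {"eyePos", UniformType::Vec4, offsetof(TestValues, EyePos)},
        {"bla",    UniformType::Vec4, offsetof(TestValues, Bla)},
        {"blub",   UniformType::Vec3, offsetof(TestValues, Blub)},
    };
    layout.byteSize = sizeof(TestValues);
    return layout;
}

// Fallback for APIs without uniform buffers: one update per member.
struct PerMemberBackend {
    UniformBlockLayout layout;
    std::vector<std::vector<float>> uploaded;  // one entry per simulated per-uniform update
    explicit PerMemberBackend(const UniformBlockLayout& l) : layout(l) {}
    void Apply(const void* data) {
        const auto* bytes = static_cast<const std::uint8_t*>(data);
        for (const UniformDesc& m : layout.members) {
            const std::size_t size = ByteSizeOf(m.type);
            assert(m.offset + size <= layout.byteSize);
            std::vector<float> values(size / sizeof(float));
            std::memcpy(values.data(), bytes + m.offset, size);
            uploaded.push_back(values);
        }
    }
};

// APIs with uniform / constant buffers: copy the whole struct in one go.
struct BufferBackend {
    UniformBlockLayout layout;
    std::vector<std::uint8_t> staging;  // stands in for the GPU-side buffer
    explicit BufferBackend(const UniformBlockLayout& l) : layout(l) {}
    void Apply(const void* data) {
        staging.resize(layout.byteSize);
        std::memcpy(staging.data(), data, layout.byteSize);
    }
};
```

The application code is identical for both backends (fill the struct, call `Apply`); only the wrapper decides whether that becomes several small updates or a single buffer copy.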

floooh commented 9 years ago

...try to eliminate dynamic per-draw-call updates as much as possible, e.g. don't set a ModelViewProj matrix per draw call; instead keep a static per-frame ViewProj in a separate uniform buffer from the Model matrix and resolve the ModelViewProj transform in the vertex shader. If the object doesn't move, this means no per-draw-call updates at all, only flipping constant buffer handles.

Problem #1: how to handle per-draw-call updates (e.g. the remaining objects that need to move)
Problem #2: how to make this efficient on 3D APIs without constant / uniform buffers (or: how to share Oryol uniform blocks...)

ongamex commented 9 years ago

"...flipping constant buffer handles..." - In most cases this usage is slower than updating the currently bound cbuffer.

floooh commented 9 years ago

Hach, true, it would also mean keeping myriads of small constant buffers around. One of those ideas that sound ok in the evening and terrible in the morning, thanks for saving me from going down that dead end ;) It is frustratingly hard to find information on how GL uniform buffers and D3D11 constant buffers actually function under the hood (I think GL uniform buffers are mostly normal 'buffers', while D3D11 constant buffers have some additional magic built in).

But after sleeping on it I think that it's time to get rid of the GLES2-style "update uniforms and then draw single instance" altogether, and always update and draw instances in batches, and only use 'small' uniform blocks for per-frame and per-material data. Oryol would then need to figure out the best way to render these batches (either traditional 'update uniforms and draw' on classic GLES2/WebGL, or D3D9-style hardware instancing, or something else on modern 3D APIs). The interesting task will be to figure out a medium-level API for the Gfx module and the shader annotation tags (there probably needs to be a new InstanceBuffer for static and dynamic per-instance data, along with annotating shader 'uniforms' as per-instance data, and some wrapper function in the vertex shader to fetch per-instance attributes), and I guess there's now also a need to expose MapBuffer/UnmapBuffer...
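A rough sketch of what "always draw in batches, let the backend pick the path" might look like. Everything here is hypothetical (`InstanceBatch`, `InstancingPath`, `DeviceStats`, `DrawBatch`); the counters only stand in for real 3D API work, and the fixed capacity mimics a pre-allocated per-instance buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Mat4 { float m[16]; };

// Hypothetical per-instance record (what an InstanceBuffer entry might hold).
struct InstanceData { Mat4 model; };

// A batch of instances with a fixed capacity, like a pre-allocated dynamic buffer.
class InstanceBatch {
public:
    explicit InstanceBatch(std::size_t maxInstances) : maxInstances_(maxInstances) {
        instances_.reserve(maxInstances);
    }
    void Add(const InstanceData& d) {
        assert(instances_.size() < maxInstances_);
        instances_.push_back(d);
    }
    std::size_t Count() const { return instances_.size(); }
private:
    std::size_t maxInstances_;
    std::vector<InstanceData> instances_;
};

// Counters standing in for the work a real backend would issue.
struct DeviceStats {
    int uniformUpdates = 0;
    int bufferUploads = 0;
    int drawCalls = 0;
};

enum class InstancingPath { UniformLoop, HardwareInstancing };

// The caller always submits a batch; the backend decides how to render it.
inline void DrawBatch(const InstanceBatch& batch, InstancingPath path, DeviceStats& stats) {
    if (batch.Count() == 0) {
        return;
    }
    if (path == InstancingPath::HardwareInstancing) {
        // One upload of all per-instance data, then a single instanced draw.
        stats.bufferUploads += 1;
        stats.drawCalls += 1;
    } else {
        // Classic fallback: update uniforms and draw once per instance.
        for (std::size_t i = 0; i < batch.Count(); ++i) {
            stats.uniformUpdates += 1;
            stats.drawCalls += 1;
        }
    }
}
```

With 100 instances the fallback path issues 100 updates and 100 draws, while the instanced path issues one upload and one draw; the calling code does not change between the two.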

ongamex commented 9 years ago

In my rendering framework I have a so-called "recording context". This enables me to just call "draw call". After that, the recording context can sort the draw calls and issue the real draw calls.
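For illustration, a recording context along these lines might look like the sketch below. The names (`DrawCommand`, `RecordingContext`, `Flush`) and the sort key (shader, then material, then mesh) are assumptions made for the example; the comment above does not say how the real framework sorts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical recorded draw call: just the state ids needed for sorting.
struct DrawCommand {
    std::uint32_t shaderId;
    std::uint32_t materialId;
    std::uint32_t meshId;
};

class RecordingContext {
public:
    // Recording a draw only stores the command; nothing is issued yet.
    void Draw(const DrawCommand& cmd) { recorded_.push_back(cmd); }

    // Sort the recorded commands by state, then hand each one to `submit`,
    // which stands in for issuing the real draw call.
    template <typename SubmitFn>
    void Flush(SubmitFn submit) {
        std::stable_sort(recorded_.begin(), recorded_.end(),
            [](const DrawCommand& a, const DrawCommand& b) {
                return SortKey(a) < SortKey(b);
            });
        for (const DrawCommand& cmd : recorded_) {
            submit(cmd);
        }
        recorded_.clear();
    }

private:
    // Assumed ordering: most expensive state change (shader) in the top bits.
    static std::uint64_t SortKey(const DrawCommand& c) {
        assert(c.materialId <= 0xFFFFu && c.meshId <= 0xFFFFu);
        return (static_cast<std::uint64_t>(c.shaderId) << 32) |
               (static_cast<std::uint64_t>(c.materialId) << 16) |
               static_cast<std::uint64_t>(c.meshId);
    }

    std::vector<DrawCommand> recorded_;
};
```

Because draws are only recorded until `Flush`, commands that share a shader end up adjacent no matter in which order the application issued them.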

Determining which method is best for general-purpose uniforms could be a big pain. Currently, in most cases I use a single always-bound cbuffer that I update on every draw call (if needed, of course). Using multiple cbuffers is only beneficial when you have a big chunk of uniforms (like matrices for skinning); otherwise it's just not worth changing the currently bound cbuffer for just 2-3 matrices and 2-3 float4s. At least this is what my humble experience is telling me.

Excuse me if I'm flooding your notes with this.
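The "single always-bound cbuffer, updated only if needed" approach could be sketched as follows. `SharedConstantBuffer` is a hypothetical name, and the upload is only counted here; in a real backend it would be something like `UpdateSubresource` on D3D11 or `glBufferSubData` on GL. The byte-wise comparison assumes the uniform struct has no padding bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One buffer that stays bound; each draw call overwrites it only when the data changed.
class SharedConstantBuffer {
public:
    explicit SharedConstantBuffer(std::size_t capacity) : shadow_(capacity, 0) {}

    // Returns true if an upload was needed, false if the contents were unchanged.
    bool Update(const void* data, std::size_t size) {
        assert(size <= shadow_.size());
        if (valid_ && size == lastSize_ &&
            std::memcmp(shadow_.data(), data, size) == 0) {
            return false;  // same bytes as last time: skip the upload
        }
        std::memcpy(shadow_.data(), data, size);
        lastSize_ = size;
        valid_ = true;
        ++uploads_;  // a real backend would upload to the GPU buffer here
        return true;
    }

    int Uploads() const { return uploads_; }

private:
    std::vector<std::uint8_t> shadow_;  // CPU-side copy of the last uploaded contents
    std::size_t lastSize_ = 0;
    bool valid_ = false;
    int uploads_ = 0;
};

// Example per-draw uniform data (floats only, so no padding bytes).
struct DrawUniforms {
    float mvp[16];
    float color[4];
};
```

Drawing the same object twice with identical uniforms then costs one upload, and only an actual change triggers a second one.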

floooh commented 9 years ago

Your thoughts are most welcome :)

floooh commented 9 years ago

Closing this; most of it is either implemented or obsolete because I now have a better idea of how Metal and D3D12 work.