floooh / sokol

minimal cross-platform standalone C headers
https://floooh.github.io/sokol-html5
zlib License

[sokol_gfx] Any plans to support base vertex location? #1033

Closed (jakubtomsu closed 2 months ago)

jakubtomsu commented 2 months ago

When using a single big vertex+index buffer to store multiple meshes, it would be useful if sg.draw also had a base vertex location parameter. This way it wouldn't be necessary to pre-process the index buffers and add the offsets manually. So essentially the GPU would load vertex from index_buffer[i] + vertex_offset.

This feature is supported in D3D11 and OpenGL 3.2+, however I'm not sure if all other APIs sokol targets support it. Maybe each backend could report an error if the base_vertex parameter is used and not supported. There could also be a way to query the support.

Another issue is that adding a new parameter to sg_draw would break existing code, which isn't very good considering how important this function is...
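For reference, the manual pre-processing step mentioned above could look roughly like this. It is a minimal sketch: `rebase_indices` is a hypothetical helper, not part of sokol_gfx, and it assumes 16-bit indices with `base_vertex` counted in vertices rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not a sokol_gfx function): rewrite a mesh's zero-based
// indices so they address its vertices inside a shared vertex buffer.
// base_vertex is the position of the mesh's first vertex, counted in vertices.
static void rebase_indices(uint16_t* indices, size_t num_indices, uint16_t base_vertex) {
    for (size_t i = 0; i < num_indices; i++) {
        // the rebased value must still fit into a 16-bit index
        assert((uint32_t)indices[i] + base_vertex <= UINT16_MAX);
        indices[i] = (uint16_t)(indices[i] + base_vertex);
    }
}
```

A base vertex parameter on the draw call would make this pass unnecessary, because the addition would happen during vertex fetch instead.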

jakubtomsu commented 2 months ago

I just noticed vertex_buffer_offsets from Bindings struct, I guess that does something similar? It would still be nice to have that info on a per-draw basis, but this could also be useful...

floooh commented 2 months ago

Yeah, the vertex buffer offset in sg_bindings is supposed to be the workaround for the missing base vertex (and base instance). Main reason is that WebGL2 and GLES up to 3.1 don't have the required functionality.

jakubtomsu commented 2 months ago

Right, that makes sense. But the main reason why base_vertex on a draw call makes sense in the first place is to make sure I don't have to apply_bindings every time, but only once and then draw all the meshes...

Anyway, I guess it's not gonna be supported anytime soon because of those older APIs. So I'm closing this for now

floooh commented 2 months ago

Yeah, having to 'rebind' is definitely a downside, even though internally there's usually a filter which skips redundant resource bindings. For instance, on Metal only the buffer offset is updated, not the entire buffer binding, when only the vertex_buffer_offset changes: https://github.com/floooh/sokol/blob/c2bb83f0b35e09d97a354b5f4cf4c3df783c4193/sokol_gfx.h#L12718

jakubtomsu commented 2 months ago

Right, that's cool. However I'm mostly concerned with D3D11, which seems to bind everything all the time at the moment... (unless the runtime/driver can skip it?)

I think I'll just do apply_bindings every time for now, the renderer uses instancing as much as possible so the number of draw calls should be low anyway. If I run into perf issues I could just write the hot inner loop with d3d11 directly by bypassing sokol

floooh commented 2 months ago

> However I'm mostly concerned with D3D11

...yeah I didn't specifically benchmark yet how much filtering redundant bindings would actually help in D3D11, but I really don't think it's a performance issue. First, D3D11 already uses 'batched' bindings calls (e.g. all bindings of one type and shader stage are applied with a single D3D11 call), and second, IME D3D11 is really well optimized internally, so if it helps I would expect that they do the redundant-bindings-filtering internally.

floooh commented 2 months ago

PS: what might make sense in D3D11 though is skipping the D3D11 binding calls by resource types... e.g. if you call sg_apply_bindings() with only the vertex offsets changed, don't do the other binding calls (IASetIndexBuffer, VSSetShaderResources, PSSetShaderResources, VSSetSamplers, PSSetSamplers).

...e.g. if you find that writing the hot inner loop directly in D3D11 is significantly faster because of those redundant bind calls, just holler and we can figure something out.
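The idea of skipping binding calls by resource type could be sketched as a comparison against the previously applied bindings. Everything here is hypothetical: `bindings_snapshot_t`, `dirty_groups_t` and `diff_bindings` are illustration-only names, the snapshot is a simplified stand-in for the real sg_bindings struct, and this is not how sokol_gfx.h is implemented.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Hypothetical, simplified snapshot of applied bindings. The real sg_bindings
// struct in sokol_gfx.h has more fields and uses handle types, not raw ids.
typedef struct {
    uint32_t vertex_buffer_ids[4];
    int      vertex_buffer_offsets[4];
    uint32_t index_buffer_id;
    int      index_buffer_offset;
    uint32_t image_ids[4];
    uint32_t sampler_ids[4];
} bindings_snapshot_t;

// One flag per group of D3D11 binding calls that would have to be issued.
typedef struct {
    bool vertex_buffers;  // IASetVertexBuffers
    bool index_buffer;    // IASetIndexBuffer
    bool images;          // VSSetShaderResources / PSSetShaderResources
    bool samplers;        // VSSetSamplers / PSSetSamplers
} dirty_groups_t;

// Compare the previous and the next bindings group by group, so that a caller
// can skip the D3D11 calls for every group that did not change.
static dirty_groups_t diff_bindings(const bindings_snapshot_t* prev,
                                    const bindings_snapshot_t* next) {
    dirty_groups_t d;
    d.vertex_buffers =
        memcmp(prev->vertex_buffer_ids, next->vertex_buffer_ids,
               sizeof(prev->vertex_buffer_ids)) != 0 ||
        memcmp(prev->vertex_buffer_offsets, next->vertex_buffer_offsets,
               sizeof(prev->vertex_buffer_offsets)) != 0;
    d.index_buffer =
        prev->index_buffer_id != next->index_buffer_id ||
        prev->index_buffer_offset != next->index_buffer_offset;
    d.images = memcmp(prev->image_ids, next->image_ids,
                      sizeof(prev->image_ids)) != 0;
    d.samplers = memcmp(prev->sampler_ids, next->sampler_ids,
                        sizeof(prev->sampler_ids)) != 0;
    return d;
}
```

With only a vertex buffer offset changed between two calls, only the `vertex_buffers` flag would be set, so the other four binding calls could be skipped. Whether that is measurably faster on D3D11 is exactly the open question in this thread.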

jakubtomsu commented 2 months ago

Yep I'll let you know. Also I just found out the vertex_buffer_offsets don't seem to work with indexed meshes (at least on D3D11), which is unfortunate...

floooh commented 2 months ago

> Also I just found out the vertex_buffer_offsets don't seem to work with indexed meshes (at least on D3D11), which is unfortunate...

Hmm, it should. This sample is using vertex_buffer_offsets and indices:

https://github.com/floooh/sokol-samples/blob/master/sapp/bufferoffsets-sapp.c

...as well as this:

https://github.com/floooh/sokol-samples/blob/master/sapp/noninterleaved-sapp.c

Also the sokol_imgui.h header is using both vertex_buffer_offsets and index_buffer_offset:

https://github.com/floooh/sokol/blob/c2bb83f0b35e09d97a354b5f4cf4c3df783c4193/util/sokol_imgui.h#L2682-L2684

Your index ranges for a 'submesh' need to be zero-based. E.g. an index 0 points to the first vertex at the vertex_buffer_offset, not to the first vertex in the vertex buffer. That's about the only thing I can think of which might go wrong.

Of course if you just want to render different index ranges of the same mesh, and all indices are "absolute" (e.g. index 0 always points to the first vertex in the vertex buffer), you can just offset the draw call with the base_element parameter of sg_draw(), and you don't need to mess with vertex_buffer_offsets or index_buffer_offset.
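The "absolute indices" case can be modelled on the CPU as follows. This is an illustration only: `resolve_absolute` is a hypothetical name, and the function merely mimics which vertex the i-th index of a draw call refers to when base_element selects a sub-range of the index buffer.

```c
#include <assert.h>
#include <stdint.h>

// CPU-side model (not sokol code): with absolute indices, base_element only
// selects where in the index buffer the draw call starts reading. The index
// value itself already addresses the vertex in the shared vertex buffer.
static uint16_t resolve_absolute(const uint16_t* indices, int base_element, int i) {
    assert(base_element >= 0 && i >= 0);
    return indices[base_element + i];
}
```

For example, with a triangle (vertices 0..2) followed by a quad (vertices 3..6) in one vertex buffer and the index buffer `{0,1,2, 3,4,5, 3,5,6}`, a draw with base_element 3 and 6 elements reads only the quad's indices, and no buffer offsets are involved.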

floooh commented 2 months ago

PS:

> So essentially the GPU would load vertex from index_buffer[i] + vertex_offset.

That's what's happening, except that the buffer offsets are counted in bytes, not in 'number of vertices'. Maybe that's the problem?

E.g. note how I'm multiplying the offsets here by the size of a vertex and size of an index:

https://github.com/floooh/sokol-samples/blob/2a74ef5905fe7841eaae6a70248d45ebed275fcc/sapp/bufferoffsets-sapp.c#L87-L88

...note that I could also just keep the index_buffer_offset at 0, and instead change the second draw call like this:

sg_draw(3, 6, 1);

jakubtomsu commented 2 months ago

Right, I understand the offset is in bytes and the vertex indices are zero based, so it looks like I was just doing something wrong. Thanks for the examples!
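The two rules confirmed above (offsets are in bytes, submesh indices are zero-based) can be sketched as a small CPU-side model. The `vertex_t` layout and the names `byte_offset` and `fetch_vertex` are assumptions made for this illustration, not sokol_gfx API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Assumed vertex layout for this illustration: 2D position plus RGB color.
typedef struct { float x, y, r, g, b; } vertex_t;

// Byte offset of element number first_element in a tightly packed buffer.
static int byte_offset(int first_element, size_t element_size) {
    return (int)((size_t)first_element * element_size);
}

// CPU-side model of the fetch: read the i-th index starting at the index
// buffer's byte offset, then read the vertex it names starting at the vertex
// buffer's byte offset. The index is relative to the offset (zero-based).
static vertex_t fetch_vertex(const void* vbuf, int vb_offset,
                             const void* ibuf, int ib_offset, int i) {
    uint16_t index;
    memcpy(&index,
           (const uint8_t*)ibuf + ib_offset + (size_t)i * sizeof(uint16_t),
           sizeof(index));
    vertex_t v;
    memcpy(&v,
           (const uint8_t*)vbuf + vb_offset + (size_t)index * sizeof(vertex_t),
           sizeof(v));
    return v;
}
```

For a triangle (3 vertices, 3 indices) followed by a quad in the same buffers, the quad's offsets would be `byte_offset(3, sizeof(vertex_t))` and `byte_offset(3, sizeof(uint16_t))`, and its index 0 resolves to the fourth vertex in the buffer.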

floooh commented 2 months ago

If you still discover a bug with the buffer offsets, let me know!