slime73 / love-experiments

Experimental and work-in-progress changes for https://github.com/love2d/love

Modern Graphics API Thoughts #1

Closed bjornbytes closed 2 years ago

bjornbytes commented 5 years ago

High level notes/findings on what a modern LÖVEly graphics API could look like.

bjornbytes commented 5 years ago

I'm looking at Vulkan and WebGPU as APIs to target. Both of these have some features in common:

Inside a command buffer, you can "enter" a render pass, bind a pipeline, bind buffers, and then submit draws.

Right now I'm most interested in the pipeline objects, and I'm still a bit confused about render pass objects (gonna talk about those later).

Pipelines store a lot of state, approximately:

In an ideal world, you're supposed to create all the pipelines you want to use upfront and then switch between them at runtime. For LÖVE/LÖVR, we have a highly scriptable immediate-mode API right now, so it isn't really feasible to have lovers specify all of the rendering details upfront. It seems like most applications in this situation do "last-minute" pipeline resolves at draw time, with plenty of hashing and caching to keep things fast (this blog post outlines this a bit). I'm aiming to use this technique, at least at first to keep things flexible.
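A minimal sketch of what a draw-time pipeline resolve with caching could look like, written in plain Lua so it runs without a GPU. Everything here is hypothetical: the state fields, `stateKey`, `resolvePipeline`, and `createPipeline` (a stand-in for the expensive backend call such as `vkCreateGraphicsPipelines`) are illustration only, not the actual LÖVE/LÖVR implementation.

```lua
-- Hypothetical sketch: resolve a pipeline from the current draw state at draw time.
local cache = {}    -- state key -> pipeline
local created = 0   -- counts how many pipelines were actually built

-- Stand-in for the expensive backend pipeline creation.
local function createPipeline(state)
  created = created + 1
  return { id = created, shader = state.shader }
end

-- Fields that affect the pipeline (assumed for this sketch). The order is
-- fixed so that equal states always produce equal keys.
local fields = { 'shader', 'blendMode', 'cullMode', 'depthTest', 'drawMode' }

local function stateKey(state)
  local parts = {}
  for i, name in ipairs(fields) do
    parts[i] = tostring(state[name])
  end
  return table.concat(parts, '|')
end

-- Called right before a draw: reuse a cached pipeline or build one on a miss.
local function resolvePipeline(state)
  local key = stateKey(state)
  local pipeline = cache[key]
  if not pipeline then
    pipeline = createPipeline(state)
    cache[key] = pipeline
  end
  return pipeline
end
```

With this shape, repeated draws with the same state only pay for a string build and a table lookup. A real implementation would more likely hash a packed C struct than concatenate strings.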

I'm planning on making the following API changes:

Sorry if this is all still a bit disorganized, still learning and organizing thoughts!

bjornbytes commented 5 years ago

Threading: Vulkan drivers aren't threaded like OpenGL drivers are, leaving it up to the application. I can think of two different ways of taking better advantage of multithreaded rendering:

slime73 commented 5 years ago

Here are some of my own thoughts / where my head is at right now. A lot of it matches up well with your notes, I think.

bjornbytes commented 5 years ago

I'm slowly starting to come around to the "render pass" object idea. I was originally uneasy because it doesn't map directly onto one of the GPU concepts, but I realized that it's a really approachable/convenient way to structure a game.

The thing that helped it click for me was thinking about them as "layers". Like how in GDC postmortems or rendering breakdowns, they always present each layer (pass) of the frame individually and layer them on top of each other to get the final result. So for a simple game you might have your static terrain/tile layer, a layer for characters/enemies, and a layer for the UI. Usually each of these is a separate render pass with its own state/objects, so if the LÖV API presented something that let people express that, it would be a pretty big win. Even if it doesn't map directly onto a pipeline/renderpass, it still makes it way easier for the underlying LÖV implementation to do so.

I'm not sure, but it seems like your RenderPass is going to store a list of commands in memory and then serialize them to the command buffer/encoder at execute time. I'm going to try something different and record the commands to the API directly, so that I don't need to store them in additional memory and can reduce overhead a bit. It seems like there are several reasons why this won't work, but I'm still going to try.

Somewhat related -- it could be confusing to have object state only used when passes are executed. The (current) alternative is to sprinkle flushes all over the place, which makes the API nicer but the implementation more annoying. I'm curious if that's still possible in the render pass setup or if it would be prohibitively expensive/complicated. Hmm.

After more research I understand why prerecorded command buffers might not be as necessary as I thought -- modern APIs are way faster at enqueueing draw calls than OpenGL, so it isn't a big deal to do that over and over again. There are still 2 reasons I might be interested in it: A) reducing the Lua-C overhead of enqueueing large numbers of unchanging draws, and B) optimizations that can be done when the set of draws is known (culling, sorting; more relevant for 3D, but I kinda lean towards pushing this to Lua anyway since it's so app-specific).

1 sampler per texture seems like a good approach. It can't really be worse than whatever is going on in OpenGL today. It looks like Vulkan drivers still do caching of samplers anyway. Maybe the lov.graphics default filter can be a global "cached" sampler.

I'm trying to get other work out of the way so I can focus more on implementing this stuff!

bjornbytes commented 4 years ago

Finally started laying the groundwork for this on a branch if you're interested in lurking:

https://github.com/bjornbytes/lovr/compare/gpu

Really just Vulkan boilerplate at this point.

bjornbytes commented 3 years ago

Finally worked up the masochism to start working on this stuff again.

Implemented this API for Texture views:

lov.graphics.newTexture(texture, TextureType, firstLayer, layerCount, firstMip, mipCount)

The layer/mipmap stuff is optional. Could also make it a newTextureView function instead of further complicating newTexture.

I haven't tried using it yet but it may end up feeling nicer than passing around { texture, layer, mipmap } tables for texture attachments. It matches the modern APIs better and allows for more powerful stuff (texture type reinterpretation, maybe depth/stencil view stuff or swizzling in the future?).

EDIT: Also added Texture:newView(type, layer, count, level, count).

bjornbytes commented 3 years ago

The pass / command buffer API I'm going to try out is a lov.graphics.render function with two variants. The first one is:

lov.graphics.render(target, function() end)

target is the usual setCanvas table describing the attachments, load/store ops, etc.

This one is like Canvas:renderTo. It begins a (cached) render pass, calls the callback containing regular lov.graphics draw calls, and finishes the pass. Any graphics state/bindings set in the callback is temporary to the callback.

The second variant is for multithreading:

lov.graphics.render(target, ...batchnames)

You pass in names of prerecorded batches you want to replay. Batches are (secondary) command buffers that can be recorded concurrently. There is a lov.graphics.record function for this:

lov.graphics.record(target, 'nickname', function() end)

You pass in the target you're going to replay on (sadly this is needed for vulkan/webgpu), a name to use for later replays, and a callback similar to the first variant. The batches are temporary and can only be submitted in the same frame they're recorded. The names are used instead of regular userdata to make it easier to use them between threads, avoid GC, and because there's just not a lot of benefit to retaining them since they're temporary.
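A rough mock of the record/replay flow described above, in plain Lua with no GPU involved. It is only a sketch under stated assumptions: `record`, `render`, and `endFrame` are local stand-ins rather than the real `lov.graphics` functions, and the callback receives a hypothetical `emit` function because the mock cannot capture real graphics calls.

```lua
-- Hypothetical mock: named batches that only live for one frame.
local batches = {}    -- name -> { target = ..., commands = { ... } }
local submitted = {}  -- commands replayed so far this frame, in order

-- Record a batch against a target under a nickname.
local function record(target, name, callback)
  local commands = {}
  -- In this mock the callback gets an emit function; the real API would
  -- capture regular graphics calls made inside the callback instead.
  callback(function(command) commands[#commands + 1] = command end)
  batches[name] = { target = target, commands = commands }
end

-- Replay named batches on a target, in the order the names are given.
local function render(target, ...)
  for i = 1, select('#', ...) do
    local name = select(i, ...)
    local batch = assert(batches[name], 'unknown batch: ' .. tostring(name))
    assert(batch.target == target, 'batch was recorded for a different target')
    for _, command in ipairs(batch.commands) do
      submitted[#submitted + 1] = command
    end
  end
end

-- Batches are temporary: drop them all at the end of the frame.
local function endFrame()
  batches = {}
end
```

Using string names as the handle means a batch recorded on one thread can be referenced from another without sharing userdata, which is the property the paragraph above is after.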

(I want to explore more persistent batch objects later, but those are wayyy more challenging. They'd at least need to refcount all resources they use and potentially keep around copies of all the temporary matrices/uniforms).

One thing I like about this is that there are fewer breaking changes to the graphics module. A lot of code that is just setting state and drawing primitives/Drawables in lov.draw will continue to work. That wasn't the case when I was considering Batch/Pass objects.

I decided against the in-memory representation for the command buffers, at least for now. It has some benefits (you can sort/cull/reorder the draws, inspect/serialize the commands), but I really like the low-level approach where your graphics functions in Lua immediately hit GPU command buffers.

One kind of cool thing is that boot.lua can do lov.graphics.render(windowTarget, lov.draw). It might end up being more complicated than that if people want to do other passes in the draw callback or submit batches instead. Maybe just a conf.lua flag though.

I'll report back on how it goes, I have to reorganize a bunch of command buffer/pass/framebuffer stuff first, may run into issues.

EDIT: Mostly dropped this due to design flaws. It almost worked, but in the end wrapping it in a Canvas / Pass object is preferable because you can be recording multiple passes at once and it avoids some clashes with global state. It's also just more lovely. So I am fully on board with Pass objects even though I was somewhat against them at first. I still have a function lovr.graphics.renderTo(textures|canvastable, function(canvas) end) for doing temporary render passes.

bjornbytes commented 3 years ago

Added depth bias and depth clamp states. Not really anything special.

Considering making blend modes and color masks per-target instead of global.

Current idea is for setBlendMode and setColorMask to take an optional target index, and if it's missing it applies to all targets (backwards compatible):

lov.graphics.setBlendMode('add') -- applies to all targets
lov.graphics.setBlendMode(1, 'add') -- only applies to first target

I'm not sure how the getters should work. They could either take an optional target index that defaults to 1, or they could return everything if the target is missing. It might be weird to have getColorMask() return 16 booleans...
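A small sketch of how the optional target index could be dispatched, plus the first getter option (index defaulting to 1). This is plain Lua with local stand-in functions, not the real `lov.graphics` module; `MAX_TARGETS` and the default mode `'alpha'` are assumptions made for the example.

```lua
-- Hypothetical per-target blend state.
local MAX_TARGETS = 4  -- assumed limit for this sketch
local blendModes = {}
for i = 1, MAX_TARGETS do blendModes[i] = 'alpha' end

-- setBlendMode(mode)        applies to all targets
-- setBlendMode(index, mode) applies to one target
local function setBlendMode(a, b)
  if type(a) == 'number' then
    assert(a >= 1 and a <= MAX_TARGETS, 'invalid target index')
    blendModes[a] = b
  else
    for i = 1, MAX_TARGETS do blendModes[i] = a end
  end
end

-- Getter option 1: optional target index that defaults to 1.
local function getBlendMode(index)
  return blendModes[index or 1]
end
```

The dispatch works because blend modes are strings and target indices are numbers, so the first argument's type is enough to tell the two call forms apart.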

bjornbytes commented 3 years ago

Here are the 3 types of buffers now (somehow they ended up matching opengl's roughly)

dynamic/transient are double buffered, and cannot have storage usage

I can't imagine metal needs to worry about any of this...

EDIT: Actually dropped the 3 buffer types thing. Instead I'm using usage flags to detect what type of buffer memory to use (write flag says whether you want to write to it from CPU, transient (TBD) flag says whether it's okay to discard contents at the beginning of a frame).
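One way the flag-based selection could look, as a minimal sketch. The flag names `write` and `transient` come from the note above; the returned memory type names (`'gpu'`, `'cpu-write'`, `'stream'`) and the `chooseMemory` helper are made up for illustration, and the transient behavior is marked TBD in the note, so treat that branch as a guess.

```lua
-- Hypothetical: pick a kind of buffer memory from a list of usage flags.
local function chooseMemory(flags)
  local set = {}
  for _, flag in ipairs(flags) do set[flag] = true end

  if set.transient then
    -- Contents may be discarded at the start of a frame.
    return 'stream'
  elseif set.write then
    -- The CPU writes to it, so it needs host-visible memory.
    return 'cpu-write'
  end
  -- Never written from the CPU after creation: device-local memory.
  return 'gpu'
end
```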

bjornbytes commented 3 years ago

Considering only having 3 draw modes for the 'raw' drawing functionality like Mesh: points, lines, triangles. This seems to be more in line with how D3D12/Metal do things. There will still be a primitive called line that will draw a line strip, but internally it will just use the lines draw mode plus an index buffer (unsure of performance caveats here).
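For reference, a sketch of the index generation that turns a line strip into a line list. `lineStripIndices` is a hypothetical helper name, and the indices are 1-based here on the assumption that they stay on the Lua side; a backend would typically subtract 1 before uploading.

```lua
-- Build line-list indices for a strip of vertexCount vertices.
-- Each consecutive pair of vertices becomes one line segment.
local function lineStripIndices(vertexCount)
  local indices = {}
  for i = 1, vertexCount - 1 do
    indices[#indices + 1] = i
    indices[#indices + 1] = i + 1
  end
  return indices
end

-- A 4-vertex strip yields 3 segments: 1-2, 2-3, 3-4.
local indices = lineStripIndices(4)
```

A strip of n vertices needs 2 * (n - 1) indices, so the index buffer is roughly twice the vertex count.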

Mm I guess love doesn't need to worry about lines as much since they're already polylines.

bjornbytes commented 2 years ago

Here is my API for queries:

bjornbytes commented 2 years ago

I just merged my Vulkan branch, so I won't be in brainstorming/API design mode as much. This issue was fun to have as a diary