john-chapman / im3d

Immediate mode rendering and 3d gizmos.
MIT License

Direct access to draw data outside of callback. #33

Closed KenzieMac130 closed 6 years ago

KenzieMac130 commented 6 years ago

Hello, I've been enjoying this library immensely and recommended it to multiple people for use in their own projects a little while back. It is a really useful self-contained library. In a reply to my tweet, I also described how I would go about getting the project working with newer deferred APIs. I feel like I did a pretty lousy job of explaining myself and why I thought the limited lifespan of draw data presented issues, so I thought I would elaborate on my points here.

The current examples suggest using this process in the callback:

Callback

  1. Setup Viewport/Scissors
  2. Upload array to Vertex Buffer
  3. Set Shaders/Push Constants
  4. Draw

This works well if Draw executes immediately, but in a more modern API like Vulkan, Draw basically just adds the command to a TODO list. Each later upload would then overwrite the vertex data needed by all of the earlier draw commands. To avoid that, you would have to manually execute all of the commands recorded so far before each upload, which would incur a massive performance penalty.
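The overwrite hazard can be reproduced without any graphics API. The following is a minimal sketch, assuming a hypothetical `CommandList` type that merely stores closures and runs them on `submit()`; it is not Vulkan or im3d code, and the integers stand in for vertices.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a deferred API: "draws" are only recorded here
// and run later, similar to how a command buffer is submitted after recording.
struct CommandList {
    std::vector<std::function<void()>> commands;
    void record(std::function<void()> cmd) { commands.push_back(std::move(cmd)); }
    void submit() { for (auto& cmd : commands) cmd(); }
};

// Records two draws that both read from offset 0 of one shared vertex buffer
// and returns the first "vertex" each draw saw when it finally executed.
std::vector<int> drawWithSharedBuffer() {
    std::vector<int> vertexBuffer(4, 0);
    std::vector<int> seen;
    CommandList cmds;

    const std::vector<std::vector<int>> drawLists = {{1, 1}, {2, 2}};
    for (const auto& list : drawLists) {
        assert(list.size() <= vertexBuffer.size());
        // The "upload" happens immediately and overwrites the previous list.
        std::copy(list.begin(), list.end(), vertexBuffer.begin());
        // The "draw" is deferred: it reads the buffer only at submit time.
        cmds.record([&vertexBuffer, &seen] { seen.push_back(vertexBuffer[0]); });
    }
    cmds.submit();
    return seen; // both draws observe the second list's data
}
```

Calling `drawWithSharedBuffer()` yields `{2, 2}` rather than `{1, 2}`: by the time the first draw runs, its data has already been replaced.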

So you would probably want to make sure all of the vertex data lasts until the end of all render commands. To do this you could load all of the data into a global vertex buffer. Here is what your callback would do (this is how my program deals with Im3d draw data at the moment):

Callback

  1. Setup Viewport/Scissors
  2. Reallocate Vertex Buffer to fit additional vertex data if necessary (Executes Immediately)
  3. Upload array to the END of the Vertex Buffer (Executes Immediately)
  4. Set Shaders/Push Constants
  5. Issue Draw Command with END as an offset into the buffer
  6. Offset the END of the Vertex buffer by the number of verts to draw.

Now this works but is hardly efficient: the vertex buffer has to be reallocated whenever there is not enough room, and since that check runs every time the callback is issued, the resizing is incredibly wasteful. Keep in mind you don't need to resize every time, or even every frame, if you reset the END of the vertex buffer back to the beginning and simply overwrite the last frame's data. But if one of your earlier vertex arrays demands more space, that can have a ripple effect on the rest of them, resulting in a TON of unnecessary allocations and copies of the entire global vertex buffer.
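As a rough illustration of the per-callback approach, here is a CPU-side sketch. `GlobalVertexBuffer`, `append` and `beginFrame` are hypothetical names, and a `std::vector` stands in for the GPU allocation; a real backend would create a new buffer and copy the old contents where this code calls `resize`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical CPU-side model of a global vertex buffer that grows on demand.
struct GlobalVertexBuffer {
    std::vector<Vertex> storage;   // stands in for the GPU allocation
    std::size_t end = 0;           // first free vertex slot this frame
    std::size_t reallocations = 0; // how often the buffer had to grow (and copy)

    // Reuse last frame's space by moving END back to the beginning.
    void beginFrame() { end = 0; }

    // Copies `count` vertices to END and returns the offset to draw from.
    std::size_t append(const Vertex* data, std::size_t count) {
        assert(data != nullptr || count == 0);
        if (end + count > storage.size()) {
            storage.resize(end + count); // grows and copies the existing contents
            ++reallocations;
        }
        std::copy(data, data + count, storage.data() + end);
        const std::size_t offset = end;
        end += count;
        return offset;
    }
};
```

In this model, a first frame that appends three arrays grows the buffer three times, once per callback; a second frame with the same sizes reuses the space and grows it zero times.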

Here is where the lack of persistent/accessible draw data outside of the callback is an issue.

This is a way we could cut down on all of those allocations and copies (how my application deals with ImGui draw data):

Do Once

  1. Get a sum of the number of vertices in all of the vertex arrays in the draw lists.
  2. Recreate the global vertex buffer only if necessary.

For Each Draw List

  1. Upload array to the END of the Vertex Buffer (Executes Immediately)
  2. Setup Viewport/Scissors
  3. Set Shaders/Push Constants
  4. Issue Draw Command with END as an offset into the buffer
  5. Offset the END of the Vertex buffer by the number of verts to draw.

This should perform a whole lot better when the number of vertices in the scene is changing, since we would only need to resize once up front (and since we don't care about the contents before the resize, we wouldn't need to copy the data to the new, larger buffer).
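The "Do Once" and "For Each Draw List" steps above could look roughly like this. It is a sketch under assumptions: `DrawList`, `DrawCommand` and `uploadAll` are hypothetical names, a `std::vector` stands in for the GPU buffer, and the returned commands stand in for recorded draw calls (viewport and shader setup are omitted).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical stand-in for a draw list whose data outlives the callback.
struct DrawList {
    const Vertex* vertices;
    std::size_t   vertexCount;
};

struct DrawCommand { std::size_t firstVertex, vertexCount; };

// Sizes the buffer once from the summed vertex count, then uploads each list
// at a running offset. Returns the draw commands that were "recorded".
std::vector<DrawCommand> uploadAll(const std::vector<DrawList>& lists,
                                   std::vector<Vertex>& vertexBuffer) {
    // Do once: sum the vertex counts of all draw lists.
    std::size_t total = 0;
    for (const DrawList& list : lists) total += list.vertexCount;

    // Recreate only if too small. The old contents are not needed, so a real
    // backend could allocate a fresh buffer here without copying anything.
    if (vertexBuffer.size() < total) vertexBuffer = std::vector<Vertex>(total);

    // For each draw list: upload at END, record the draw, advance END.
    std::vector<DrawCommand> commands;
    std::size_t end = 0;
    for (const DrawList& list : lists) {
        std::copy(list.vertices, list.vertices + list.vertexCount,
                  vertexBuffer.data() + end);
        commands.push_back({end, list.vertexCount});
        end += list.vertexCount;
    }
    assert(end == total);
    return commands;
}
```

The design point is that the size check moves out of the per-list loop, so at most one allocation happens per frame regardless of how many draw lists there are.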

Currently, however, we can't do this since we don't know the contents or even the size of any of the vertex data outside of a one-time callback. By the time the callback is over, that data would be gone, and before the render function is called, that data probably isn't there yet.

So I hope this scenario demonstrates the user's possible need to go around the callback function. Thank you for the amazing work!

john-chapman commented 6 years ago

Hi - thanks for clarifying the issue! I think I understand better now: all you really want is to know the total number of vertices that will be drawn before the callbacks happen, so that you can (re)allocate your vertex buffer up front.

Actually this is already possible but it's a sort of 'backdoor' in the API and looks hacky:

U32 totalVertices  = 0;
totalVertices += GetContext().getPrimitiveCount(DrawPrimitive_Triangles) * 3;
totalVertices += GetContext().getPrimitiveCount(DrawPrimitive_Lines)     * 2;
totalVertices += GetContext().getPrimitiveCount(DrawPrimitive_Points)    * 1;

U32 allocationSize = totalVertices * sizeof(VertexData);

Of course, this information is only accurate at the time you call Draw() - if you do any Im3d:: calls between allocating the buffer and calling Draw() (or MergeContexts() for a multi-context setup), your buffer will be the wrong size!

I'm thinking about ways to enforce this ordering constraint via the API. The best solution I can think of is to have the callback called only once, so that the app can consume all of the pending draw lists at once and have global info about the frame, e.g.:

void Im3d_Draw(const Im3d::DrawList _drawList[], int _drawListCount)
{
    // loop over all the draw lists and count vertices
    // (re)allocate resources
    // issue draw calls
}

Right now I don't see any negative implications of doing it this way (I can still support the old-style callback) and it maintains the single point of contact between the app and the draw data (which I think simplifies things a lot).

Any additional thoughts or ideas are appreciated!

KenzieMac130 commented 6 years ago

Thank you very much for the response and recommendation on a workaround.

The callback with a list of commands seems like a much better way to approach it.

If I may bring in another library for an example for a moment, ImGui has made the render callback function obsolete (still available by default though) and instead recommends the user call a function GetDrawData() when they need access to draw lists from the library.

In the case where users want to distribute the draw call recording/buffer management workload between threads, this method of direct access would be the best way to do it: it only requires the threads to sync with the get-list function, instead of having to be invoked or wrangled by a callback.

john-chapman commented 6 years ago

Ok, so having 'direct' access to the draw data is possible but will require a couple of changes to the API. Right now, calling Draw() finalizes the draw data internally (really this just means doing the sorting), and then calls the app callback. This finalization step would still need to happen; however, I'd probably move it to an EndFrame() function. After calling this, the draw lists would be available until you call NewFrame() again.
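To make the proposed lifetime rule concrete, here is a small sketch of a context whose draw lists are readable only between the end of one frame and the start of the next. `MiniContext`, its methods, and the `layer` sort key are all hypothetical and do not reflect im3d's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical draw list record; `layer` is an assumed sort key.
struct DrawList { int layer; std::size_t vertexCount; };

// Hypothetical miniature context illustrating the frame lifetime only.
class MiniContext {
public:
    // Starts a new frame and invalidates the previous frame's draw lists.
    void newFrame() { lists_.clear(); finalized_ = false; }

    void push(int layer, std::size_t vertexCount) {
        assert(!finalized_ && "push() is only valid before endFrame()");
        lists_.push_back({layer, vertexCount});
    }

    // Finalization step: here it is just a stable sort by layer.
    void endFrame() {
        std::stable_sort(lists_.begin(), lists_.end(),
                         [](const DrawList& a, const DrawList& b) {
                             return a.layer < b.layer;
                         });
        finalized_ = true;
    }

    // Valid from endFrame() until the next newFrame().
    const std::vector<DrawList>& drawLists() const {
        assert(finalized_ && "call endFrame() first");
        return lists_;
    }

private:
    std::vector<DrawList> lists_;
    bool finalized_ = false;
};
```

With this shape, the application can inspect every draw list (for example, to sum vertex counts) after `endFrame()` and before issuing any draw commands, without going through a callback.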

I think I'm happy with doing it this way, rather than adding a new type of callback. I'll therefore mark this issue as 'todo' and put it at the top of my list. Thanks again for your input!

john-chapman commented 6 years ago

The new API is now available in v1.13. The examples have been updated to use the new API (the draw callback is marked deprecated but I'll probably not remove it any time soon).

Thanks again for your valuable input!