Remove implicit frame cleanup

Nuckal777 commented 2 years ago

Currently glyph_brush assumes that all text to be rendered in a frame is queued and then drawn at once with a single draw call. The different renderer implementations provide these draw methods, which call GlyphBrush::process_queued() under the hood, which currently implicitly maintains the cache. Using one draw call for all text in a single frame may be unfeasible, e.g. rendering text with transparency and proper z-ordering or using multiple scissoring regions, which causes many cache cleanups and misses. By allowing the user to manually control the cache cleanup, that issue can be avoid.

Resolves #128.

alexheretic commented 2 years ago

Thanks for the pr. I'm away at the moment so won't be able to review for a little while.

Nuckal777 commented 1 year ago

Any updates?

alexheretic commented 1 year ago

My holiday is going well thanks ;)

On Fri, 14 Oct 2022, 12:52 Erik Schubert, @.***> wrote:

Any updates?

— Reply to this email directly, view it on GitHub https://github.com/alexheretic/glyph-brush/pull/156#issuecomment-1279244597, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARZHVZWJ2C2RDDDLI4MLW3WDGFU3ANCNFSM6AAAAAAQ22QOEQ . You are receiving this because you commented.Message ID: @.***>

alexheretic commented 1 year ago

I'm back, thanks for waiting. My first thought for this is that I don't want to break current behaviour, but perhaps adding this in a non-breaking way would be ok. Like a new method process_queued_no_cache_cleanup or a new option auto_cache_cleanup that can be disabled (along with pubbing the cleanup_frame fn).

However, thinking about it more I'm less sure this is the way to do multiple draws in a frame. Lets take a 2 draw example:

queue section(s)
process_queued (without auto cleanup)
queue section(s)
process_queued (without auto cleanup)
cleanup_frame

Issues:

(2) & (4) will never return BrushAction::ReDraw since the last_draw will always be different to the next. This means "gpu draw caching" / fast re-drawing (where we avoid all layout & vertex calculation for identical frames) will not work at all. This is actually the most common case, where the frame's text hasn't changed from the last, so it provides a huge benefit.
Not cleaning the section_buffer at (2) means at (4) we'll re-add the (1)-sections vertices along with the (3)-sections vertices to the Draw(vertices) vec. So the earlier queued sections will be drawn twice (or more generally multiple times until cleanup).

The 2nd point can be addressed with code making the struct multi-draw aware. The 1st point not so much & I think one of the key advantages of this crate is fast re-drawing.

Alternative

I did require multiple text draws when I made Robo Instructus. I used multiple GlyphBrushes, one for each draw. This means each draw has it's own cache & gpu pipeline. That allows, when the text is the same as the last frame, a simple re-draw without any texture or vertex upload to the gpu. If one pipeline changes the others can still fast re-draw unaffected.

The downside is they can't share a single glyph rasterization DrawCache, since that is within each GlyphBrush. For the other caches contained within (like layout & vertex caches) we wouldn't expect to be useful between different pipelines anyway so having multiple is fine. Overall I found the benefits of fast re-draws out way the downside of having multiple texture draw caches (particularly if the different pipes use different fonts/sizes anyway).

One thing that could improve this to allow multiple GlyphBrush instances to share an external DrawCache texture. I have considered this before, but it would require some reworking as the DrawCache is not currently designed for sharing.

So overall, perhaps you could try using a GlyphBrush instance for each draw inside a frame as I did? And if this doesn't work for you, can you help me understand why not?

Nuckal777 commented 1 year ago

Hey :wave:, thanks for the super detailed reply. Appreciate it.

I had the misconception that the GlyphBrush type would be the "super fat, expensive to construct, keep all state around" struct. Maybe that's the cause for the plenty issues about multiple draws per frame. At least the fact of it being able to store glyphs for multiple fonts and the whole queue sections and draw once, gave me that impression. The docs also give me that vibe.

When thinking about one GlyphBrush per "text" that is supposed to change together with the brush, the whole things makes more sense. E.g. each brush having it's own texture and vertex buffer for glyphs, so that updates to a different "text" do not impact the current one. Depending on the renderer structure many levels above that may require to maintain sort of a HashMap<Section, GlyphBrush>, if sections come from somewhere else and are not stored otherwise (That's not the full story, but the basic idea). So the "keep all state around" is expected by the user.

I guess the takeaway is, that it will be beneficial to mention somewhere that GlyphBrushes aren't super expensive structs, but are rather light and there can be easily be some hundreds of them. Maybe you can polish what you put here and put it the readme or doc string as "How it works"?

alexheretic commented 1 year ago

Yeah I think you're right that the docs could be improved! Perhaps a section on multi-draw per frame something like this discussion?

GlyphBrush isn't expensive to construct, it's really the using it which is expensive.

Of course it's still better to use as few GlyphBrush/pipelines/draws as possible per frame. But that's generally true of draws anyway. So I wouldn't go as far as "one GlyphBrush per "text"", rather one GlyphBrush per draw & to multi-draw/multi-GlyphBrush only when necessary.

Nuckal777 commented 1 year ago

May I ask where you think the expensive part of using multiple GlyphBrushs is (assuming the sections one queues do not change and each brush gets the same sections as in the previous frame)? The code paths I looked at (process_queued() and queue_custom_layout()) have a fast path based on the section hashes.

Certainly, the texture usage is somewhat inefficient. The content of vertex buffers have to be allocated anyway. The GPU pipeline should be sharable potentially (for the same TextureFormat, etc.). There will be more draw calls.

Right now my understanding is like:

plenty of GlyphBrushs do not directly cripple performance
- I use wgpu_glyph and it would create a wgpu::RenderPass for each brush, which is expensive and adds up fast. Though that's a wgpu_glyph issue.
less brushes are better due to the reasons outlined above

alexheretic commented 1 year ago

Yeah that sounds about right to me. Re-draws are very fast so having multiple draws with most re-drawing should also be fast.

So I think the only real issue is each pipeline having it's own draw cache texture, and as I said before in some cases the pipelines would use different fonts/sizes anyway so even that isn't too bad.

It's still generally true in graphics APIs that if you can bundle more things into less draws it's better. So use as many GlyphBrushes as you need, but using as few as possible will keep draw counts down. Re-draws are fast but it's still better to have fewer of them.

alexheretic commented 1 year ago

I've opened #157 & #158 issues to track the next steps here. Lets follow them up there and, of course, let me know if that's missing anything!

alexheretic / glyph-brush

Remove implicit frame cleanup #156

Alternative