Use Queue instead of StagingBelt for buffer/texture writes.

hecrj / wgpu_glyph

A fast text renderer for wgpu (https://github.com/gfx-rs/wgpu)

https://docs.rs/wgpu_glyph

MIT License

450 stars 78 forks source link

Use Queue instead of StagingBelt for buffer/texture writes. #69

Closed profan closed 2 years ago

profan commented 3 years ago

I was having a situation where StagingBelt was leaking a tremendous amount of memory (like 10MB/second given my use of wgpu_glyph, so I took the liberty of swapping out StagingBelt use for Queue), it seems you may already have had the same idea given the comment inside cache.rs?

I also fixed my issue with the cache being corrupted (it was me leaving in half the buffer use that wasn't necessary anymore, order of writes to queue vs encoder use wasn't what I expected), the only thing which might be a little weird is the now local Vec kept in cache.rs to hold the padded data and the use of unsafe there for set_len, but here it is anyways, ready for review 👀

Mubelotix commented 3 years ago

This is a great idea but aren't StagingBelt supposed to be more efficient? Did you open an issue?

profan commented 3 years ago

This is a great idea but aren't StagingBelt supposed to be more efficient? Did you open an issue?

StagingBelt currently leaks a dramatic amount of memory and from what I've heard from some of the people working on wgpu-rs side things, it seems while it technically does avoid a copy, it's easier API-side to "do the right" thing apparently efficiency wise if you just use queue's write_buffer directly?

That said.. no I haven't opened an issue but this exists: https://github.com/gfx-rs/wgpu/issues/1441 which is just kind of like.. unresolved, also referenced in the iced repo here: https://github.com/hecrj/iced/issues/786#issuecomment-835347214

I'd assumed given this comment: https://github.com/hecrj/wgpu_glyph/pull/69/files#diff-7bd5d211a8e8c645d47359cecc7cd50ee00aebd67c8a2e8e864ee5ef9e8b458eL95

... that moving to use Queue was the end goal anyways, but it's up to hecrj in the end yes, I just made this change for myself largely because it was simpler than using StagingBelt/going to try and supply an official fix for StagingBelt

profan commented 3 years ago

The main issue with using Queue::write_buffer is that we lose composability.

With a StagingBelt, writes are recorded in the CommandEncoder. With a Queue, writes are global and happen all at once during submit. This means that the buffers will be overwritten and only contain the data of the last draw_queued call.

For instance, the clipping example is most likely broken here.

Oh yeah, you're right, the expected ordering does fall apart basically, oof.

Hmm, maybe fixing StagingBelt is the easier solution in this case then actually...

schell commented 3 years ago

Just a note because I just figured this out - since these changes now use Queue::write_texture, padding the data to multiples of COPY_BYTES_PER_ROW_ALIGNMENT is not necessary and so padded_cache can disappear.

https://docs.rs/wgpu/0.9.0/wgpu/constant.COPY_BYTES_PER_ROW_ALIGNMENT.html

profan commented 3 years ago

Just a note because I just figured this out - since these changes now use Queue::write_texture, padding the data to multiples of COPY_BYTES_PER_ROW_ALIGNMENT is not necessary and so padded_cache can disappear.

https://docs.rs/wgpu/0.9.0/wgpu/constant.COPY_BYTES_PER_ROW_ALIGNMENT.html

Good catch, changed it, simplifies things a bit, things are still bent as write_texture/write_buffer don't compose the same way the queued encoder operations do with StagingBelt, but I might have a stab at making that work some time soon to make stagingbelt unnecessary

hecrj commented 2 years ago

Closing in favor of #76.