Open Wumpf opened 8 months ago
I gave this a shot for my own personal project, but I'm struggling to figure out what I should do with the fact that it takes several frames for timestamp queries to make the round trip from the GPU back to the CPU. As it is now, there's a massive gap between the GPU tasks and CPU, and it makes it look like the framerate is extremely low. Do you have any suggestions?
Haven't looked into this yet, but what needs to be done is to figure out which GPU frame should be associated with which CPU frame. That part shouldn't be too hard since wgpu-profiler query results can be identified by label, see https://github.com/Wumpf/wgpu-profiler/blob/main/src/profiler_query.rs#L9. Labels are of course tedious and awkward, so to make this nicer https://github.com/Wumpf/wgpu-profiler/issues/54 needs solving, but it would be a start. Once that is done, the profiling information emitted to puffin should be able to take this into account.
Not sure if that would need modifications to puffin itself. @emilk without digging too deep does this make sense and do you have an idea how to communicate to puffin that something belongs to an already closed frame?
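To make the label idea concrete, here is a minimal sketch of encoding a frame index into a scope label and parsing it back out. It assumes a made-up `frame#name` label format; `FRAME_SEP`, `label_with_frame` and `parse_frame_label` are hypothetical helpers for illustration, not part of wgpu-profiler or puffin.

```rust
/// Hypothetical separator between the frame index and the scope name.
/// Any character that never appears in a frame index would work.
const FRAME_SEP: char = '#';

/// Builds a label such as "42#shadow pass".
/// The format is an assumption made for this sketch.
pub fn label_with_frame(frame_index: u64, name: &str) -> String {
    format!("{}{}{}", frame_index, FRAME_SEP, name)
}

/// Splits a label back into (frame index, scope name).
/// Returns `None` if the label has no separator or the prefix is not a number.
pub fn parse_frame_label(label: &str) -> Option<(u64, &str)> {
    let (prefix, name) = label.split_once(FRAME_SEP)?;
    let frame_index = prefix.parse::<u64>().ok()?;
    Some((frame_index, name))
}
```

The label would be built when the scope is opened and parsed again when the finished GPU frame comes back, which gives the frame index needed to pair it with the matching CPU frame.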
Something that I didn't realize is that I could make a separate `puffin::GlobalProfiler` for just the WGPU scopes. The documentation for `puffin_http` mentions something similar for using separate profilers for different groups of threads: https://docs.rs/puffin_http/latest/puffin_http/struct.Server.html#method.new_custom. Here's a screenshot of this working with wgpu-profiler:
However, this solution feels pretty ugly, since the WGPU and regular CPU stuff has to be in two separate windows. I'll look into your suggestion. From what I understand, you're suggesting to save the frame number in the label of the scope. Then when I do `process_finished_frame`, I parse that label to see which frame that snapshot is from. That could work, but the tricky part will be to report it to `puffin`. I think I'll need to mess around with the `GlobalProfiler::add_frame` function to see if it can add data to frames that have already happened.
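One way to sidestep adding data to already-closed frames would be to hold each CPU frame back until its GPU results arrive, and only then report the pair. The sketch below shows just that bookkeeping, keyed by frame index. `Scope`, `MergedFrame` and `PendingFrames` are hypothetical stand-ins, not puffin or wgpu-profiler types, and whether puffin can accept a frame reported this late is not something the sketch answers.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one timed scope (CPU or GPU).
#[derive(Debug, Clone, PartialEq)]
pub struct Scope {
    pub name: String,
    pub duration_ns: u64,
}

/// A CPU frame paired with the GPU scopes that belong to it.
#[derive(Debug, PartialEq)]
pub struct MergedFrame {
    pub frame_index: u64,
    pub cpu: Vec<Scope>,
    pub gpu: Vec<Scope>,
}

/// Holds CPU frames until the matching GPU results have made the round trip.
#[derive(Default)]
pub struct PendingFrames {
    cpu_by_frame: BTreeMap<u64, Vec<Scope>>,
}

impl PendingFrames {
    /// Called at the end of each CPU frame.
    pub fn push_cpu_frame(&mut self, frame_index: u64, scopes: Vec<Scope>) {
        self.cpu_by_frame.insert(frame_index, scopes);
    }

    /// Called when GPU results for `frame_index` arrive, possibly several
    /// frames later. Returns `None` if no CPU frame with that index is waiting.
    pub fn resolve_gpu_frame(&mut self, frame_index: u64, gpu: Vec<Scope>) -> Option<MergedFrame> {
        let cpu = self.cpu_by_frame.remove(&frame_index)?;
        Some(MergedFrame { frame_index, cpu, gpu })
    }

    /// Number of CPU frames still waiting for GPU data.
    pub fn pending(&self) -> usize {
        self.cpu_by_frame.len()
    }
}
```

The trade-off is that everything shows up in the viewer a few frames late, and a real version would also need to evict CPU frames whose GPU results never arrive.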
Here's a code review of my current approach using a separate `GlobalProfiler` for the GPU. I'm still not quite thrilled with that, but I haven't had much luck combining the frame data I receive from `wgpu-profiler` with the main `puffin::GlobalProfiler`.
It should be possible to have both CPU & GPU traces be shown in sync in Puffin 🤔
(Add an example screenshot of that to readme if it works out!)