Closed cai502 closed 10 months ago
IIRC this topic came up in another ticket as well. The sokol_gfx.h Metal backend does the frame synchronization in the first begin-pass of a frame here:
https://github.com/floooh/sokol/blob/b803c9a0214c6ab6dcb9cc6dd9d30d7ace4eda1e/sokol_gfx.h#L11550
I.e. at some point, sokol_gfx.h needs to wait for 'inflight resources' to become available again, and in the Metal backend there are two points where this might happen: either in that 'dispatch_semaphore_wait' call, or in sg_commit(), where a new swapchain drawable is requested:
(I guess that when the begin-pass flips to being 'fast', then sg_commit() flips to being 'slow').
TL;DR: at some point in the frame sokol-gfx needs to synchronize with vsync, and this is what you are seeing. If you add more render workload (so that the actual rendering takes longer), then this waiting period you're seeing should also decrease.
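To make that last point concrete, here is a deliberately simplified steady-state model (my own illustration, not code from sokol_gfx.h). It assumes the frame can't complete faster than the display interval, so whatever part of the interval isn't used up by your own work shows up as waiting at one of the two sync points. The function name and the 16 ms interval in the comments are just assumptions for the example.

```c
#include <assert.h>

/* Hypothetical model, NOT sokol_gfx.h code: with vsync on, a frame takes at
   least one display interval, so the unused remainder of the interval is
   spent blocked in the first begin-pass or in sg_commit(). */
static double sync_wait_ms(double cpu_work_ms, double vsync_interval_ms) {
    assert(cpu_work_ms >= 0.0 && vsync_interval_ms > 0.0);
    double wait = vsync_interval_ms - cpu_work_ms;
    /* once the work exceeds the interval there is nothing left to wait for */
    return (wait > 0.0) ? wait : 0.0;
}
```

With an assumed 16 ms interval, 2 ms of work leaves about 14 ms of waiting, 10 ms of work leaves about 6 ms, and anything above 16 ms leaves none. A real frame is messier (GPU time, multiple frames in flight), so treat this only as a way to read the profiler output.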
Here's that other ticket (that was GL on Linux, but all backends need to wait for vsync somewhere, in GL it's just much less predictable where exactly that wait happens):
Oh, I understand now, thanks for the detailed explanation! Next question: if I want to measure how much time my own code takes in each frame, should I exclude sg_commit and sg_begin_default_pass from the measurement?
I also want to compare sokol's Metal and OpenGL performance on iPhone and Mac. Is there a benchmark sample for this? If not, could you please give me some suggestions?
On GL, measuring performance by putting start/stop timer code around GL function calls is generally tricky, because it's unpredictable where GL might decide to 'flush the pipeline'.
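On Metal, where the wait happens at known points (the first begin-pass or sg_commit()), one possible approach is to time only the sections in between and add them up. Below is a minimal sketch of such an accumulator. The `work_timer_*` names are hypothetical helpers, not part of sokol, and the timestamps are passed in so any tick source can be used (for instance `stm_now()` from sokol_time.h, which is my suggestion rather than something prescribed in this thread).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of sokol): sums up only the time spent in
   explicitly bracketed sections of a frame. */
typedef struct {
    uint64_t section_start;  /* timestamp at which the open section began */
    uint64_t accumulated;    /* total ticks of all closed sections */
    int open;                /* 1 while a section is being timed */
} work_timer_t;

static void work_timer_reset(work_timer_t* t) {
    t->section_start = 0;
    t->accumulated = 0;
    t->open = 0;
}

static void work_timer_begin(work_timer_t* t, uint64_t now) {
    assert(!t->open);
    t->section_start = now;
    t->open = 1;
}

static void work_timer_end(work_timer_t* t, uint64_t now) {
    assert(t->open && now >= t->section_start);
    t->accumulated += now - t->section_start;
    t->open = 0;
}

/* Intended per-frame usage (sketch):
     work_timer_reset(&t);
     work_timer_begin(&t, now);  ...update code...          work_timer_end(&t, now);
     sg_begin_default_pass(...);     <- may block, left outside the timed sections
     work_timer_begin(&t, now);  ...draw calls, end pass... work_timer_end(&t, now);
     sg_commit();                    <- may block, left outside the timed sections
*/
```

The result in `accumulated` is CPU time only, in whatever unit the tick source uses. It says nothing about how long the GPU needs for the frame.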
I wrote a drawcall-overhead testing tool recently in the wip webgpu branch (not yet in master):
https://floooh.github.io/sokol-html5/drawcallperf-sapp.html
...idea is that you can roughly see at what point (== number of draw calls) the render loop is no longer able to hit the target frame rate. This "no longer able to hit target frame rate" gives a good idea of the CPU overhead in different backend APIs for specific rendering code.
Source code for this is here: https://github.com/floooh/sokol-samples/blob/sgfx-wgpu/sapp/drawcallperf-sapp.c
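The general idea (not the actual drawcallperf code, which is linked above) can be sketched as a ramp-up loop: keep increasing the number of draw calls until the measured frame time goes over the budget. Everything in this sketch is hypothetical, including the function names and the synthetic cost model that stands in for a real measurement.

```c
#include <assert.h>

/* Callback that returns the measured frame time in ms for a given number of
   draw calls. In a real test this would render and time an actual frame. */
typedef double (*frame_time_fn)(int num_draws);

/* Returns the largest tested draw count (a multiple of 'step', at most
   'limit') that still fits into budget_ms, or 0 if even the first step
   is too slow. */
static int max_draws_within_budget(frame_time_fn measure, double budget_ms,
                                   int step, int limit) {
    assert(measure && step > 0 && limit >= 0);
    int best = 0;
    for (int n = step; n <= limit; n += step) {
        if (measure(n) > budget_ms) {
            break;  /* first draw count that misses the target frame time */
        }
        best = n;
    }
    return best;
}

/* Synthetic cost model for illustration only: 1 ms fixed overhead per frame
   plus 2.5 microseconds per draw call. These numbers are made up. */
static double fake_frame_time(int num_draws) {
    return 1.0 + 0.0025 * (double)num_draws;
}
```

Running the same loop against two backends and comparing the returned draw counts gives a rough picture of the relative per-draw CPU overhead. Note that with vsync enabled the measured frame time tends to jump in whole display intervals, so the tipping point is easier to spot than exact per-draw costs.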
If I'm looking for specific performance hotspots, I use CPU profilers like Instruments on macOS.
I think the Metal debugger in Xcode can also provide some performance numbers.
I noticed the drawcallperf sample. I ran it and found that Metal performs much better than OpenGL on Mac.
Does this result mean that Metal really is that much faster than GL?
Thank you for your advice, I'll try to use these tools.
Yeah, on Mac you're definitely better off with the Metal backend. Apple's OpenGL implementation is only what's minimally needed for backward compatibility with older applications.
I'll close this ticket btw :)
PS: also, when testing performance, make sure the Metal validation layer is disabled (for instance, it is enabled when starting in debug mode within Xcode); the validation layer easily cuts performance by 10x.
Steps to reproduce:
There is no such problem on OpenGL.
I don't know why Metal behaves this way. Is this a bug or something else?
The sample I ran is cube-sapp; here is the code I profiled.