Closed stephanlachnit closed 2 years ago
Using git bisect, I was able to found the commit that introduced the regression: abf146f73c8baf79cfd3e2d25b02beaf750e1116
abf146f73c8baf79cfd3e2d25b02beaf750e1116 is the first bad commit
commit abf146f73c8baf79cfd3e2d25b02beaf750e1116
Author: jackun <jack.un@gmail.com>
Date: Sat Oct 2 16:42:37 2021 +0300
Set correct swapchain_stats etc references for hw updater
src/overlay.cpp | 44 +++++++++++++++++++++++++-------------------
src/overlay.h | 1 +
2 files changed, 26 insertions(+), 19 deletions(-)
Looks like a locking issues, I'll dig a little bit deeper.
It looks like every run/update/update_hw_info call blocks the entire process:
[2021-11-09 22:31:03.941] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:04.410] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:04.410] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:04.933] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:05.412] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:05.412] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:05.925] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:06.412] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:06.412] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:06.917] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:07.413] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:07.413] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:07.941] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:08.415] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:08.415] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
5 frames in 5.0 seconds = 0.999 FPS
[2021-11-09 22:31:08.933] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:09.417] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:09.417] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:09.925] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:10.419] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:10.419] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:10.953] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:11.420] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:11.420] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:11.941] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:12.421] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:12.421] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
[2021-11-09 22:31:12.933] [MANGOHUD] [debug] [overlay.cpp:140] run called
[2021-11-09 22:31:13.422] [MANGOHUD] [debug] [overlay.cpp:134] update called
[2021-11-09 22:31:13.422] [MANGOHUD] [debug] [overlay.cpp:49] update_hw_info called
5 frames in 5.0 seconds = 0.998 FPS
That's exactly 1 frame for each call.
Ok I think I figured it out. The problem is that std::unique_lock<std::mutex> lk(m_hw_updating);
is called both in update()
and run()
. However to my understanding this is not necessary: one only needs to lock when the actual updating is done, not when calling update()
. As update
only notifies to listeners, nothing happens if run()
is still updating. And indeed, when commenting out the lock in update()
everything works fine.
Here is a log with the lock in update()
removed:
[2021-11-09 22:57:03.581] [MANGOHUD] [debug] [overlay.cpp:127] update: called
[2021-11-09 22:57:03.581] [MANGOHUD] [debug] [overlay.cpp:139] run: calling cv_hwupdate.wait()
[2021-11-09 22:57:03.581] [MANGOHUD] [debug] [overlay.cpp:145] run: creating unique_lock with mutex m_hw_updating
[2021-11-09 22:57:03.581] [MANGOHUD] [debug] [overlay.cpp:147] run: calling update_hw_info()
[2021-11-09 22:57:04.087] [MANGOHUD] [debug] [overlay.cpp:127] update: called
[2021-11-09 22:57:04.105] [MANGOHUD] [debug] [overlay.cpp:149] run: update_hw_info() returned
[2021-11-09 22:57:04.105] [MANGOHUD] [debug] [overlay.cpp:139] run: calling cv_hwupdate.wait()
[2021-11-09 22:57:04.603] [MANGOHUD] [debug] [overlay.cpp:127] update: called
[2021-11-09 22:57:04.603] [MANGOHUD] [debug] [overlay.cpp:145] run: creating unique_lock with mutex m_hw_updating
[2021-11-09 22:57:04.603] [MANGOHUD] [debug] [overlay.cpp:147] run: calling update_hw_info()
[2021-11-09 22:57:05.104] [MANGOHUD] [debug] [overlay.cpp:127] update: called
[2021-11-09 22:57:05.125] [MANGOHUD] [debug] [overlay.cpp:149] run: update_hw_info() returned
[2021-11-09 22:57:05.125] [MANGOHUD] [debug] [overlay.cpp:139] run: calling cv_hwupdate.wait()
[2021-11-09 22:57:05.620] [MANGOHUD] [debug] [overlay.cpp:127] update: called
[2021-11-09 22:57:05.620] [MANGOHUD] [debug] [overlay.cpp:145] run: creating unique_lock with mutex m_hw_updating
[2021-11-09 22:57:05.620] [MANGOHUD] [debug] [overlay.cpp:147] run: calling update_hw_info()
[2021-11-09 22:57:06.120] [MANGOHUD] [debug] [overlay.cpp:127] update: called
[2021-11-09 22:57:06.153] [MANGOHUD] [debug] [overlay.cpp:149] run: update_hw_info() returned
As one can see, update is called more often than update_hw_info()
returns. With locks in both places this creates a delay in update()
when update_hw_info()
is not finished yet.
Ah right I see why you want a lock there, sw_stats
, params
, vendorID
and update_hw_info_thread
shouldn't be changed while hw_info
updates. The proper solution is to skip update()
if the mutex is locked.
Weird, don't see it even with 1220 fps (well, depends more on fps_sampling_period etc). But the whole thing is kind of going to blow up any moment now
Weird, don't see it even with 1220 fps. But the whole thing is kind of going to blow up any moment now
I guess it depends on how long your update_hw_info()
takes. If it is fast enough than you won't see it. Maybe measure the duration to check if this is the case.
With the latest version of MangoHud I get some crazy performance issues. It's kinda lagging the application (see the framegraph below), and it is getting worse over time. Affects both Vulkan and OpenGL.
Now:
Before:
![mangohud_old](https://user-images.githubusercontent.com/36662302/140996557-bf011f0d-ca2e-4988-a635-8230012c3f46.png)
Any idea how I can debug this? The trace log (see below) is not very conclusive.
Full debug log:
Meson settings: