Closed rfomin closed 3 weeks ago
Same results as here: https://github.com/fabiangreffrath/woof/pull/1728#issuecomment-2166013695
This PR compared to master...
Desktop (i9-9900K) | Laptop (i5-2520M) | |
---|---|---|
CPU usage and power | Lower than master | Same as master |
Frame pacing | Worse than master | Same as master |
Input lag approximation | Same as master | Same as master |
Same results as here: #1728 (comment)
Is this PR good enough? I'd rather not use Sleep(0)
/yield as it's not recommended and doesn't change anything for me.
Is this PR good enough?
I don't know yet. This fixes the inconsistent frametimes by forcing the timer resolution to 0.5ms, but I don't like relying on undocumented features:
diff --git a/src/i_timer.c b/src/i_timer.c
index d492e5a7..77a21ccb 100644
--- a/src/i_timer.c
+++ b/src/i_timer.c
@@ -27,6 +27,9 @@
#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
+#pragma comment(lib, "ntdll.lib")
+typedef LONG NTSTATUS, *PNTSTATUS;
+#define STATUS_SUCCESS 0x00000000
HANDLE hTimer = NULL;
#else
@@ -152,6 +155,14 @@ void I_InitTimer(void)
}
#ifdef _WIN32
+ NTSYSAPI NTSTATUS NTAPI NtQueryTimerResolution(PULONG, PULONG, PULONG);
+ ULONG min, max, current;
+ if (NtQueryTimerResolution(&min, &max, ¤t) == STATUS_SUCCESS)
+ {
+ NTSYSAPI NTSTATUS NTAPI NtSetTimerResolution(ULONG, BOOLEAN, PULONG);
+ NtSetTimerResolution(max, true, ¤t);
+ }
+
// Create an unnamed waitable timer.
hTimer = CreateWaitableTimer(NULL, TRUE, NULL);
if (hTimer == NULL)
@ceski-1 Does https://github.com/fabiangreffrath/woof/pull/1744/commits/f84edd8579cc7d54cae7b57b201e2cdf16c5bc2d improve frametimes?
@ceski-1 Does f84edd8 improve frametimes?
No, it's the same. I should note that if the in-game limiter is disabled (framerate limit 0, vsync off), and then the Nvidia limiter is used instead, it has the same good frame pacing as the master branch, the same high cpu usage (so probably busy waiting as well), so maybe this is just the unavoidable trade-off for an accurate framerate limiter.
maybe this is just the unavoidable trade-off for an accurate framerate limiter.
With high resolution timer we can try just this:
diff --git a/src/i_video.c b/src/i_video.c
index aa43762b..b6bfdf6b 100644
--- a/src/i_video.c
+++ b/src/i_video.c
@@ -792,10 +792,7 @@ void I_FinishUpdate(void)
break;
}
- if (target_time - elapsed_time > 1000)
- {
- I_SleepUS(500);
- }
+ I_SleepUS(target_time - elapsed_time);
}
}
else
Frame pacing is worse on my system. It seems that sleep is still not precise enough.
Could you try the no-loop-unrolling hack from #1728 in this branch?
Could you try the no-loop-unrolling hack from #1728 in this branch?
We only have a few instructions in the loop, so it doesn't matter if it's unrolled/inlined/cached or not, API calls (usleep
or WinAPI timer) will affect performance much more.
Anyway, I tried, there is no difference for me.
@ceski-1 Does https://github.com/fabiangreffrath/woof/pull/1744/commits/f84edd8579cc7d54cae7b57b201e2cdf16c5bc2d improve frametimes?
Actually, I retested this one several more times, as well as the one right before it. The CREATE_WAITABLE_TIMER_HIGH_RESOLUTION
change in https://github.com/fabiangreffrath/woof/commit/f84edd8579cc7d54cae7b57b201e2cdf16c5bc2d actually does fix the frame times so now it matches the master branch but with lower CPU usage.
Actually, I retested this one several more times, as well as the one right before it. The
CREATE_WAITABLE_TIMER_HIGH_RESOLUTION
change in f84edd8 actually does fix the frame times so now it matches the master branch but with lower CPU usage.
That's great! Then I think this PR is ready.
@MrAlaux Could you please try the artifact from this PR on Win7?
@MrAlaux Could you please try the artifact from this PR on Win7?
Sure thing, but I'm a bit busy at the moment, so it might take me a while to get to it.
What exactly do I have to check, though?
What exactly do I have to check, though?
If it runs, it may fail at timer initialization. It will probably work, I will test it in Win7 VM.
I have tested on a Win7 VM, it works now. I consider this PR complete, so if there are no objections I'd like to merge this.
I finally gave this a try, setting the FPS limit to 45, and it feels like this PR is more responsive than the previous master. Needless to say, I didn't catch any errors, crashes and whatnot -- assuming that's what a "timer initialization failure" would be anyways.
Remove I_CpuPause().
Change limiter sleep loop.
I tested this with direct3d9 backend, resolution scale 100%, vsync off, frame limiter 200. With CapFrameX I've got a slightly better frame pacing with about the same CPU load. @ceski-1 I am interested in your results.