godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Single frame spikes even in an empty project #33969

Closed ghost closed 1 year ago

ghost commented 4 years ago

Godot version: 3.1.1, 3.2b2

OS/device including version: Windows 10 Pro 1909 x64

Issue description: Every ~10 seconds I get a one-frame spike followed by 1-3 really short frames. At first I thought it was my fault, but the problem persists even in an empty project.

An exported project exhibits the same behavior.

My specs are i5 7300, 16 GB DDR4, GTX 1050

Profiler screenshots:

Frame before the spike: Annotation 2019-11-28 204133

Spike: Annotation 2019-11-28 204205

Frame after the spike: Annotation 2019-11-28 204229

Steps to reproduce:

1) Attach a script with an empty "_process" method anywhere in the scene, so that the editor can pause the scene (only needed for a profiler screenshot).
2) Run the project and wait for ~10 seconds.

EDIT: To narrow down the issue (at least as far as my abilities go), I tested this on some other machines:

i7-9750H / 16 GB DDR4 / RTX 2060 w/ Windows 10 - same thing
i3-3240 / 4 GB RAM (didn't check at the time, I believe DDR3) / GeForce 8600GT w/ Windows 8.1 - same thing

I also tried the first machine (i5 7300, 16 GB DDR4, GTX 1050) with Manjaro 18.1.5 and didn't notice any similar problems (although I couldn't get VSync to work). A project exported on Manjaro to Windows exhibits the same behavior.

3.2 rc4 didn't help.

EDIT 2: The Editing. I decided to try the Vulkan branch just to see if something different happens. The spikes aren't there, but I noticed really strange things happening with everything that is normalized for time (with delta). I hadn't noticed this before because of the short lag, but in Vulkan that short lag isn't there, and after some investigating I found that delta is sometimes negative (with the same frequency as the spikes in the stable version). Going back to 3.2, negative delta does happen at the same time as the spikes, but not every time.

Annotation 2020-02-06 222053

So I guess the question is: could the VSync issues and the negative delta be related?
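For anyone scaling movement by delta, a negative value can be guarded against at the point where delta is computed. The following is a minimal sketch, assuming delta is derived from two tick readings in microseconds; `safe_delta_seconds` is a hypothetical helper for illustration and is not Godot's actual main-loop code.

```cpp
#include <cstdint>

// Hypothetical helper (not Godot's code): converts two tick readings in
// microseconds into a frame delta in seconds that is never negative.
inline double safe_delta_seconds(std::uint64_t previous_usec, std::uint64_t current_usec) {
    if (current_usec < previous_usec) {
        // The clock appeared to run backwards. Report a zero-length frame
        // rather than letting the unsigned subtraction wrap around.
        return 0.0;
    }
    return static_cast<double>(current_usec - previous_usec) / 1000000.0;
}
```

This only hides the symptom; it does not explain why the timestamps go backwards in the first place. In standalone C++ code, `std::chrono::steady_clock` is a monotonic source for such readings.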

Keelar commented 3 years ago

I've spent the past few hours trying to track down where the issue is. I started by searching for all code that executes as a result of OS::get_singleton()->is_stdout_verbose() being true, then took some educated guesses, forcing or preventing code from running until the lag spikes started or stopped. Doing that for a little while led me to this code:

https://github.com/godotengine/godot/blob/7610409b8a14b8499763efa76578795c755a846d/drivers/gles3/rasterizer_gles3.cpp#L163-L171

If I run with verbose stdout enabled and prevent that code from running by changing the if condition to OS::get_singleton()->is_stdout_verbose() && false, verbose stdout no longer prevents the lag spikes. If I disable verbose stdout and force that code to run by changing the if condition to OS::get_singleton()->is_stdout_verbose() || true, the lag spikes stop again. I've put at least 2 hours into testing this and forcing that code to run has consistently resulted in no lag spikes, while preventing it from running consistently has. I have absolutely no clue why forcing debug code to run would result in better performance, but it seems to be the case.

Around a year ago I checked whether the lag spikes were present on the GLES2 renderer as well, and they were. I just checked, and rasterizer_gles2.cpp appears to have a similar bit of code:

https://github.com/godotengine/godot/blob/7610409b8a14b8499763efa76578795c755a846d/drivers/gles2/rasterizer_gles2.cpp#L221-L229

While I haven't yet tested if forcing/preventing this code from running has the same effect on GLES2, I presume it would, but don't know for sure.

So it seems to be related to OpenGL, which I am clueless about, so I don't think there is much more I can help with at this point. Just hoping this info will be helpful to someone more knowledgeable.

lawnjelly commented 3 years ago

That's very useful info and you could well be right about the OpenGL debugging. :+1: I don't get this on my Linux machine (hence difficult to investigate), and the other reports seem to be about Windows, so there is also a possibility it could be Windows specific.

There is a _WGL_CONTEXT_DEBUG_BIT_ARB in context_gl_windows.cpp but as far as I can see it is switched off. We can have a look through and see if there's anything we can spot.

One possibility is that there is a load of OpenGL error spam which is getting hidden, and that is causing problems, possibly related to #7171 and the linked PR. I'm not sure I can really investigate this effectively, as I think it needs someone running Windows, possibly someone who also gets the spikes.

joemicmc commented 2 years ago

I've seen the same issue on my setup

Every 10 seconds or so I was seeing 'Process Time' take double the amount of time on a frame.
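To check for this pattern in logged frame times rather than by eye in the profiler, something like the following could be used. It is a rough sketch: `find_spikes` and the factor of 2 are illustrative choices, not part of Godot, and the frame times would have to be collected separately.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative helper: returns the indices of frames whose duration is at
// least `factor` times the median frame time of the whole sample.
inline std::vector<std::size_t> find_spikes(const std::vector<double> &frame_ms, double factor = 2.0) {
    std::vector<std::size_t> spikes;
    if (frame_ms.empty()) {
        return spikes;
    }
    std::vector<double> sorted = frame_ms;
    std::sort(sorted.begin(), sorted.end());
    // Upper median for even-sized samples; good enough for a rough check.
    const double median = sorted[sorted.size() / 2];
    for (std::size_t i = 0; i < frame_ms.size(); ++i) {
        if (frame_ms[i] >= factor * median) {
            spikes.push_back(i);
        }
    }
    return spikes;
}
```

Using the median instead of the mean keeps the baseline from being pulled up by the spikes themselves.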

This was happening in a new project with empty scene. Enabling verbose stdout, whilst not completely eliminating the issue has definitely reduced the issue, and it could be that the few spikes I'm seeing are unrelated.

I haven't built Godot from source for a while, but I'm happy to test out any changesets if needed.

lawnjelly commented 2 years ago

> I've seen the same issue on my setup
>
>   • Godot 4.0 alpha 14

This probably needs a separate issue for Godot 4, especially for Vulkan. This issue seems to have been narrowed down to something to do with OpenGL debugging. Obviously OpenGL debugging doesn't happen in Vulkan.

Incidentally, not having looked at this for some time, I had a little look and the debugging section is followed by a whole load of:

        glDebugMessageControlARB(GL_DEBUG_SOURCE_API_ARB,GL_DEBUG_TYPE_ERROR_ARB,GL_DEBUG_SEVERITY_HIGH_ARB,0,NULL,GL_TRUE);
        glDebugMessageControlARB(GL_DEBUG_SOURCE_API_ARB,GL_DEBUG_TYPE_DEPRECATED_BEHAVIOR_ARB,GL_DEBUG_SEVERITY_HIGH_ARB,0,NULL,GL_TRUE);
        glDebugMessageControlARB(GL_DEBUG_SOURCE_API_ARB,GL_DEBUG_TYPE_UNDEFINED_BEHAVIOR_ARB,GL_DEBUG_SEVERITY_HIGH_ARB,0,NULL,GL_TRUE);
        glDebugMessageControlARB(GL_DEBUG_SOURCE_API_ARB,GL_DEBUG_TYPE_PORTABILITY_ARB,GL_DEBUG_SEVERITY_HIGH_ARB,0,NULL,GL_TRUE);
        glDebugMessageControlARB(GL_DEBUG_SOURCE_API_ARB,GL_DEBUG_TYPE_PERFORMANCE_ARB,GL_DEBUG_SEVERITY_HIGH_ARB,0,NULL,GL_TRUE);
        glDebugMessageControlARB(GL_DEBUG_SOURCE_API_ARB,GL_DEBUG_TYPE_OTHER_ARB,GL_DEBUG_SEVERITY_HIGH_ARB,0,NULL,GL_TRUE);

What could be happening is that on some drivers this output is defaulting to true (maybe something is overriding the debug settings, in the driver settings perhaps?). This suggests that it could possibly be solved by explicitly setting all this output to FALSE when not in verbose mode.

In addition, it looks like we don't set WGL_CONTEXT_DEBUG_BIT_ARB on Windows, so in theory this stuff shouldn't be outputting, but hey ho.

An alternative is to set the callback even in non-verbose mode, then totally ignore any debug output.
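The first idea (explicitly switching the output off when not in verbose mode) could look roughly like this. It is an untested sketch, not a patch: the `sketch` namespace, its type aliases and `silence_debug_output_unless_verbose` are made up for illustration, and the control function is passed in as a pointer with the same parameter list as glDebugMessageControlARB so the snippet compiles without GL headers or a context. Whether this has any effect on the spikes is unknown.

```cpp
namespace sketch {

// Local stand-ins for the OpenGL typedefs, so no GL header is needed here.
using Enum = unsigned int;
using Sizei = int;
using Uint = unsigned int;
using Boolean = unsigned char;

constexpr Enum kDontCare = 0x1100; // numeric value of GL_DONT_CARE
constexpr Boolean kFalse = 0;      // numeric value of GL_FALSE

// Same parameter list as glDebugMessageControlARB.
using DebugMessageControlFn = void (*)(Enum source, Enum type, Enum severity,
        Sizei count, const Uint *ids, Boolean enabled);

// When not verbose, disable every debug message (all sources, types and
// severities). Returns true if the control function was actually called.
inline bool silence_debug_output_unless_verbose(bool verbose, DebugMessageControlFn control) {
    if (verbose || control == nullptr) {
        // Verbose mode keeps its existing setup; a null pointer stands for
        // the debug output extension not being available.
        return false;
    }
    control(kDontCare, kDontCare, kDontCare, 0, nullptr, kFalse);
    return true;
}

} // namespace sketch
```

In the engine, the real glDebugMessageControlARB entry point would be passed as `control`, guarded by the same extension check the existing debug section uses.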

Again, this probably needs someone who uses Windows and has this problem to investigate, but these are some ideas.

https://sites.google.com/site/opengltutorialsbyaks/introduction-to-opengl-4-1---tutorial-05

EIREXE commented 1 year ago

I asked around and apparently enabling _EXT_DEBUG_OUTPUT_SYNCHRONOUS_ARB disables threaded optimization, which AFAIK is only enabled by default on Windows, hence why we are seeing the issue only there.

lawnjelly commented 1 year ago

> I asked around and apparently enabling _EXT_DEBUG_OUTPUT_SYNCHRONOUS_ARB disables threaded optimization, which AFAIK is only enabled by default on Windows, hence why we are seeing the issue only there.

That could well be it :+1: , a Google search reveals a lot of people having issues with stuttering and this threaded optimization driver setting. As such I don't know if there's anything we can do about it on our side, aside from mentioning it as a solution (or forcing debug output lol) :grin: .

https://en.sfml-dev.org/forums/index.php?topic=23506.0
https://stackoverflow.com/questions/36959508/nvidia-graphics-driver-causing-noticeable-frame-stuttering/37632948#37632948

It may be something we do for error checking (glGetError?) that is causing a stall with this driver setting, as some of the people report the problem disappearing in release builds of their game.

EIREXE commented 1 year ago

I asked around and this is what I got:

Does it disappear on release builds? I know that for NVIDIA we can make a profile and force threaded optimization off, as it's done here: https://stackoverflow.com/questions/36959508/nvidia-graphics-driver-causing-noticeable-frame-stuttering/37632948#37632948

Someone should attach a debugger like NVIDIA Nsight to the engine to see where the stalls are. I'll do it myself if I have time later.

Not sure if this option can be disabled on Intel, does anyone know?

EIREXE commented 1 year ago

Can someone confirm whether this is happening in 4.0 with OpenGL? I might just add some code at initialization to disable threaded optimization, since that seems like a better way to go.