I get 36 fps in beta 6 and 21 fps in RC9. It's not the 2D rendering (I even tried turning off the FPS label, and there was no change).
Also, RC3 gets 36 fps, so the regression is fairly recent. RC4 gets 26 fps, so the bulk of the change is between RC3 and RC4. RC6 also gets 26 fps.
Note that there's a baked lightmap in the MRP, so it needs to be rebaked when switching versions (at least between 3.2.3 and 3.2.4 beta 6+, which added the new CPU lightmapper).
My test results on Linux:
GPU | 3.2.3-stable | 3.2.4 beta 1 | 3.2.4 RC 3 | 3.2.4 RC 4 | 3.3 RC 6 | 3.3 RC 7 | 3.3 RC 9 |
---|---|---|---|---|---|---|---|
Intel HD 630 | 47 FPS | 47 FPS | 46 FPS | 45 FPS | 45 FPS | 38 FPS | 38 FPS |
Radeon RX Vega M | 160 FPS | 160 FPS | 155 FPS | 155 FPS | 155 FPS | 137 FPS | 137 FPS |
So I can't reproduce @lawnjelly's performance drop between RC 3 and RC 4. The main difference between the two is that RC 4 spams errors because the MRP uses invalid input mappings. This could cause an FPS drop (in debug builds only) when spamming stdout (especially on Windows):
```
ERROR: get_action_strength: Request for nonexistent InputMap action 'look_right'.
At: main/input_default.cpp:142.
ERROR: get_action_strength: Request for nonexistent InputMap action 'look_left'.
At: main/input_default.cpp:142.
ERROR: get_action_strength: Request for nonexistent InputMap action 'look_down'.
At: main/input_default.cpp:142.
ERROR: get_action_strength: Request for nonexistent InputMap action 'look_up'.
At: main/input_default.cpp:142.
```
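For reference, here is a minimal GDScript sketch of a local workaround. The action names come from the log above, but the helper functions and the way the MRP polls them are assumptions, not the project's actual code:

```gdscript
# Hypothetical helper, not the MRP's actual script: register the look_* actions
# at runtime if the project doesn't define them, so get_action_strength()
# stops erroring. The proper fix is to define them in Project Settings > Input Map.
func _ensure_look_actions() -> void:
    for action in ["look_left", "look_right", "look_up", "look_down"]:
        if not InputMap.has_action(action):
            InputMap.add_action(action)

# Assumed per-frame camera-look polling, similar to what triggers the errors.
func _get_look_vector() -> Vector2:
    return Vector2(
        Input.get_action_strength("look_right") - Input.get_action_strength("look_left"),
        Input.get_action_strength("look_down") - Input.get_action_strength("look_up")
    )
```

With the actions defined (or registered as above), the error spam and the associated debug-build slowdown go away.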
However, I do reproduce a performance drop between 3.3 RC 6 and RC 7, on both Intel and Radeon on Linux.
And I seem to have a tiny performance drop somewhere between 3.2.4 beta 1 and 3.2.4 RC 4, but I'm not sure it's significant enough to be worth looking into.
I think I've tracked down the major change between RC3 and RC4: in RC4, the error message spams about using an input map action that isn't set, while in RC3 that error message isn't there.
Oo beaten to it!
When I remove those inputs, the fps is the same in RC3 and RC4.
I agree it seems possibly related to the big changes in lightmapping and the project's use of a lightmap; if this is happening between RC6 and RC7, that does match the timeline.
@cybereality Could you try to fix the InputMap error locally and see if you still get a performance drop?
If so, can you try intermediate RCs to see if you can pinpoint which one(s) cause a performance drop? From my test it seems to be between RC 6 and RC 7 but maybe it would be different for your hardware.
@lawnjelly Can you reproduce the drop between RC 6 and RC 7?
Relevant changes to look into: https://github.com/godotengine/godot/compare/15ff752737c53a1727cbc011068afa15683509be...cca2637b9b9dcb16070eb50a69c601a5f076c683
Possible "culprit": https://github.com/godotengine/godot/commit/8d156d9e5c3ac1d8988de406b77f82b32bed946e
Ah, noob mistake. I didn't look at the console window (it was hidden behind the editor). After adding those input mappings, performance is back to what it was before (from 60 fps to 400 fps). So it was definitely the stdout spam that was slowing it down.
Yes I get from 79fps (RC6) -> 55fps (RC7). I decreased the size of the window a bit so I could see the output, that's why it is faster than the previous figures.
> Yes I get from 79fps (RC6) -> 55fps (RC7). I decreased the size of the window a bit so I could see the output, that's why it is faster than the previous figures.
> ...
> Possible "culprit": 8d156d9
Confirmed that this commit explains the slight performance drop experienced on my hardware.
To my untrained eyes the shadow quality in 3.3 RC 7 might be slightly better indeed, but I'm not sure it's significant enough to warrant having default settings which lead to a performance reduction. CC @clayjohn @puchik
If I read this correctly, with this commit we're going from a cubemap sampler size of 512 to 4096, i.e. 8 times bigger (or is that 8²?). Maybe we should reduce the default values?
However, if I set `rendering/quality/shadow_atlas/size` to 512, then I do get awful-looking shadows. So maybe the cubemap sampler size should get its own config value that could default to e.g. 512, instead of reusing the value of `rendering/quality/shadow_atlas/size`?
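As a side note, here is a minimal sketch of how the atlas size can be tweaked at runtime for quick side-by-side tests. The helper function is hypothetical, and the note about the cubemap sampler only applies to RC 7, where its size follows the same value:

```gdscript
# Hypothetical test helper: print the project-wide default and override the
# shadow atlas size on the current viewport for a quick comparison.
# In 3.3 RC 7 the omni/spot cubemap sampler size follows this value as well.
func _set_test_atlas_size(size: int) -> void:
    print("Project default: ", ProjectSettings.get_setting("rendering/quality/shadow_atlas/size"))
    get_viewport().shadow_atlas_size = size
```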
I retitled the issue to track the new issue that was identified with this MRP (regression from #47160 in 3.3 RC 7), since the other one was just the consequence of error spam.
For comparison:
3.3 RC 6, default settings (shadow atlas size 4096, cubemap sampler size 512):
3.3 RC 7, default settings (shadow atlas size 4096, cubemap sampler size 4096):
3.3 RC 7, shadow atlas size and cubemap sampler size 2048:
3.3 RC 7, shadow atlas size and cubemap sampler size 1024:
3.3 RC 7, shadow atlas size and cubemap sampler size 512:
So lowering the common setting to get back the 3.3 RC 6 performance is definitely not an option.
How does 1024 look?
I'm a little surprised that #47160 is causing such a performance regression. In my testing the difference was about 0.1 to 0.5 ms when using 32 dynamic lights with shadows.
That being said, we should consider reducing the default shadow atlas size to 2048 now. The quality is better than 4096 without #47160 and speed should be improved.
See also https://github.com/puchik/godot/pull/1 for further discussion.
I updated https://github.com/godotengine/godot/issues/48036#issuecomment-823058246 to include screenshots using 3.3 RC 7 with shadow atlas size 2048 and 1024.
The quality on this MRP does suffer quite a bit from reducing to 2048, compared to 3.3 RC 6:
RC6: Shadow atlas 4096, cubemap sampler 512:
RC7: Shadow atlas and cubemap sampler 2048:
@akien-mga thanks. That does indeed look like an unacceptable tradeoff.
I don't want to revert #47160 entirely, as it makes it possible to have much better shadow quality (at the expense of speed) than was possible in 3.2.
I will make a PR tonight that introduces a project setting for `max_shadow_cubemap_sampler_size`: https://github.com/godotengine/godot/blob/2ece8c44b8d26098570b8bf1b7727fad1e58159d/drivers/gles3/rasterizer_scene_gles3.cpp#L5113
We can default to 512 so that the quality/performance is the same in 3.3 as it was in 3.2 and then users who want more quality can increase it as they need.
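A hedged sketch of how a project could inspect or raise such a setting once it exists; the setting path used here is an assumption (the final name will come from the PR):

```gdscript
# Assumed setting path; the PR may expose it under a different name.
const CUBEMAP_SIZE_SETTING = "rendering/quality/shadow_atlas/cubemap_size"

func _print_cubemap_sampler_size() -> void:
    if ProjectSettings.has_setting(CUBEMAP_SIZE_SETTING):
        # Expected to default to 512, matching 3.2's quality/performance;
        # projects wanting sharper omni/spot shadows can raise it.
        print("Cubemap sampler size: ", ProjectSettings.get_setting(CUBEMAP_SIZE_SETTING))
    else:
        print("Setting not available in this build.")
```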
> That being said, we should consider reducing the default shadow atlas size to 2048 now. The quality is better than 4096 without #47160 and speed should be improved.
Sharper shadows aren't always better. Using a lower resolution introduces blurriness, but shadows in real life aren't perfectly sharp, especially when the shadow is far away from its caster.
In fact, I tend to find the default OmniLight/SpotLight shadows too sharp right now (when used at the typical OmniLight/SpotLight sizes). This is especially the case when you compare them to the default DirectionalLight shadow quality, which makes them look inconsistent.
Anyway, I think @clayjohn's solution is the way to go for 3.3.
I can confirm as well that #47160 caused a significant framerate drop in my project, which made it go from roughly 50-ish fps (4 instances, as I test multiplayer) to literally single-digit fps (on a GTX 1060). I have however not touched any shadow settings, so if decreasing the default size results in roughly the same performance and quality as 3.2, this is probably fine.
Below is a quick comparison between the latest 3.2 commit vs 8d156d9, no project settings changed:
That's a massive performance hit; it sounds surprising compared to what we were discussing above. Could you test 3.3 RC 6 and RC 7 to confirm that you get good performance in RC 6 and the drop in RC 7?
I just tried with RC6 and RC7, and I can confirm I get the same performance drop on RC7 only. For what it's worth, I notice some really high CPU usage on a single thread with RC7 (I do have quite a few Tweens at work), which doesn't happen with RC6. My system specs are below, just in case:
- OS: Linux Mint 20.1 (kernel 5.4.0)
- CPU: Intel i7-8700K
- GPU: Nvidia GTX 1060 6GB
- RAM: 16 GB (most likely irrelevant here)
Below is a screenshot of the framerate monitor in RC7 (the two dips to ~15 fps correspond to the animations). It doesn't seem to go down quite as low as what the above video shows, probably because of averaged values. In RC6, framerate drops to 30-40 during the animations.
My issue may be a somewhat extreme case where high CPU usage causes framerate to drop more than it should, but RC7 definitely did impact framerate.
> We can default to 512 so that the quality/performance is the same in 3.3 as it was in 3.2 and then users who want more quality can increase it as they need.
This is the way to go. It's possible the large frame drops are a combination of things but the shadow resolution would definitely contribute to it. For my project having the cubemap capped to 512 looked terrible (indoor environment with low lighting) and having it increased caused minimal drops.
Funny enough, running on my hardware I can't reproduce a performance drop between RC 6 and RC 7. In both cases performance is highly variable for the first 10 seconds or so, then plateaus at about 125 FPS. Note, I am not re-baking the lightmap between runs, which shouldn't matter if 8d156d9 is indeed the culprit.
I am going to test on my laptop using integrated graphics to see if I can reproduce there.
Fixed by #48059.
@Cykyrios I'm not sure yet if your performance regression was exactly the same as the one fixed by #48059, or if it's that one + something else that also happened in RC 7. I'll make you a Linux build quickly to test.
It could be that the shadow map resolution problem is affecting @Cykyrios much more, because he is using split screen. 4 screens would seem to be 4x as many shadow map renders, for the same final screen area. That said there still could be other changes in RC7.
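To illustrate that point, here is a rough sketch of a possible mitigation (the helper and the chosen size are placeholders, not @Cykyrios' actual setup): each split-screen Viewport renders its own shadow atlas, so shadow cost scales with the number of viewports, and the per-viewport atlas can be shrunk for the smaller views:

```gdscript
# Hypothetical split-screen mitigation: lower the shadow atlas size on each
# sub-Viewport, since every viewport maintains and renders its own atlas.
func _shrink_splitscreen_atlases(viewports: Array, size: int = 2048) -> void:
    for vp in viewports:
        if vp is Viewport:
            vp.shadow_atlas_size = size
```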
Could you test this build? It contains #48059 so if you use default shadow map settings, it should behave the same as 3.2.3 / 3.3 RC 6.
https://downloads.tuxfamily.org/godotengine/testing/Godot_v3.3-rc.d91b780_x11.64.zip
@akien-mga This build does have the same performance as RC6 for me, and I also compiled from the latest commit just in case, same result. Thanks a lot to you and @clayjohn!
Edit: Actually I just increased the cubemap size from the default 512 to 2048 (shadow atlas size was 8192 already) and I still get the same performance as RC6.
Godot version:
v3.3 RC9
OS/device including version:
Windows 10, RTX 2080 Ti, Driver 466.11, GLES3
Issue description:
In several small test projects, the performance has dropped significantly. Typically I would get around 160 fps with V-Sync; now it has dropped to around 60-70 fps (and this is with test apps with only a few cubes/planes, not even a full game). I have gone back and tested the same projects in v3.2.4 Beta 6 and performance is fine. Note, I have a few other Godot projects where performance is still great, so I don't think it's my machine. But something there is causing a performance issue.
Steps to reproduce:
I've attached a test project (on Google Drive, too big to upload here). Try opening it in the latest RC, and see if you can reproduce my issue. This is not a demanding scene, just a few simple models.
Minimal reproduction project:
https://drive.google.com/file/d/1hAv2FRAHCVT3p25Upduhm_auqcB5vudT/view?usp=sharing