godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
89.22k stars 20.23k forks source link

Very slow project/editor opening in Godot 4 #43351

Open qarmin opened 3 years ago

qarmin commented 3 years ago

Godot version: 4.0.dev.custom_build. 494bf38c7 GLES 2 - tested 14.01.2020

OS/device including version: Ubuntu 20.04 Nvidia 970 440.83, i7 4770 4/8 Hotspot -v1.3.0-27-g61f1314

Issue description: Opening and closing empty project in editor with command godot -e -q take a lot of more time than previous versions.

2020 11 results RESULTS FROM 06.11.2020 |Version|Opening Time|Memory usage in editor| |-|-|-| |Godot 3.2 | 4.2s| 135 MB | |Godot 4.0 GLES 2| 8.2s | 403 MB | |Godot 4.0 (494bf38c7) | 13.2s| 340 MB | |Godot 3.2 + Sanitizers | 13.1s| - | |Godot 4.0 (494bf38c7) + Sanitizers | 41s| - |
25.07.2020 results RESULTS FROM 25.07.2020 |Version|Opening Time| |-|-| |Godot 3.4 (6d58ea6ce)| 5.2s | |Godot 3.4 (6d58ea6ce) + Sanitizers| 13.7s | |Godot 4.0 (b918c4c3c) - first run | 13.9s | |Godot 4.0 (b918c4c3c) - second run | 10.4s | |Godot 4.0 (b918c4c3c) + Sanitizers - first run| 53,4s | |Godot 4.0 (b918c4c3c) + Sanitizers - second run| 33s |

RESULTS FROM 28.11.2022 - opening and closing empty project in editor

Dev debug builds Version Opening Time
Godot 3.6 (6d58ea6ce) GLES3 5.2s
Godot 3.6 (6d58ea6ce) GLES3 + Sanitizers 13.7s
Godot 4.0 (71f5bb391) Vulkan - first run 47.4s
Godot 4.0 (71f5bb391) Vulkan - second run 27.4s
Godot 4.0 (71f5bb391) Headless 23s
Godot 4.0 (71f5bb391) Vulkan + Sanitizers - first run 129s
Godot 4.0 (71f5bb391) Vulkan + Sanitizers - second run 110s
optimized builds(downloaded from official site) Version Opening Time
Godot 3.5.1 GLES3 7.7s
Godot 4.0 beta 6 Vulkan 10.8s
Godot 4.0 beta 6 Headless 8.2s

This makes it very difficult to test and develop the engine, especially when you start it with Address Sanitizer for example.

Most time spends on compiling shaders, which seems for me that partially can be avoidable by caching it or even compiling it when compiling Godot(not sure if this is possible).

Running Valgrind in one small project take almost 10 minutes

2020 11 results CPU Cycles in Godot 3.2 ![Zrzut ekranu z 2020-11-06 09-58-37](https://user-images.githubusercontent.com/41945903/98349060-d3865700-2019-11eb-8122-589c771a3ea4.png) CPU Cycles in Godot 4.0 ![Zrzut ekranu z 2020-11-06 09-56-46](https://user-images.githubusercontent.com/41945903/98349180-fa448d80-2019-11eb-9b44-2b65585ae54b.png) CPU Cycles in Godot 4.0 GLES 2 ![Zrzut ekranu z 2021-01-14 14-52-31](https://user-images.githubusercontent.com/41945903/104599552-3cbcc300-5678-11eb-86af-276530c50b0b.png)

Graphs Godot 4 Beta 6 Dev Hadless:

Screenshot from 2022-11-28 08-56-29

Godot 4 Beta 6 Dev Vulkan Screenshot from 2022-11-28 09-19-58

Godot 3.6 - 6d58ea6ce GLES 3 Screenshot from 2022-11-28 09-30-51

Steps to reproduce: To get perf data just execute this command(path must be changed)

perf record -o /home/rafal/perf.data --call-graph dwarf --aio --sample-cpu /usr/bin/godot4 -e -q

And then open it in latest version of Hotspot- https://github.com/KDAB/hotspot/releases

akien-mga commented 3 years ago

I haven't grokked numbers specifically but I can confirm qualitatively that starting debug builds of Godot master takes much longer than 3.2 on my system:

$ inxi -CSG
System:    Host: cauldron Kernel: 5.9.3-desktop-2.mga8 x86_64 bits: 64 Desktop: KDE Plasma 5.20.2 Distro: Mageia 8 mga8 
CPU:       Info: Quad Core model: Intel Core i7-8705G bits: 64 type: MT MCP L2 cache: 8192 KiB 
           Speed: 900 MHz min/max: 800/3100 MHz Core speeds (MHz): 1: 900 2: 900 3: 900 4: 900 5: 900 6: 900 7: 900 8: 899 
Graphics:  Device-1: Intel HD Graphics 630 driver: i915 v: kernel 
           Device-2: Advanced Micro Devices [AMD/ATI] Polaris 22 XL [Radeon RX Vega M GL] driver: amdgpu v: kernel 
           Device-3: Cheng Uei Precision Industry (Foxlink) HP Wide Vision FHD Camera type: USB driver: uvcvideo 
           Display: x11 server: Mageia X.org 1.20.9 driver: intel,v4l resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa Intel HD Graphics 630 (KBL GT2) v: 4.6 Mesa 20.2.1 
akien-mga commented 3 years ago

50% of the time seems to be spent compiling shaders. I guess this will improve once shader caching has been implemented, so only the first run will be so slow.

lawnjelly commented 3 years ago

I've just been having a little look at this too. On the plus side the startup time in GLES2 seems approx 2x as fast as vulkan, which will be helpful for people not working on graphics. As qarmin says the startup time is crucial for iteration for engine developers.

I noticed there were quite a few render targets being created, and I made these allocations deferred in GLES2 for faster startup, but I'll see if there is anything else that can be improved.

This is a profile of GLES2 startup: Screenshot from 2021-01-13 16-26-08

A lot of the startup time seems to be in either String stuff (in GLES2) or gslang in Vulkan. There is possibly some time being spent in locking operations, but I'm not confident enough to interpret the profile in linux (don't know what cycle 14 represents).

There may be some low hanging fruit in optimizing some of the string routines in debug, and possibly the doctool (maybe it can be cached).

nathanfranke commented 3 years ago

Caching would be great, but the shader compilation itself should be optimized too. An 8 second startup time is embarrassing, especially for new users.

(Also, can we get high priority label)

Calinou commented 3 years ago

Caching would be great, but the shader compilation itself should be optimized too. An 8 second startup time is embarrassing, especially for new users.

The master branch isn't production-ready, so being embarrassing is a small concern until we reach late beta stage :slightly_smiling_face:

(Also, can we get high priority label)

I prefer reserving that label to issues that impact the current stable release only. It's not like that label makes much of a difference anyway.

timothyqiu commented 3 years ago

I think the slow start impacts contribution.

From my perspective, now I don't check whether the bug is reproducible on master any more when submitting a bug report because creating a test project and edit means double the long wait time (project manager + editor).

It's also a pain to submit PRs, because one have to target master unless it's a 3.x only issue. So my workflow is to develop on 3.x and backport PRs to master, which seems inefficient, but saves much more time comparing to develop the PR on master.

Calinou commented 3 years ago

From my perspective, now I don't check whether the bug is reproducible on master any more when submitting a bug report because creating a test project and edit means double the long wait time (project manager + editor).

Use this command to create a new project in a temporary folder, bypassing the project manager:

# Assuming you're in the root folder of a Godot Git clone:
mkdir -p /tmp/a
touch /tmp/a/project.godot
bin/godot.x11.tools.64 /tmp/a/project.godot

This works with any Godot version. You can save a script like this one in your PATH to simplify this to a single command.

Calinou commented 3 years ago

I ran a profiler with https://github.com/godotengine/godot/pull/49050 merged locally. The project manager was opened with the --quit command line argument, which makes Godot exit after one frame was rendered:

Project manager with no projects in the list

Linux perf file (can be extracted and opened in HotSpot): perf.data.zip

Top-down with cycle-intensive methods unfolded

image

Flame graph

image

Caller/callee

image

With a few dozen projects in the project list

Linux perf file (can be extracted and opened in HotSpot): perf.data.zip

Top-down with cycle-intensive methods unfolded

image

Flame graph

image

It looks like mbedTLS of all things is causing the slowness, which is surprising at first. cc @Faless

Faless commented 3 years ago

It looks like mbedTLS of all thinsgs is causing the slowness, which is surprising at first. cc @Faless

Does it also happen in 3.x? The code should be the same as far as I recall.

I mean, the certificate list is big, but shouldn't take millis anyway. Does it get better if you set network/ssl/certificates editor settings to a single certificate (so the default list is used instead)?

akien-mga commented 3 years ago

Does it also happen in 3.x? The code should be the same as far as I recall.

https://github.com/godotengine/godot/issues/48978#issuecomment-846599528 shows the same for 3.x yeah. Earlier comparisons in this issue did not, but that must have been introduced in 3.x with the backport of the crypto stuff.

That would likely best be tracked in a separate issue for 3.x + 4.0.

Faless commented 3 years ago

#48978 (comment) shows the same for 3.x yeah. Earlier comparisons in this issue did not, but that must have been introduced in 3.x with the backport of the crypto stuff.

Good catch! I'll investigate.

reduz commented 3 years ago

How is this faring? Since shader caching and stringname optimizations were implemented, I have not seen update to these comparisons.

Calinou commented 3 years ago

I ran a profile on commit b5e5eed7a (target=debug), running the project manager with --quit again. I ran the project manager at least once before, so that it had some shader cache to work with.

Linux perf file (can be extracted and opened in HotSpot): perf.data.zip

## Project manager with no projects in the list ### Top-down with cycle-intensive methods unfolded ![image](https://user-images.githubusercontent.com/180032/126404931-4d3995ce-22bc-48b2-9aed-a023a4e46b50.png) ### Flame graph ![image](https://user-images.githubusercontent.com/180032/126404975-f2350b1b-7d45-43ff-9d23-b2520a999d4a.png) ### Timeline (top) ![image](https://user-images.githubusercontent.com/180032/126405030-da428de4-d369-49b7-8784-f1517beae47f.png) ### Timeline (bottom) ![image](https://user-images.githubusercontent.com/180032/126405041-89c7e6d6-7abb-43f6-b735-ec4ba3a687ec.png)
Calinou commented 3 years ago

With https://github.com/godotengine/godot/commit/b5e5eed7ae199a2bc5cf7fe7b08ed33b847bff25 and target=release_debug: perf.data.zip

### Top-down with cycle-intensive methods unfolded (top) ![image](https://user-images.githubusercontent.com/180032/126407015-a0a528e9-ffb1-4c51-baa6-ed4413759939.png) ### Top-down with cycle-intensive methods unfolded (bottom) ![image](https://user-images.githubusercontent.com/180032/126407048-2bcc6010-cd16-40e8-b019-54a0ea56af6f.png) ### Filtered top-down view ![image](https://user-images.githubusercontent.com/180032/126407297-8af67dfa-ac06-4adf-aa39-e6f22c675fdf.png) ### Flamegraph ![image](https://user-images.githubusercontent.com/180032/126407069-8309bd0b-d8d5-46b6-b74c-a56560e59d05.png) ### Timeline (top) ![image](https://user-images.githubusercontent.com/180032/126407087-1b82b123-b6d6-4c68-8a56-b4cd5f02027b.png) ### Timeline (bottom) ![image](https://user-images.githubusercontent.com/180032/126407109-35c4f47b-f2f0-47d3-b6b8-0a08838c21d3.png)
akien-mga commented 2 years ago

I think the huge startup time regression can be considered fixed now, so we can close this issue. I'd suggest opening a new enhancement issue with latest profiling data so that we can keep optimizing to go even faster.

nathanfranke commented 2 years ago

The start-up time is still painfully slow, even on an empty project.

Calinou commented 2 years ago

@nathanfranke Can you compare optimized editor builds of 3.x and master (same compiler and build flags)? I recommend using hyperfine with the --editor --quit Godot CLI arguments for benchmarking (or just --quit for the project manager).

Here, master takes significantly more time than 3.x, but I wouldn't say it's unusable either. Startup times in master are no longer a blocker for a 4.0alpha release at least.

We can improve startup time in both master and 3.x by fixing a mbedTLS issue related to SSL certificate parsing, but it's an upstream issue that we'd prefer waiting an official fix for (given it has security implications due to timing attacks).

nathanfranke commented 2 years ago

There is a slight increase in start-up time for 4.0 optimized compared to 3.x optimized, but it looks like the main regression is the debug (unoptimized) start-up times.

3.4 tests done on v3.4.beta.custom_build [8d0b2edfc] 4.0 tests done on v4.0.dev.custom_build [95432893e]

Task 3.4 Debug 3.4 Release 4.0 Debug 4.0 Release
Project Manager (Initial) 2.201 s ± 0.029 s 1.491 s ± 0.009 s 4.131 s ± 0.023 s 1.558 s ± 0.053 s
Project Manager (Subsequent) 2.222 s ± 0.037 s 1.505 s ± 0.062 s 2.112 s ± 0.045 s 1.109 s ± 0.065 s
Editor (Initial) 6.417 s ± 0.045 s 3.751 s ± 0.013 s 13.195 s ± 0.372 s 5.269 s ± 0.462 s
Editor (Subsequent) 6.411 s ± 0.119 s 3.777 s ± 0.092 s 8.222 s ± 0.426 s 4.444 s ± 0.322 s
Empty Project 2.164 s ± 0.135 s 1.497 s ± 0.106 s 2.121 s ± 0.317 s 1.599 s ± 0.091 s
TPS Demo* 4.127 s ± 0.373 s 11.882 s ± 0.332 s

* Time is aggregated from these steps: Load Main Menu, Load Main Scene, Wait 3 Frames, Exit Game.

clayjohn commented 2 years ago

@nathanfranke Are those timings for an empty project? I have a feeling the performance has regressed more for complex projects.

nathanfranke commented 2 years ago

Empty project. Is there a good project to test this on that has 3 and 4 versions?

clayjohn commented 2 years ago

@nathanfranke maybe the TPS demo?

nathanfranke commented 2 years ago

TPS Demo does not work on 4.0 yet. However I still did the timings for 3.x.

nathanfranke commented 2 years ago
Task 3.5.rc4 4.0.alpha10
Project Manager (Subsequent) 0.68s 1.13s
Editor (Subsequent) 2.15s 7.40s
Close Editor 1.11s 1.54s

https://user-images.githubusercontent.com/14253836/174902416-2261b0f5-b357-442a-8833-c1132d35fe95.mp4

nathanfranke commented 2 years ago

Pointing out something I noticed the past few days... When I run my project standalone (e.g. --path my/project) it starts in like a second. However, if I run another instance right after, it takes like 5-6 seconds to start. When using the editor, the project seems to always take 5-6 seconds. There could be a bug with the shader caching with multiple instances.

Edit: Reported at https://github.com/godotengine/godot/issues/63419

ffernn-dev commented 2 years ago

Not sure if I should be putting this here or in #62322 but if I'm understanding this correctly and used the Instruments software properly (I really don't know what I'm doing) it looks like it's just doing a whole lot of nothing. Macbook pro late 2013. It takes around 45 seconds to launch the project list, another 45 to launch the editor, and 45 more when I run the project to test. This could be entirely unrelated but I hope this helps because launcher v3.4.4 only takes 2 seconds max.

Screen Shot 2022-08-08 at 8 20 18 pm

edit I know it's alpha not beta that's my bad

filipworksdev commented 1 year ago

I am on Iris Xe and I experience slowdowns when opening Godot and when running the project and when opening project or editor settings but only on multi-window. If I switch back to single window it's much better but still not as snappy as Godot 3 used to be.

desttinghim commented 1 year ago

I am on Iris Xe and I experience slowdowns when opening Godot and when running the project and when opening project or editor settings but only on multi-window. If I switch back to single window it's much better but still not as snappy as Godot 3 used to be.

I am having this same issue, and switching to single window also helps with performance one my end. I am running NixOS on a 11th Gen Intel chip, specifically the Framework laptop.