godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Godot 4.0 is very slow to start on macOS compared to 3.x #62322

Closed: falconepl closed this issue 1 year ago

falconepl commented 2 years ago

Godot version

4.0.alpha10.official

System information

macOS 12.4 (Monterey), Vulkan, Intel Iris Pro 5200

Issue description

The latest release of Godot 4.0 (alpha 10) seems to be super slow on macOS 12.4 (Monterey), at least when tested on my MacBook Pro setup (meanwhile, Godot 3.4.4 works without any noticeable performance issues). Here are some benchmarks:

On the other hand, the editor itself is generally responsive when interacting with its UI (creating new nodes etc.).

Among important details, here are some errors that are raised during the 2D application start (empty scene):

E 0:00:00:0276   _debug_messenger_callback:  - Message Id Number: 0 | Message Id Name: 
    VK_ERROR_INITIALIZATION_FAILED: Render pipeline compile failed (Error code 2):
Compiler encountered an internal error.
    Objects - 1
        Object[0] - VK_OBJECT_TYPE_PIPELINE, Handle 140389825172992
  <C++ Source>   drivers/vulkan/vulkan_context.cpp:159 @ _debug_messenger_callback()

E 0:00:00:0276   render_pipeline_create: vkCreateGraphicsPipelines failed with error -3 for shader 'ClusterRenderShaderRD:0'.
  <C++ Error>    Condition "err" is true. Returning: RID()
  <C++ Source>   drivers/vulkan/rendering_device_vulkan.cpp:6614 @ render_pipeline_create()

E 0:00:30:0395   _debug_messenger_callback:  - Message Id Number: 0 | Message Id Name: 
    VK_ERROR_INITIALIZATION_FAILED: Render pipeline compile failed (Error code 2):
Compiler encountered an internal error.
    Objects - 1
        Object[0] - VK_OBJECT_TYPE_PIPELINE, Handle 140389832876544
  <C++ Source>   drivers/vulkan/vulkan_context.cpp:159 @ _debug_messenger_callback()

E 0:00:30:0395   render_pipeline_create: vkCreateGraphicsPipelines failed with error -3 for shader 'ClusterRenderShaderRD:0'.
  <C++ Error>    Condition "err" is true. Returning: RID()
  <C++ Source>   drivers/vulkan/rendering_device_vulkan.cpp:6614 @ render_pipeline_create()

Also, here's the editor output after Godot start, for the record:

--- Debugging process started ---
Godot Engine v4.0.alpha10.official.4bbe7f0b9 - https://godotengine.org
Vulkan API 1.1.198 - Using Vulkan Device #0: Intel - Intel Iris Pro Graphics

Registered camera FaceTime HD Camera with id 1 position 0 at index 0
--- Debugging process stopped ---
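
The two `render_pipeline_create` errors quoted earlier carry timestamps that can be turned into a rough measure of the stall. Here is a minimal Python sketch, assuming the timestamp after `E` has the form hours:minutes:seconds:milliseconds (that reading of the last field is an assumption, not something stated in the log itself). The helper name `error_time_seconds` is made up for illustration.

```python
import re

# Assumption: "E 0:00:30:0395" means 0 h, 0 min, 30 s, 395 ms.
TIMESTAMP = re.compile(r"^E (\d+):(\d{2}):(\d{2}):(\d{4})\b")


def error_time_seconds(line):
    """Return the timestamp of an error line in seconds, or None if it has none."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0


# The two failure lines, copied from the error output in this report.
log_lines = [
    "E 0:00:00:0276   render_pipeline_create: vkCreateGraphicsPipelines failed with error -3 for shader 'ClusterRenderShaderRD:0'.",
    "E 0:00:30:0395   render_pipeline_create: vkCreateGraphicsPipelines failed with error -3 for shader 'ClusterRenderShaderRD:0'.",
]

times = [t for t in map(error_time_seconds, log_lines) if t is not None]
gap = times[1] - times[0]
print(f"{gap:.3f} s between the two failures")
```

Under that assumption the two failures are about 30 seconds apart, which lines up with the startup delay described in this report.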

It seems that Vulkan support is not an issue on my MacBook Pro. I've just installed Vulkan SDK 1.3.216.0 for macOS from LunarG website and their vkcube.app demo works without any issues (thanks to MoltenVK underneath, of course).

I can provide any additional information or testing if that would help.

System details

GPU details (from system_profiler)

Chipset Model: Intel Iris Pro
Type: GPU
Bus: Built-In
VRAM (Dynamic, Max): 1536 MB
Vendor: Intel
Device ID: 0x0d26
Revision ID: 0x0008
Metal Family: Supported, Metal GPUFamily macOS 1

And by the way, thank you all for Godot! 🎉 It's absolutely amazing that a game engine like this is available out there.

Steps to reproduce

No extra steps are needed to reproduce the issue. It seems that any MacBook with a similar spec would be prone to this problem.

Minimal reproduction project

No response

Calinou commented 2 years ago

Can you reproduce this with previous 4.0 alphas?

Note that in general, startup speeds in 4.0 are expected to be slower than in 3.x, but they shouldn't be more than twice as slow. However, the first startup will always take significantly more time because the shader cache has to be warmed. On slower hardware, this can indeed require 20-30 seconds on the first startup, but subsequent starts of the project manager, editor and project (on a per-project basis) should be faster until you update your graphics driver.

nathanfranke commented 2 years ago

I really want to see some improvement with start-up time. Keep in mind the first start-up is the first impression users have with Godot. Going from ~5 seconds to 30 seconds lessens my perception of Godot as a light-weight engine.

falconepl commented 2 years ago

@Calinou Thanks for the comment!

I think that when it comes to this particular setup (MacBook Pro with an Intel Iris Pro 5200 integrated GPU), the performance issue is more serious than in other similar cases. I think I haven't highlighted it clearly: the 30-second slowdown during the Godot editor start, as well as during the startup of an app/project (including an empty scene), happens every single time 🙂

Namely, when I run the app with a play button for the first time, it takes the app about 30 seconds to start. But then, when it's closed and started again, it still takes 30 seconds to start. Even after the editor restart or the OS restart the issue remains. My guess is that some caches (like the shader cache you've mentioned) are not used at all, so they're never warmed up.

Also, I think that the error log I've posted in the first comment could give us some clue, mainly because I've tested Godot 4.0 alpha 10 on another laptop as well, a newer MacBook Pro (Touch Bar, 16-inch, Late 2019), and there's no issue with the permanent slow startup. I mean, the first run of the editor takes about 30 seconds as well, but after that it pops up quickly, in just a few seconds. The app/project startup is not an issue either. The first run was slow, but the consecutive restarts are almost instantaneous, like they were in Godot 3.4.

For the record, that newer MacBook has the following spec:

MacBook Pro (Retina Mid 2015) may seem pretty old when compared to it, but it was discontinued not that long ago. Apple was selling those Retina laptops until 2018.

Oh, and by the way... I've also tested Godot 4.0 alpha 1 and Godot 4.0 alpha 9 with my older MacBook Pro (Retina) as well. Unfortunately, there is no difference between those alphas and the current one when it comes to the slow startup issue. I've cleared all Godot-related files (including system caches and other files mentioned in the Homebrew formula's "zap trash") and restarted the OS (macOS loads some caches during every system start) before every new alpha version installation, so I guess all these benchmarks are valid.

I don't have experience in Vulkan API (I've done some quick tutorial once) and I haven't written any serious C++ for ages now, but if there's any way I could help with this issue - please let me know. I will try my best. Thanks!

RabbitB commented 2 years ago

I think we should profile the startup and see if there are any easy gains or fixes to be made there. Gut feeling says the new renderer alone shouldn't add that much startup time, especially not under Vulkan.

Calinou commented 2 years ago

I ran benchmarks of all versions since Godot 3.0.1 up to 4.0.alpha10:

Specifications

OS: Fedora 36
CPU: Intel Core i7-6700K @ 4.4 GHz
RAM: 2×16 GB DDR4-3000 (dual channel)
GPU: NVIDIA GeForce GTX 1080 (NVIDIA 510.68.02)
SSD: Samsung 850 EVO 1 TB

Using official Godot Linux x86_64 release binaries downloaded from TuxFamily. An average of 25 runs is performed.

Benchmark source and data files (CSV, JSON, Markdown) are available at: https://github.com/Calinou/godot-startup-times
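
For readers who want to take a comparable measurement on their own machine, here is a minimal Python sketch of the "repeat N times and summarize" idea. It is not the script from the linked repository. The helper names `time_command` and `summarize` are made up for illustration, the command is a placeholder, and treating the ± figure as a sample standard deviation is an assumption.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs):
    """Run `cmd` `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce per-run durations to mean, spread, min and max."""
    return {
        "mean": statistics.mean(durations),
        # Assumption: the spread is reported as a sample standard deviation.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
        "min": min(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; substitute the engine
    # binary and whatever start-and-exit arguments you want to measure.
    cmd = [sys.executable, "-c", "pass"]
    stats = summarize(time_command(cmd, runs=5))
    print(
        f"{stats['mean']:.3f} ± {stats['stdev']:.3f} s "
        f"(min {stats['min']:.3f}, max {stats['max']:.3f})"
    )
```

For a cold measurement, the editor data and project folder would also have to be removed before each run, which this sketch leaves out.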

Results

Cold project manager start + exit

editor_data/ and project folder are removed before every run.

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `Godot_v3.0.1-stable` | 1.150 ± 0.003 | 1.145 | 1.158 | 1.23 ± 0.03 |
| `Godot_v3.0.2-stable` | 1.151 ± 0.004 | 1.146 | 1.160 | 1.23 ± 0.03 |
| `Godot_v3.0.3-stable` | 1.144 ± 0.003 | 1.139 | 1.150 | 1.23 ± 0.03 |
| `Godot_v3.0.4-stable` | 1.144 ± 0.003 | 1.138 | 1.153 | 1.23 ± 0.03 |
| `Godot_v3.0.5-stable` | 1.146 ± 0.004 | 1.140 | 1.156 | 1.23 ± 0.03 |
| `Godot_v3.0.6-stable` | 1.145 ± 0.004 | 1.140 | 1.160 | 1.23 ± 0.03 |
| `Godot_v3.1-stable` | 1.212 ± 0.010 | 1.201 | 1.240 | 1.30 ± 0.03 |
| `Godot_v3.1.1-stable` | 1.209 ± 0.008 | 1.202 | 1.242 | 1.30 ± 0.03 |
| `Godot_v3.1.2-stable` | 1.210 ± 0.004 | 1.204 | 1.219 | 1.30 ± 0.03 |
| `Godot_v3.2-stable` | 1.217 ± 0.013 | 1.201 | 1.271 | 1.30 ± 0.03 |
| `Godot_v3.2.1-stable` | 1.205 ± 0.008 | 1.197 | 1.230 | 1.29 ± 0.03 |
| `Godot_v3.2.2-stable` | 1.210 ± 0.012 | 1.195 | 1.239 | 1.30 ± 0.03 |
| `Godot_v3.2.3-stable` | 1.208 ± 0.012 | 1.192 | 1.231 | 1.29 ± 0.03 |
| `Godot_v3.3-stable` | 1.860 ± 0.405 | 1.404 | 2.222 | 1.99 ± 0.44 |
| `Godot_v3.3.1-stable` | 1.793 ± 0.406 | 1.403 | 2.218 | 1.92 ± 0.44 |
| `Godot_v3.3.2-stable` | 1.752 ± 0.415 | 1.356 | 2.220 | 1.88 ± 0.45 |
| `Godot_v3.3.3-stable` | 1.771 ± 0.430 | 1.356 | 2.216 | 1.90 ± 0.46 |
| `Godot_v3.3.4-stable` | 1.857 ± 0.405 | 1.404 | 2.217 | 1.99 ± 0.44 |
| `Godot_v3.4-stable` | 1.925 ± 0.427 | 1.308 | 2.225 | 2.06 ± 0.46 |
| `Godot_v3.4.1-stable` | 1.815 ± 0.456 | 1.305 | 2.220 | 1.95 ± 0.49 |
| `Godot_v3.4.2-stable` | 1.656 ± 0.428 | 1.305 | 2.224 | 1.78 ± 0.46 |
| `Godot_v3.4.3-stable` | 1.888 ± 0.440 | 1.305 | 2.221 | 2.02 ± 0.47 |
| `Godot_v3.4.4-stable` | 1.707 ± 0.455 | 1.306 | 2.218 | 1.83 ± 0.49 |
| `Godot_v3.5-beta1` | 1.852 ± 0.452 | 1.307 | 2.225 | 1.98 ± 0.49 |
| `Godot_v3.5-beta2` | 1.744 ± 0.458 | 1.306 | 2.214 | 1.87 ± 0.49 |
| `Godot_v3.5-beta3` | 1.900 ± 0.307 | 1.603 | 2.222 | 2.04 ± 0.33 |
| `Godot_v3.5-beta4` | 1.948 ± 0.305 | 1.606 | 2.225 | 2.09 ± 0.33 |
| `Godot_v3.5-beta5` | 1.900 ± 0.308 | 1.606 | 2.235 | 2.04 ± 0.33 |
| `Godot_v3.5-rc1` | 1.877 ± 0.305 | 1.605 | 2.222 | 2.01 ± 0.33 |
| `Godot_v3.5-rc2` | 1.899 ± 0.307 | 1.603 | 2.225 | 2.03 ± 0.33 |
| `Godot_v3.5-rc3` | 1.902 ± 0.304 | 1.605 | 2.217 | 2.04 ± 0.33 |
| `Godot_v3.5-rc4` | 2.118 ± 0.309 | 1.773 | 2.430 | 2.27 ± 0.33 |
| `Godot_v4.0-dev.20210727` | 0.987 ± 0.022 | 0.960 | 1.054 | 1.06 ± 0.03 |
| `Godot_v4.0-dev.20210811` | 0.996 ± 0.017 | 0.970 | 1.035 | 1.07 ± 0.03 |
| `Godot_v4.0-dev.20210820` | 0.994 ± 0.028 | 0.974 | 1.124 | 1.07 ± 0.04 |
| `Godot_v4.0-dev.20210916` | 1.001 ± 0.016 | 0.974 | 1.052 | 1.07 ± 0.03 |
| `Godot_v4.0-dev.20210924` | 1.047 ± 0.037 | 0.993 | 1.147 | 1.12 ± 0.05 |
| `Godot_v4.0-dev.20211004` | 1.067 ± 0.014 | 1.048 | 1.103 | 1.14 ± 0.03 |
| `Godot_v4.0-dev.20211015` | 1.079 ± 0.017 | 1.060 | 1.122 | 1.16 ± 0.03 |
| `Godot_v4.0-dev.20211027` | 0.955 ± 0.008 | 0.937 | 0.973 | 1.02 ± 0.02 |
| `Godot_v4.0-dev.20211108` | 0.944 ± 0.017 | 0.916 | 0.988 | 1.01 ± 0.03 |
| `Godot_v4.0-dev.20211117` | 0.946 ± 0.014 | 0.925 | 0.984 | 1.01 ± 0.03 |
| `Godot_v4.0-dev.20211210` | 0.961 ± 0.013 | 0.936 | 0.989 | 1.03 ± 0.03 |
| `Godot_v4.0-dev.20220105` | 0.987 ± 0.015 | 0.961 | 1.012 | 1.06 ± 0.03 |
| `Godot_v4.0-dev.20220118` | 0.995 ± 0.018 | 0.957 | 1.040 | 1.07 ± 0.03 |
| `Godot_v4.0-alpha1` | 0.983 ± 0.014 | 0.955 | 1.020 | 1.05 ± 0.03 |
| `Godot_v4.0-alpha2` | 0.933 ± 0.020 | 0.903 | 1.002 | **1.00 (fastest)** |
| `Godot_v4.0-alpha3` | 1.063 ± 0.024 | 1.005 | 1.092 | 1.14 ± 0.04 |
| `Godot_v4.0-alpha4` | 1.073 ± 0.021 | 1.037 | 1.121 | 1.15 ± 0.03 |
| `Godot_v4.0-alpha5` | 1.067 ± 0.017 | 1.039 | 1.112 | 1.14 ± 0.03 |
| `Godot_v4.0-alpha6` | 1.496 ± 0.016 | 1.469 | 1.531 | 1.60 ± 0.04 |
| `Godot_v4.0-alpha7` | 1.447 ± 0.014 | 1.422 | 1.481 | 1.55 ± 0.04 |
| `Godot_v4.0-alpha8` | 1.451 ± 0.017 | 1.428 | 1.493 | 1.55 ± 0.04 |
| `Godot_v4.0-alpha9` | 1.460 ± 0.016 | 1.440 | 1.500 | 1.56 ± 0.04 |
| `Godot_v4.0-alpha10` | 1.934 ± 0.023 | 1.910 | 2.001 | 2.07 ± 0.05 |

Warm project manager start + exit

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `Godot_v3.0.1-stable` | 1.588 ± 0.507 | 1.140 | 2.153 | 2.86 ± 0.93 |
| `Godot_v3.0.2-stable` | 1.431 ± 0.458 | 1.143 | 2.155 | 2.58 ± 0.84 |
| `Godot_v3.0.3-stable` | 1.942 ± 0.409 | 1.135 | 2.154 | 3.50 ± 0.76 |
| `Godot_v3.0.4-stable` | 1.904 ± 0.435 | 1.139 | 2.151 | 3.43 ± 0.81 |
| `Godot_v3.0.5-stable` | 1.744 ± 0.498 | 1.139 | 2.150 | 3.14 ± 0.91 |
| `Godot_v3.0.6-stable` | 1.864 ± 0.459 | 1.138 | 2.151 | 3.36 ± 0.85 |
| `Godot_v3.1-stable` | 1.498 ± 0.454 | 1.208 | 2.217 | 2.70 ± 0.83 |
| `Godot_v3.1.1-stable` | 1.305 ± 0.275 | 1.209 | 2.221 | 2.35 ± 0.51 |
| `Godot_v3.1.2-stable` | 1.266 ± 0.204 | 1.205 | 2.244 | 2.28 ± 0.39 |
| `Godot_v3.2-stable` | 1.215 ± 0.012 | 1.199 | 1.250 | 2.19 ± 0.13 |
| `Godot_v3.2.1-stable` | 1.216 ± 0.012 | 1.201 | 1.249 | 2.19 ± 0.13 |
| `Godot_v3.2.2-stable` | 1.209 ± 0.011 | 1.196 | 1.238 | 2.18 ± 0.12 |
| `Godot_v3.2.3-stable` | 1.213 ± 0.012 | 1.196 | 1.238 | 2.18 ± 0.13 |
| `Godot_v3.3-stable` | 1.857 ± 0.404 | 1.406 | 2.215 | 3.34 ± 0.75 |
| `Godot_v3.3.1-stable` | 1.922 ± 0.391 | 1.403 | 2.220 | 3.46 ± 0.73 |
| `Godot_v3.3.2-stable` | 1.911 ± 0.405 | 1.358 | 2.213 | 3.44 ± 0.75 |
| `Godot_v3.3.3-stable` | 1.776 ± 0.426 | 1.358 | 2.228 | 3.20 ± 0.79 |
| `Godot_v3.3.4-stable` | 1.889 ± 0.400 | 1.405 | 2.217 | 3.40 ± 0.75 |
| `Godot_v3.4-stable` | 1.743 ± 0.457 | 1.306 | 2.219 | 3.14 ± 0.84 |
| `Godot_v3.4.1-stable` | 1.853 ± 0.449 | 1.305 | 2.221 | 3.34 ± 0.83 |
| `Godot_v3.4.2-stable` | 1.745 ± 0.457 | 1.306 | 2.223 | 3.14 ± 0.84 |
| `Godot_v3.4.3-stable` | 1.779 ± 0.459 | 1.306 | 2.219 | 3.20 ± 0.85 |
| `Godot_v3.4.4-stable` | 1.742 ± 0.459 | 1.307 | 2.218 | 3.14 ± 0.85 |
| `Godot_v3.5-beta1` | 1.888 ± 0.441 | 1.307 | 2.218 | 3.40 ± 0.82 |
| `Godot_v3.5-beta2` | 1.747 ± 0.457 | 1.307 | 2.226 | 3.15 ± 0.84 |
| `Godot_v3.5-beta3` | 1.949 ± 0.305 | 1.604 | 2.220 | 3.51 ± 0.58 |
| `Godot_v3.5-beta4` | 1.925 ± 0.306 | 1.606 | 2.218 | 3.47 ± 0.58 |
| `Godot_v3.5-beta5` | 1.924 ± 0.306 | 1.607 | 2.222 | 3.46 ± 0.58 |
| `Godot_v3.5-rc1` | 1.948 ± 0.305 | 1.605 | 2.219 | 3.51 ± 0.58 |
| `Godot_v3.5-rc2` | 1.902 ± 0.308 | 1.606 | 2.228 | 3.42 ± 0.59 |
| `Godot_v3.5-rc3` | 1.853 ± 0.301 | 1.608 | 2.221 | 3.34 ± 0.57 |
| `Godot_v3.5-rc4` | 2.173 ± 0.299 | 1.779 | 2.446 | 3.91 ± 0.58 |
| `Godot_v4.0-dev.20210727` | 0.600 ± 0.018 | 0.588 | 0.665 | 1.08 ± 0.07 |
| `Godot_v4.0-dev.20210811` | 0.600 ± 0.010 | 0.588 | 0.637 | 1.08 ± 0.06 |
| `Godot_v4.0-dev.20210820` | 0.610 ± 0.019 | 0.595 | 0.679 | 1.10 ± 0.07 |
| `Godot_v4.0-dev.20210916` | 0.614 ± 0.027 | 0.596 | 0.690 | 1.11 ± 0.08 |
| `Godot_v4.0-dev.20210924` | 0.660 ± 0.013 | 0.645 | 0.688 | 1.19 ± 0.07 |
| `Godot_v4.0-dev.20211004` | 0.652 ± 0.005 | 0.646 | 0.669 | 1.17 ± 0.07 |
| `Godot_v4.0-dev.20211015` | 0.660 ± 0.013 | 0.646 | 0.688 | 1.19 ± 0.07 |
| `Godot_v4.0-dev.20211027` | 0.574 ± 0.027 | 0.551 | 0.641 | 1.03 ± 0.08 |
| `Godot_v4.0-dev.20211108` | 0.558 ± 0.012 | 0.551 | 0.610 | 1.00 ± 0.06 |
| `Godot_v4.0-dev.20211117` | 0.572 ± 0.026 | 0.551 | 0.629 | 1.03 ± 0.07 |
| `Godot_v4.0-dev.20211210` | 0.569 ± 0.020 | 0.554 | 0.639 | 1.02 ± 0.07 |
| `Godot_v4.0-dev.20220105` | 0.585 ± 0.030 | 0.561 | 0.652 | 1.05 ± 0.08 |
| `Godot_v4.0-dev.20220118` | 0.571 ± 0.012 | 0.562 | 0.621 | 1.03 ± 0.06 |
| `Godot_v4.0-alpha1` | 0.573 ± 0.013 | 0.562 | 0.621 | 1.03 ± 0.06 |
| `Godot_v4.0-alpha2` | 0.555 ± 0.031 | 0.517 | 0.605 | **1.00 (fastest)** |
| `Godot_v4.0-alpha3` | 0.635 ± 0.008 | 0.624 | 0.658 | 1.14 ± 0.07 |
| `Godot_v4.0-alpha4` | 0.635 ± 0.009 | 0.624 | 0.666 | 1.14 ± 0.07 |
| `Godot_v4.0-alpha5` | 0.637 ± 0.013 | 0.623 | 0.673 | 1.15 ± 0.07 |
| `Godot_v4.0-alpha6` | 1.038 ± 0.013 | 1.026 | 1.085 | 1.87 ± 0.11 |
| `Godot_v4.0-alpha7` | 0.988 ± 0.010 | 0.976 | 1.018 | 1.78 ± 0.10 |
| `Godot_v4.0-alpha8` | 0.993 ± 0.022 | 0.976 | 1.063 | 1.79 ± 0.11 |
| `Godot_v4.0-alpha9` | 0.991 ± 0.009 | 0.981 | 1.012 | 1.79 ± 0.10 |
| `Godot_v4.0-alpha10` | 1.353 ± 0.014 | 1.340 | 1.399 | 2.44 ± 0.14 |

Cold editor start + exit

editor_data/ and project folder are removed before every run.

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `Godot_v3.0.1-stable` | 2.156 ± 0.005 | 2.149 | 2.170 | 1.00 ± 0.00 |
| `Godot_v3.0.2-stable` | 2.156 ± 0.004 | 2.150 | 2.166 | 1.00 ± 0.00 |
| `Godot_v3.0.3-stable` | 2.150 ± 0.005 | 2.145 | 2.171 | 1.00 ± 0.00 |
| `Godot_v3.0.4-stable` | 2.150 ± 0.004 | 2.144 | 2.157 | 1.00 ± 0.00 |
| `Godot_v3.0.5-stable` | 2.150 ± 0.004 | 2.145 | 2.159 | 1.00 ± 0.00 |
| `Godot_v3.0.6-stable` | 2.150 ± 0.003 | 2.145 | 2.161 | **1.00 (fastest)** |
| `Godot_v3.1-stable` | 2.219 ± 0.004 | 2.215 | 2.232 | 1.03 ± 0.00 |
| `Godot_v3.1.1-stable` | 2.217 ± 0.005 | 2.212 | 2.236 | 1.03 ± 0.00 |
| `Godot_v3.1.2-stable` | 2.229 ± 0.005 | 2.221 | 2.239 | 1.04 ± 0.00 |
| `Godot_v3.2-stable` | 2.219 ± 0.006 | 2.207 | 2.229 | 1.03 ± 0.00 |
| `Godot_v3.2.1-stable` | 2.218 ± 0.003 | 2.208 | 2.222 | 1.03 ± 0.00 |
| `Godot_v3.2.2-stable` | 2.220 ± 0.004 | 2.214 | 2.230 | 1.03 ± 0.00 |
| `Godot_v3.2.3-stable` | 2.731 ± 0.005 | 2.717 | 2.740 | 1.27 ± 0.00 |
| `Godot_v3.3-stable` | 3.812 ± 0.026 | 3.786 | 3.857 | 1.77 ± 0.01 |
| `Godot_v3.3.1-stable` | 3.798 ± 0.021 | 3.744 | 3.852 | 1.77 ± 0.01 |
| `Godot_v3.3.2-stable` | 3.794 ± 0.030 | 3.686 | 3.840 | 1.77 ± 0.01 |
| `Godot_v3.3.3-stable` | 3.769 ± 0.167 | 2.996 | 3.855 | 1.75 ± 0.08 |
| `Godot_v3.3.4-stable` | 3.758 ± 0.186 | 2.893 | 3.850 | 1.75 ± 0.09 |
| `Godot_v3.4-stable` | 3.811 ± 0.051 | 3.693 | 3.897 | 1.77 ± 0.02 |
| `Godot_v3.4.1-stable` | 3.758 ± 0.235 | 2.991 | 3.893 | 1.75 ± 0.11 |
| `Godot_v3.4.2-stable` | 3.783 ± 0.170 | 2.996 | 3.891 | 1.76 ± 0.08 |
| `Godot_v3.4.3-stable` | 3.826 ± 0.045 | 3.739 | 3.894 | 1.78 ± 0.02 |
| `Godot_v3.4.4-stable` | 3.827 ± 0.055 | 3.690 | 3.899 | 1.78 ± 0.03 |
| `Godot_v3.5-beta1` | 3.796 ± 0.174 | 2.997 | 3.892 | 1.77 ± 0.08 |
| `Godot_v3.5-beta2` | 3.875 ± 0.036 | 3.750 | 3.907 | 1.80 ± 0.02 |
| `Godot_v3.5-beta3` | 4.154 ± 0.046 | 3.997 | 4.206 | 1.93 ± 0.02 |
| `Godot_v3.5-beta4` | 3.083 ± 0.200 | 2.141 | 3.141 | 1.43 ± 0.09 |
| `Godot_v3.5-beta5` | 3.053 ± 0.262 | 2.189 | 3.150 | 1.42 ± 0.12 |
| `Godot_v3.5-rc1` | 3.053 ± 0.270 | 2.134 | 3.147 | 1.42 ± 0.13 |
| `Godot_v3.5-rc2` | 3.086 ± 0.193 | 2.185 | 3.145 | 1.44 ± 0.09 |
| `Godot_v3.5-rc3` | 3.373 ± 0.116 | 3.132 | 3.446 | 1.57 ± 0.05 |
| `Godot_v3.5-rc4` | 3.567 ± 0.172 | 2.841 | 3.661 | 1.66 ± 0.08 |
| `Godot_v4.0-dev.20210727` | 4.222 ± 0.380 | 3.798 | 4.726 | 1.96 ± 0.18 |
| `Godot_v4.0-dev.20210811` | 4.217 ± 0.370 | 3.806 | 4.741 | 1.96 ± 0.17 |
| `Godot_v4.0-dev.20210820` | 4.132 ± 0.357 | 3.427 | 4.502 | 1.92 ± 0.17 |
| `Godot_v4.0-dev.20210916` | 4.268 ± 0.327 | 3.815 | 4.778 | 1.99 ± 0.15 |
| `Godot_v4.0-dev.20210924` | 4.267 ± 0.365 | 3.833 | 4.813 | 1.98 ± 0.17 |
| `Godot_v4.0-dev.20211004` | 4.105 ± 0.340 | 3.819 | 4.827 | 1.91 ± 0.16 |
| `Godot_v4.0-dev.20211015` | 4.497 ± 0.378 | 3.889 | 4.830 | 2.09 ± 0.18 |
| `Godot_v4.0-dev.20211027` | 4.202 ± 0.366 | 3.831 | 4.808 | 1.95 ± 0.17 |
| `Godot_v4.0-dev.20211108` | 4.196 ± 0.373 | 3.849 | 4.802 | 1.95 ± 0.17 |
| `Godot_v4.0-dev.20211117` | 4.312 ± 0.442 | 3.828 | 4.808 | 2.01 ± 0.21 |
| `Godot_v4.0-dev.20211210` | 4.358 ± 0.439 | 3.877 | 4.876 | 2.03 ± 0.20 |
| `Godot_v4.0-dev.20220105` | 4.389 ± 0.440 | 3.775 | 4.898 | 2.04 ± 0.20 |
| `Godot_v4.0-dev.20220118` | 4.374 ± 0.409 | 3.853 | 4.807 | 2.03 ± 0.19 |
| `Godot_v4.0-alpha1` | 4.328 ± 0.423 | 3.829 | 4.792 | 2.01 ± 0.20 |
| `Godot_v4.0-alpha2` | 4.493 ± 0.387 | 3.903 | 4.807 | 2.09 ± 0.18 |
| `Godot_v4.0-alpha3` | 4.493 ± 0.376 | 3.975 | 4.839 | 2.09 ± 0.18 |
| `Godot_v4.0-alpha4` | 4.394 ± 0.410 | 3.736 | 4.841 | 2.04 ± 0.19 |
| `Godot_v4.0-alpha5` | 4.400 ± 0.371 | 3.913 | 4.807 | 2.05 ± 0.17 |
| `Godot_v4.0-alpha6` | 4.511 ± 0.319 | 4.112 | 4.916 | 2.10 ± 0.15 |
| `Godot_v4.0-alpha7` | 4.547 ± 0.347 | 4.087 | 4.918 | 2.11 ± 0.16 |
| `Godot_v4.0-alpha8` | 4.623 ± 0.342 | 4.130 | 4.912 | 2.15 ± 0.16 |
| `Godot_v4.0-alpha9` | 4.562 ± 0.365 | 4.174 | 5.174 | 2.12 ± 0.17 |
| `Godot_v4.0-alpha10` | 5.928 ± 0.415 | 5.389 | 6.325 | 2.76 ± 0.19 |

Warm editor start + exit

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `Godot_v3.0.1-stable` | 2.158 ± 0.003 | 2.154 | 2.168 | 1.00 ± 0.00 |
| `Godot_v3.0.2-stable` | 2.159 ± 0.004 | 2.151 | 2.167 | 1.00 ± 0.00 |
| `Godot_v3.0.3-stable` | 2.155 ± 0.004 | 2.150 | 2.164 | 1.00 ± 0.00 |
| `Godot_v3.0.4-stable` | 2.154 ± 0.008 | 2.145 | 2.187 | 1.00 ± 0.00 |
| `Godot_v3.0.5-stable` | 2.152 ± 0.004 | 2.145 | 2.163 | 1.00 ± 0.00 |
| `Godot_v3.0.6-stable` | 2.151 ± 0.003 | 2.146 | 2.161 | **1.00 (fastest)** |
| `Godot_v3.1-stable` | 2.218 ± 0.004 | 2.212 | 2.229 | 1.03 ± 0.00 |
| `Godot_v3.1.1-stable` | 2.219 ± 0.004 | 2.213 | 2.226 | 1.03 ± 0.00 |
| `Godot_v3.1.2-stable` | 2.231 ± 0.004 | 2.223 | 2.241 | 1.04 ± 0.00 |
| `Godot_v3.2-stable` | 2.219 ± 0.006 | 2.208 | 2.232 | 1.03 ± 0.00 |
| `Godot_v3.2.1-stable` | 2.220 ± 0.004 | 2.215 | 2.232 | 1.03 ± 0.00 |
| `Godot_v3.2.2-stable` | 2.221 ± 0.004 | 2.215 | 2.232 | 1.03 ± 0.00 |
| `Godot_v3.2.3-stable` | 2.733 ± 0.008 | 2.724 | 2.760 | 1.27 ± 0.00 |
| `Godot_v3.3-stable` | 3.798 ± 0.050 | 3.689 | 3.856 | 1.77 ± 0.02 |
| `Godot_v3.3.1-stable` | 3.797 ± 0.027 | 3.744 | 3.853 | 1.76 ± 0.01 |
| `Godot_v3.3.2-stable` | 3.760 ± 0.155 | 3.041 | 3.849 | 1.75 ± 0.07 |
| `Godot_v3.3.3-stable` | 3.802 ± 0.033 | 3.684 | 3.851 | 1.77 ± 0.02 |
| `Godot_v3.3.4-stable` | 3.787 ± 0.040 | 3.691 | 3.848 | 1.76 ± 0.02 |
| `Godot_v3.4-stable` | 3.797 ± 0.065 | 3.689 | 3.896 | 1.76 ± 0.03 |
| `Godot_v3.4.1-stable` | 3.831 ± 0.034 | 3.748 | 3.896 | 1.78 ± 0.02 |
| `Godot_v3.4.2-stable` | 3.827 ± 0.047 | 3.695 | 3.900 | 1.78 ± 0.02 |
| `Godot_v3.4.3-stable` | 3.793 ± 0.067 | 3.692 | 3.904 | 1.76 ± 0.03 |
| `Godot_v3.4.4-stable` | 3.781 ± 0.175 | 2.999 | 3.904 | 1.76 ± 0.08 |
| `Godot_v3.5-beta1` | 3.790 ± 0.166 | 3.049 | 3.906 | 1.76 ± 0.08 |
| `Godot_v3.5-beta2` | 3.876 ± 0.039 | 3.742 | 3.915 | 1.80 ± 0.02 |
| `Godot_v3.5-beta3` | 4.134 ± 0.189 | 3.233 | 4.206 | 1.92 ± 0.09 |
| `Godot_v3.5-beta4` | 3.117 ± 0.050 | 2.982 | 3.152 | 1.45 ± 0.02 |
| `Godot_v3.5-beta5` | 3.081 ± 0.187 | 2.217 | 3.184 | 1.43 ± 0.09 |
| `Godot_v3.5-rc1` | 3.132 ± 0.011 | 3.086 | 3.146 | 1.46 ± 0.01 |
| `Godot_v3.5-rc2` | 3.134 ± 0.005 | 3.124 | 3.146 | 1.46 ± 0.00 |
| `Godot_v3.5-rc3` | 3.360 ± 0.146 | 2.985 | 3.450 | 1.56 ± 0.07 |
| `Godot_v3.5-rc4` | 3.545 ± 0.160 | 3.230 | 3.726 | 1.65 ± 0.07 |
| `Godot_v4.0-dev.20210727` | 3.421 ± 0.408 | 2.872 | 3.801 | 1.59 ± 0.19 |
| `Godot_v4.0-dev.20210811` | 3.371 ± 0.390 | 2.887 | 3.832 | 1.57 ± 0.18 |
| `Godot_v4.0-dev.20210820` | 3.356 ± 0.422 | 2.805 | 3.820 | 1.56 ± 0.20 |
| `Godot_v4.0-dev.20210916` | 3.419 ± 0.385 | 2.925 | 3.804 | 1.59 ± 0.18 |
| `Godot_v4.0-dev.20210924` | 3.380 ± 0.389 | 2.788 | 3.815 | 1.57 ± 0.18 |
| `Godot_v4.0-dev.20211004` | 3.469 ± 0.370 | 2.954 | 3.815 | 1.61 ± 0.17 |
| `Godot_v4.0-dev.20211015` | 3.414 ± 0.375 | 2.954 | 3.813 | 1.59 ± 0.17 |
| `Godot_v4.0-dev.20211027` | 3.408 ± 0.398 | 2.905 | 3.856 | 1.58 ± 0.19 |
| `Godot_v4.0-dev.20211108` | 3.309 ± 0.397 | 2.869 | 3.859 | 1.54 ± 0.18 |
| `Godot_v4.0-dev.20211117` | 3.437 ± 0.400 | 2.911 | 3.851 | 1.60 ± 0.19 |
| `Godot_v4.0-dev.20211210` | 3.589 ± 0.376 | 2.946 | 3.898 | 1.67 ± 0.17 |
| `Godot_v4.0-dev.20220105` | 3.480 ± 0.386 | 2.967 | 3.896 | 1.62 ± 0.18 |
| `Godot_v4.0-dev.20220118` | 3.259 ± 0.337 | 2.920 | 3.795 | 1.51 ± 0.16 |
| `Godot_v4.0-alpha1` | 3.233 ± 0.375 | 2.772 | 3.821 | 1.50 ± 0.17 |
| `Godot_v4.0-alpha2` | 3.307 ± 0.352 | 2.895 | 3.778 | 1.54 ± 0.16 |
| `Godot_v4.0-alpha3` | 3.458 ± 0.377 | 2.925 | 3.842 | 1.61 ± 0.18 |
| `Godot_v4.0-alpha4` | 3.402 ± 0.355 | 2.911 | 3.846 | 1.58 ± 0.17 |
| `Godot_v4.0-alpha5` | 3.352 ± 0.386 | 2.762 | 3.810 | 1.56 ± 0.18 |
| `Godot_v4.0-alpha6` | 3.571 ± 0.384 | 2.958 | 3.895 | 1.66 ± 0.18 |
| `Godot_v4.0-alpha7` | 3.340 ± 0.377 | 2.932 | 3.890 | 1.55 ± 0.18 |
| `Godot_v4.0-alpha8` | 3.359 ± 0.371 | 2.935 | 3.892 | 1.56 ± 0.17 |
| `Godot_v4.0-alpha9` | 3.574 ± 0.452 | 2.823 | 4.192 | 1.66 ± 0.21 |
| `Godot_v4.0-alpha10` | 4.054 ± 0.414 | 3.510 | 4.509 | 1.88 ± 0.19 |

Conclusion

The main takeaway is that 4.0.alpha10 is much slower to start up and shut down compared to 4.0.alpha9, especially in cold runs. This suggests that the shader cache has more work to do since more shader variants have to be compiled, presumably due to TAA. cc @JFonS

4.0.alpha6 also has a noticeable regression in startup/shutdown speed compared to 4.0.alpha5 (both cold and warm), but mainly in the project manager.
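
To put a number on the alpha9 to alpha10 regression, the mean times from the four tables above can be compared directly. This is a small Python sketch. The `MEANS` dictionary and the `slowdown_percent` helper are illustrative names, and only the mean values themselves come from the tables.

```python
# Mean start + exit times in seconds, copied from the four result tables.
MEANS = {
    "cold project manager": {"alpha9": 1.460, "alpha10": 1.934},
    "warm project manager": {"alpha9": 0.991, "alpha10": 1.353},
    "cold editor": {"alpha9": 4.562, "alpha10": 5.928},
    "warm editor": {"alpha9": 3.574, "alpha10": 4.054},
}


def slowdown_percent(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0


for scenario, means in MEANS.items():
    pct = slowdown_percent(means["alpha9"], means["alpha10"])
    extra = means["alpha10"] - means["alpha9"]
    print(f"{scenario}: +{pct:.1f}% (+{extra:.3f} s)")
```

On these means, alpha10 comes out about 32% slower for the cold project manager, 37% for the warm project manager, 30% for the cold editor and 13% for the warm editor. In absolute terms the largest increase is the cold editor start, at roughly 1.4 seconds.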

filipworksdev commented 2 years ago

I'm on Windows and I get close to 10 seconds of boot time, which is significantly slower than 3.x. Also, the entire experience is very sluggish until I turn off multi-window mode; then it is fine.

Nukiloco commented 2 years ago

For me, startup usually takes around 30 seconds on 4.0. Before, on 3.3, it launched in under a couple of seconds.

filipworksdev commented 2 years ago

My issue specifically has to do with Iris Xe driver multi-window and Godot 4 compatibility on Windows. Should I make a new post?

SonnyBonds commented 2 years ago

I have this problem as well and looked into it a bit. I don't have a solution, but my finding so far is that one call to vkCreateGraphicsPipelines (for the shader ClusterRenderShaderRD:0) takes 30s. What happens is that the separate MTLCompilerService process crashes, and the caller times out and continues after 10 seconds. (It does 3 retries, however, hence a 30s total delay.) So it doesn't seem to be a shader cache performance issue per se, like the one @Calinou investigated, but a bug, possibly in the MoltenVK/Metal backend.
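
The arithmetic behind the 30s figure can be illustrated with a toy model. This is only a Python sketch of the observed behaviour, not code from MoltenVK, Metal or Godot. The names `compile_with_retries` and `crashing_compiler` are made up, and it assumes three attempts of 10 seconds each, which is what a 30s total implies.

```python
from typing import Callable, Tuple


def compile_with_retries(
    compile_once: Callable[[float], Tuple[bool, float]],
    timeout_s: float,
    attempts: int,
) -> Tuple[bool, float]:
    """Try `compile_once` up to `attempts` times; return (succeeded, seconds waited)."""
    waited = 0.0
    for _ in range(attempts):
        ok, elapsed = compile_once(timeout_s)
        waited += elapsed
        if ok:
            return True, waited
    return False, waited


def crashing_compiler(timeout_s: float) -> Tuple[bool, float]:
    # Stand-in for a compiler service that crashes without replying: the
    # caller only gives up when its timeout expires, so every attempt costs
    # the full timeout.
    return False, timeout_s


ok, waited = compile_with_retries(crashing_compiler, timeout_s=10.0, attempts=3)
print(ok, waited)
```

With these assumed numbers the call fails after 30 seconds of waiting in total. Because the failure repeats on every launch, a warm shader cache would not shorten it.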

Why it's crashing is a bit more unclear. The callstack of the crashed thread in MTLCompilerService is:

0   libsystem_kernel.dylib              0x7ff819438dba __abort_with_payload + 10
1   libsystem_kernel.dylib              0x7ff81943a877 abort_with_payload_wrapper_internal + 80
2   libsystem_kernel.dylib              0x7ff81943a827 abort_with_reason + 19
3   MTLCompiler                         0x7ffa32221d4f fatalErrorHandler(void*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool) + 635
4   libLLVM.dylib                       0x7ffa31e86986 llvm::report_fatal_error(llvm::Twine const&, bool) + 323
5   libLLVM.dylib                       0x7ffa31e869c0 llvm::report_fatal_error(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool) + 29
6   libIGIL-Metal.dylib                 0x7ffa295242dd IGILErrorMessageHandlers::reportUnsupportedFunctionReferenceError(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 42
7   libIGIL-Metal.dylib                 0x7ffa2954532f (anonymous namespace)::BiFInliner::runOnModule(llvm::Module&) + 3877
8   libLLVM.dylib                       0x7ffa31af3c15 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 559
9   libMTLIntelCompilerPlugin.dylib        0x10291acb9 MTLIntelCompiler::generateIGIL(llvm::Module*, MTLIntelFunctionType, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&) + 669
10  libMTLIntelCompilerPlugin.dylib        0x10290fa3b MTLCompilerBuildRequestWithOptions + 523
11  MTLCompiler                         0x7ffa3221276a MTLCompilerPluginInterface::compilerBuildRequest(bool, unsigned int, void const*, unsigned long, unsigned int, void const*, BackendCompilationOutput&) + 276
12  MTLCompiler                         0x7ffa3220cd11 MTLCompilerObject::backendCompileModule(MTLCompilerObject::BinaryRequestData&, BackendCompilationOutput&, unsigned long long, std::__1::vector<CompileTimeData, std::__1::allocator<CompileTimeData> >&) + 159
13  MTLCompiler                         0x7ffa322129f7 MTLCompilerObject::backendCompileExecutableRequest(MTLCompilerObject::BinaryRequestData&) + 517
14  MTLCompiler                         0x7ffa32221658 MTLCompilerObject::buildRequest(unsigned int, unsigned int, void const*, unsigned long, void (unsigned int, void const*, unsigned long, char const*) block_pointer) + 258
15  MTLCompiler                         0x7ffa321f822d split_stack_call + 13
16  MTLCompiler                         0x7ffa32212cef MTLCodeGenServiceBuildRequest + 265
17  MTLCompilerService                     0x1026d57e3 invocation function for block in MTLCompilerServiceHandleEvent(NSObject<OS_xpc_object>*) + 793
18  libxpc.dylib                        0x7ff819191b70 _xpc_connection_call_event_handler + 56
19  libxpc.dylib                        0x7ff819190956 _xpc_connection_mach_event + 1413
20  libdispatch.dylib                   0x7ff81929b3b1 _dispatch_client_callout4 + 9
21  libdispatch.dylib                   0x7ff8192b4041 _dispatch_mach_msg_invoke + 445
22  libdispatch.dylib                   0x7ff8192a11cd _dispatch_lane_serial_drain + 342
23  libdispatch.dylib                   0x7ff8192b4b77 _dispatch_mach_invoke + 484
24  libdispatch.dylib                   0x7ff8192a11cd _dispatch_lane_serial_drain + 342
25  libdispatch.dylib                   0x7ff8192a1dfd _dispatch_lane_invoke + 366
26  libdispatch.dylib                   0x7ff8192abeee _dispatch_workloop_worker_thread + 753
27  libsystem_pthread.dylib             0x7ff81944efd0 _pthread_wqthread + 326
28  libsystem_pthread.dylib             0x7ff81944df57 start_wqthread + 15

It seems to me like it encounters an error, but fails to properly report it and crashes instead. reportUnsupportedFunctionReferenceError seems like the biggest hint to the cause to me. I have not been able to attach to the process with a debugger to try to extract more info. (The process is a bit transient.)

While compiling it also emits [MTLCompiler] Error: undefined reference to `air_simd_is_helper_thread()' to the system log. This happens more times than for the failing call though so I'm not sure it's related.

I don't know if there are any more Vulkan/Metal logs to check, but let me know if there's some place I should have a peek.

Calinou commented 1 year ago

There's a fair chance this issue is a MoltenVK bug, which we unfortunately can't fix on our end.

okla commented 1 year ago

Developers of the Dolphin emulator experienced a MoltenVK performance regression recently; maybe this can help: https://dolphin-emu.org/blog/2022/09/13/dolphin-progress-report-july-and-august-2022/

SonnyBonds commented 1 year ago

> There's a fair chance this issue is a MoltenVK bug, which we unfortunately can't fix on our end.

Yes, unfortunately. It's maybe possible something in the shader triggers the error and can be avoided by adjusting it somehow, but the error reports I currently have available aren't that much to work with.

SonnyBonds commented 1 year ago

It does seem like the MTLCompilerService reports (or tries to report) an error. Considering the "reportUnsupportedFunctionReferenceError", my wild guess without knowing Vulkan would be that the shader uses a function that isn't supported by the MoltenVK backend or something?

If I could see the actual error output I could maybe see exactly what, or at least more error information. The error report seems to lead to an abort of the process though and I don't know if it actually writes the error output anywhere.

kdada commented 1 year ago

MacBook Pro (Retina, 15-inch, Mid 2014) has the same issue. OS: macOS Monterey 12.6.1 (21G211)

arguments
0: ./bin/godot.macos.editor.x86_64
Current path: /Users/home/dev/c/godot
Godot Engine v4.0.beta.custom_build.dc4b61659 - https://godotengine.org
Vulkan API 1.1.224 - Using Vulkan Device #0: Intel - Intel Iris Pro Graphics
[mvk-error] VK_ERROR_INITIALIZATION_FAILED: Render pipeline compile failed (Error code 2):
Compiler encountered an internal error.
ERROR: vkCreateGraphicsPipelines failed with error -3 for shader 'ClusterRenderShaderRD:0'.
   at: render_pipeline_create (drivers/vulkan/rendering_device_vulkan.cpp:6848)
[mvk-error] VK_ERROR_INITIALIZATION_FAILED: Render pipeline compile failed (Error code 2):
Compiler encountered an internal error.
ERROR: vkCreateGraphicsPipelines failed with error -3 for shader 'ClusterRenderShaderRD:0'.
   at: render_pipeline_create (drivers/vulkan/rendering_device_vulkan.cpp:6848)

The problem only occurs when opening the editor without arguments. If I start it with a project, it works fine:

Editing project: /Users/home/dev/godot-projects/study-godot
arguments
0: /Users/home/dev/c/godot/bin/godot.macos.editor.x86_64
1: --path
2: /Users/home/dev/godot-projects/study-godot
3: --editor
Current path: /Users/home/dev/c/godot
TextServer: Added interface "Dummy"
TextServer: Added interface "ICU / HarfBuzz / Graphite (Built-in)"
Godot Engine v4.0.beta.custom_build.dc4b61659 - https://godotengine.org
Text-to-Speech: AVSpeechSynthesizer initialized.
Vulkan devices:
  #0: Intel Intel Iris Pro Graphics - Supported, Integrated
Vulkan API 1.1.224 - Using Vulkan Device #0: Intel - Intel Iris Pro Graphics
- Vulkan Variable Rate Shading not supported
- Vulkan multiview supported:
  max view count: 32
  max instances: 134217727
- Vulkan subgroup:
  size: 1
  stages: STAGE_TESSELLATION_CONTROL, STAGE_FRAGMENT, STAGE_COMPUTE
  supported ops: FEATURE_BASIC
Using present mode: VK_PRESENT_MODE_FIFO_KHR
Using present mode: VK_PRESENT_MODE_FIFO_KHR
Using present mode: VK_PRESENT_MODE_FIFO_KHR
Using "default" pen tablet driver...
Creating VMA small objects pool for memory type index 0
Shader 'CanvasSdfShaderRD' SHA256: c2b8e430fdb6bc4f9dae9edb9ece49ed09d5e1a0a72839e0b528693b38dc825a
Shader 'SkeletonShaderRD' SHA256: 49219ef3611affa34d8667e7d5e4e9b6cd590a574bb30a94d830339b7fa0f209
Shader 'ParticlesShaderRD' SHA256: 54d92ed371ee9c28fc72fcc586d3fe7da7293912dc78b205dcce92a9ed0b47a2
Shader 'ParticlesCopyShaderRD' SHA256: 02cc2dd608a6a366cda7ba880285d0496388a7a768ac3944d53bdb03e1571647
Shader 'CanvasShaderRD' SHA256: cd7448c742ea0f82dc39da02b3dcceae6a9056ec0511ee38b855d8b771955770
Shader 'CanvasOcclusionShaderRD' SHA256: 86f7bebdee38de285be45366523c21227ce5a7f2e96000cb877e8f3c8ba84394
Shader 'SceneForwardMobileShaderRD' SHA256: cbec962cb9e9757757c60b669bb5119901d09e7390eb067b44d4a428ba53b83a
Shader 'SkyShaderRD' SHA256: 23299e9bcb1dc738340eb8ff1ab9c45ba4098198f0b01a34494e4bd89158605a
Shader 'BokehDofRasterShaderRD' SHA256: f52d885ad3555dd177ac12bc9f8219b9194c00bb8ede87652f48ae75f93a3c00
Shader 'BlurRasterShaderRD' SHA256: 3dbe26b0efa519964415a64dcf3e59bf566741238929192b15530af79eea82f6
Shader 'CopyToFbShaderRD' SHA256: 0224e16dee0e0f9ebb10f0af3949bcd4aa4ce30a45442874a73edc8a006c95eb
Shader 'CubeToDpShaderRD' SHA256: e2651a294ae637b9467b915c975c23c80c352f9e625c0920c222e0333df2bff9
Shader 'CubemapDownsamplerRasterShaderRD' SHA256: a312c4287cb023c4beb6be334165c0a1c3f00bf38a7d26a39c34f707a175b0a1
Shader 'CubemapFilterRasterShaderRD' SHA256: 8383af97193f44c82a0479da18ac1e6bd1f10f4806877b0f675ae901f0207826
Shader 'CubemapRoughnessRasterShaderRD' SHA256: 32d1ed8b0757ac29e232f75197eead65b037d76642e0be2c7f3b844cbad5874b
Shader 'SpecularMergeShaderRD' SHA256: 16dbc1d4283379fc744d3e4fb0f716bf10b1435b3321de71baa09f22be07e618
Shader 'TonemapShaderRD' SHA256: acb166da64acd1fb5779aa07e138f1dc1bf18b23149a44e7de22530b0cab7bca
Shader 'VrsShaderRD' SHA256: b4c65272641e8295d9adc74feea3f0a3c28bf23633baacfb82513455dd918528
Shader 'LuminanceReduceRasterShaderRD' SHA256: 4232742a65afd9a9abb4b67aeec2ae15a6630015327d84f94c53ad4e06b96367
Shader 'SortShaderRD' SHA256: be96e5b562e2ae475d9cb09911d6521bbece783b980c743b15f90cc50eb81c8d
Shader 'BlitShaderRD' SHA256: 0628016b1d385dde0e65177688efbf6c7c0235fd9a936c23fd0ca5e52537a5fb
CoreAudio: detected 2 channels
CoreAudio: audio buffer frames: 512 calculated latency: 11ms

TextServer: Primary interface set to: "ICU / HarfBuzz / Graphite (Built-in)".
CameraServer: Registered camera FaceTime HD Camera with ID 1 and position 0 at index 0
CORE API HASH: 516139552
EDITOR API HASH: 2976406307
Class 'DisplayServerMacOS' is not exposed, skipping.
Class 'EditorPropertyNameProcessor' is not exposed, skipping.
Class 'FramebufferCacheRD' is not exposed, skipping.
Class 'GDScriptEditorTranslationParserPlugin' is not exposed, skipping.
Class 'GDScriptNativeClass' is not exposed, skipping.
Class 'GodotPhysicsDirectSpaceState2D' is not exposed, skipping.
Class 'GodotPhysicsDirectSpaceState3D' is not exposed, skipping.
Class 'GodotPhysicsServer2D' is not exposed, skipping.
Class 'GodotPhysicsServer3D' is not exposed, skipping.
Class 'IPUnix' is not exposed, skipping.
Class 'MovieWriterMJPEG' is not exposed, skipping.
Class 'MovieWriterPNGWAV' is not exposed, skipping.
Class 'ResourceImporterMP3' is not exposed, skipping.
Class 'ResourceImporterOggVorbis' is not exposed, skipping.
Class 'SceneCacheInterface' is not exposed, skipping.
Class 'SceneRPCInterface' is not exposed, skipping.
Class 'SceneReplicationInterface' is not exposed, skipping.
Class 'UniformSetCacheRD' is not exposed, skipping.
EditorSettings: Load OK!

I also tested it on a MacBook Pro (16-inch, Mid 2019) with the same OS version and the same Vulkan API version, and it worked properly.

Calinou commented 1 year ago

@kdada This is an unrelated issue. Also, your project is using the Forward Mobile backend, which is more compatible with lower-end desktop GPUs (including old Macs and MoltenVK in general). On the other hand, the project manager always uses the Forward Plus backend, even on GPUs that don't support it well.

The project manager should probably always use the Forward Mobile backend (or Compatibility on GPUs that don't support Vulkan), as there is no point in using the Forward Plus backend for it to my knowledge.

SonnyBonds commented 1 year ago

Ok so I did a bit of a deep dive on this one, and basically yes, the Metal compiler crashes, and "[MTLCompiler] Error: undefined reference to air_simd_is_helper_thread()" is the closest to a detailed error there is. Unfortunately this is in internal proprietary code and not much to work with, so that's a bit of a dead end.

However, with a bit of testing I figured out that what actually triggers this error is the usage of gl_HelperInvocation in cluster_render.glsl:

https://github.com/godotengine/godot/blob/c660cc4adc7032c11f74dfce5fcb7b5a02f6d097/servers/rendering/renderer_rd/shaders/cluster_render.glsl#L164

In my case it's taking the non-USE_SUBGROUPS path, and removing those two conditionals (i.e. running the atomicOr always) fixes the compiler failure and Godot starts fast.

I'm not sure making a pull request of that makes sense because those conditionals are obviously there for a reason, and I'm not well versed enough in the shaders to know what the best alternative is. (As I understand it they're there to avoid doing unnecessary calculations when the shader is run in a helper mode that doesn't actually output anything, so removing it likely doesn't cause any errors but may have performance effects. If I read the docs correctly atomicOr doesn't actually do anything when in a helper invocation though so possibly it's just basically a free NOP anyway, but someone who actually knows what they're talking about should chime in.)
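To make the reasoning above concrete, here is a minimal CPU-side model in Python, not shader code. It assumes, as the paragraph above reads the docs, that an atomic performed by a helper invocation has no effect on memory. The names `Invocation`, `atomic_or`, and `run` are hypothetical and exist only for this illustration; nothing here is taken from cluster_render.glsl.

```python
from dataclasses import dataclass


@dataclass
class Invocation:
    bit: int         # bit this fragment would set in the cluster mask
    is_helper: bool  # stand-in for gl_HelperInvocation


def atomic_or(memory, index, value, is_helper):
    """Model of atomicOr under the assumption that helper invocations
    leave memory untouched."""
    if not is_helper:
        memory[index] |= value


def run(invocations, guarded):
    """Accumulate the mask, with or without the explicit helper guard."""
    memory = [0]
    for inv in invocations:
        if guarded and inv.is_helper:
            continue  # models the `if (!gl_HelperInvocation)` branch
        atomic_or(memory, 0, 1 << inv.bit, inv.is_helper)
    return memory[0]


invocations = [
    Invocation(bit=0, is_helper=False),
    Invocation(bit=3, is_helper=True),  # helper: must not set bit 3
    Invocation(bit=5, is_helper=False),
]

with_guard = run(invocations, guarded=True)
without_guard = run(invocations, guarded=False)
print(with_guard == without_guard)  # prints True
```

In this model the guard only saves the call itself and never changes the resulting mask. Whether the unguarded call is also free on real hardware is a separate question that the model cannot answer.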

As an extra reference, this specific code is actually discussed in a separate but related discussion about subgroups + helper invocations in a Khronos GLSL issue, mentioning @reduz : https://github.com/KhronosGroup/GLSL/issues/35

I'd be happy to test alternative fixes on this computer where it normally fails, just let me know.

clayjohn commented 1 year ago

@SonnyBonds I think removing the if (!gl_HelperInvocation) branches in the non-USE_SUBGROUPS paths should be okay. In those cases there is just a wasteful atomicOr, as the Vulkan spec guarantees that atomics by helper invocations will not have an effect on actual memory.

It would be helpful if you could test a version with the if (!gl_HelperInvocation) branches removed on a moderately complex 3D scene (i.e. lots of lights and geometry). Errors should be pretty visible

SonnyBonds commented 1 year ago

In those cases there is just a wasteful atomicOr as the Vulkan spec guarantees that atomics by helper invocations will not have an effect on actual memory

Yeah, exactly. It doesn't really specify if they're also "free" though, but I'll try to see if I can do some performance tests. Are there any useful benchmarking scenes readily available somewhere?

I could probably also wrap the conditional so it's only skipped when using MoltenVK. I'll have a look and make a PR when I have some results.

clayjohn commented 1 year ago

Yeah, exactly. It doesn't really specify if they're also "free" though, but I'll try to see if I can do some performance tests. Are there any useful benchmarking scenes readily available somewhere?

You can take a look at the 4.0-dev branch of the demo projects; some have been updated for 4.0. I think the antialiasing demo should work if you add a bunch of lights to it: https://github.com/godotengine/godot-demo-projects/tree/4.0-dev. For stuff like this I usually just use Sponza from here https://casual-effects.com/data/ and then add some smaller meshes and lights manually.

I could probably also wrap the conditional so it's only skipped when using MoltenVK. I'll have a look and make a PR when I have some results.

That would be fine as well!

Calinou commented 1 year ago

Are there any useful benchmarking scenes readily available somewhere?

You can use https://github.com/Calinou/godot-reflection for this purpose :slightly_smiling_face:

I've tested it and can confirm it works on 4.0.beta3.

filipworksdev commented 1 year ago

I think there needs to be a new issue, since a lot of people seem to be reporting graphics-driver-related slowdowns in other issues. I think the problem comes from the Vulkan implementation in Godot, probably because something isn't being properly initialized and the application waits for it?

Calinou commented 1 year ago

I think there needs to be a new issue, since a lot of people seem to be reporting graphics-driver-related slowdowns in other issues. I think the problem comes from the Vulkan implementation in Godot, probably because something isn't being properly initialized and the application waits for it?

This is already being tracked in https://github.com/godotengine/godot/issues/43351. In general, Vulkan is expected to be slower to initialize compared to OpenGL – it's a more complex rendering backend.

ghashy commented 1 year ago

Hello, and apologies if this is the wrong place to post. I have two videos: the first shows the errors when launching Godot 4 beta 4, along with the launch time itself; the second shows the interface lag that appeared in the new beta version.

https://user-images.githubusercontent.com/109857267/200064400-16474bff-eae1-4de5-843b-54e643e48782.mov

https://user-images.githubusercontent.com/109857267/200065315-34411a8c-0557-49d4-bb9a-3cadf7809d80.mp4

Calinou commented 1 year ago

@Ghashy See https://github.com/godotengine/godot/issues/68269. There appears to be a performance regression between 4.0.beta3 and 4.0.beta4. If you can compile the engine from source, you could look into bisecting the regression to greatly speed up troubleshooting.
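For anyone unfamiliar with bisecting, the idea is a binary search over the commit history between a known-good and a known-bad build, which is what `git bisect` automates. The Python sketch below only models the search itself. The commit names and the `is_bad` check are hypothetical stand-ins for "build this commit and see whether the editor is slow", and it assumes the regression stays present in every commit after it first appears.

```python
def bisect_first_bad(commits, is_bad):
    """Return (first bad commit, number of builds tested).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1  # one build-and-test cycle per iteration
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tested


# Hypothetical history of 200 commits where the regression lands in "c137".
commits = [f"c{i:03d}" for i in range(200)]
first_bad, tested = bisect_first_bad(commits, lambda c: c >= "c137")
print(first_bad, tested)
```

With 200 commits in the range, the search needs only about eight builds instead of testing each commit in turn, which is why bisecting speeds up troubleshooting so much.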

AlyssaDaemon commented 1 year ago

Not sure if it helps but:

M1 Mac Mini here (Monterey v. 12.6.1). I was unable to reproduce this on a new Forward+ project, even on a cold start. Mine only takes a few seconds to load at most (both with and without Mono), regardless of which beta (tested 1-4), or with 3.5.1 or 3.5 (using a GLES3.0 renderer).

All of them were sub 5 seconds. Not sure if it'll help with debugging at all or not.

My molten-vk install info in case that helps:

➜  ~ brew info molten-vk
==> molten-vk: stable 1.2.0 (bottled), HEAD
Implementation of the Vulkan graphics and compute API on top of Metal
https://github.com/KhronosGroup/MoltenVK
/opt/homebrew/Cellar/molten-vk/1.2.0 (134 files, 83.1MB) *
  Poured from bottle on 2022-10-21 at 02:37:59
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/molten-vk.rb

system_profiler data:

Graphics/Displays:

    Apple M1:

      Chipset Model: Apple M1
      Type: GPU
      Bus: Built-In
      Total Number of Cores: 8
      Vendor: Apple (0x106b)
      Metal Family: Supported, Metal GPUFamily Apple 7
      Displays:
        LG HDR 4K:
          Resolution: 3840 x 2160 (2160p/4K UHD 1 - Ultra High Definition)
          UI Looks like: 3840 x 2160 @ 60.00Hz
          Main Display: Yes
          Mirror: Off
          Online: Yes
          Rotation: Supported
          Automatically Adjust Brightness: No