ec- / Quake3e

Improved Quake III Arena engine
GNU General Public License v2.0

Low Vulkan performance on M1 pro #127

Open ph1lm opened 2 years ago

ph1lm commented 2 years ago

MacBook Pro (14-inch, 2021) macOS Monterey 12.0.1

Dependencies

molten-vk: stable 1.1.5
sdl2: stable 2.0.16

Both installed using brew.

I tweaked the Makefile so that it points at the SDL2 installed by brew:

sdl2-config --cflags                                     
-I/opt/homebrew/include/SDL2 -D_THREAD_SAFE

sdl2-config --libs  
-L/opt/homebrew/lib -lSDL2

git diff        
diff --git a/Makefile b/Makefile
index 7b874c26..4396f8f3 100644
--- a/Makefile
+++ b/Makefile
@@ -417,8 +417,8 @@ ifeq ($(COMPILE_PLATFORM),darwin)
     BASE_CFLAGS += $(SDL_INCLUDE)
     CLIENT_LDFLAGS = $(SDL_LIBS)
   else
-    BASE_CFLAGS += -I/Library/Frameworks/SDL2.framework/Headers
-    CLIENT_LDFLAGS = -F/Library/Frameworks -framework SDL2
+    BASE_CFLAGS += -I/opt/homebrew/include/SDL2 -D_THREAD_SAFE
+    CLIENT_LDFLAGS = -F/Library/Frameworks -L/opt/homebrew/lib -lSDL2
   endif

   DEBUG_CFLAGS = $(BASE_CFLAGS) -DDEBUG -D_DEBUG -g -O0

Builds

OpenGL:

make clean release BUILD_SERVER=0 USE_RENDERER_DLOPEN=0 USE_VULKAN=0 RENDERER_DEFAULT=opengl

Vulkan:

make clean release BUILD_SERVER=0 USE_RENDERER_DLOPEN=0 USE_OPENGL=0 RENDERER_DEFAULT=vulkan

Q3 Config

I used cvar_restart to reset everything to defaults. Key commands:

r_vbo 0  // left it as default
r_fbo 0
r_swapInterval 0
com_maxfps 125
r_displayrefresh 0
com_hunkMegs 512 // increased it
com_soundmegs 128 // increased it
com_zonemegs 128 // increased it

Performance Test

timedemo 1
demo four

Results

OpenGL: 1100-1200 fps
Vulkan: 400-420 fps

I would appreciate any help finding which config parameter makes the Vulkan renderer this slow. As far as I understand, it is supposed to be faster than OpenGL.
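Because fps figures compress differences at the high end, it can help to restate the results above as average time per frame. The following is only a minimal arithmetic sketch; the function names are hypothetical and are not part of Quake3e.

```c
#include <assert.h>

/* Convert a timedemo frames-per-second figure to average milliseconds per frame. */
double fps_to_ms(double fps)
{
    assert(fps > 0.0); /* a zero or negative fps figure is meaningless here */
    return 1000.0 / fps;
}

/* Extra milliseconds spent per frame by the slower renderer. */
double extra_ms_per_frame(double fast_fps, double slow_fps)
{
    return fps_to_ms(slow_fps) - fps_to_ms(fast_fps);
}
```

With the reported numbers, `fps_to_ms(400.0)` is 2.5 ms and `fps_to_ms(1200.0)` is roughly 0.83 ms, so the Vulkan build spends on the order of 1.5 to 1.7 ms more per frame than the OpenGL build.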

ec- commented 2 years ago

Set \r_vbo 1; it may help a bit.

I can only guess that the engine or MoltenVK uses some suboptimal buffer or texture format, which could cause such a big performance drop, or that it is a presentation (swapchain) issue. Try some heavy maps like dfwc2019-6.

ph1lm commented 2 years ago

I increased the minimum swapchain image count from 2 to 3, as suggested here: https://github.com/KhronosGroup/MoltenVK/blob/master/Docs/MoltenVK_Runtime_UserGuide.md#swapchains and here: https://github.com/KhronosGroup/MoltenVK/issues/742

diff --git a/code/renderervk/vk.c b/code/renderervk/vk.c
index 0fa9fbf4..8769982f 100644
--- a/code/renderervk/vk.c
+++ b/code/renderervk/vk.c
@@ -472,8 +472,8 @@ static void vk_create_swapchain( VkPhysicalDevice physical_device, VkDevice devi
            present_mode = VK_PRESENT_MODE_FIFO_KHR;
            image_count = MAX(MIN_SWAPCHAIN_IMAGES_FIFO, surface_caps.minImageCount);
        }
-       if ( image_count < 2 ) {
-           image_count = 2;
+       if ( image_count < 3 ) {
+           image_count = 3;
        }
    }

I also enabled \r_vbo 1.

Below is what was logged during startup:

----- R_Init -----
SDL using driver "cocoa"
Initializing Vulkan display
...setting mode -2: 1512 982
Using 24 color bits, 24 depth, 8 stencil display.
[mvk-info] MoltenVK version 1.1.5, supporting Vulkan version 1.1.189.
    The following 72 Vulkan extensions are supported:
        VK_KHR_16bit_storage v1
        VK_KHR_8bit_storage v1
        VK_KHR_bind_memory2 v1
        VK_KHR_create_renderpass2 v1
        VK_KHR_dedicated_allocation v3
        VK_KHR_depth_stencil_resolve v1
        VK_KHR_descriptor_update_template v1
        VK_KHR_device_group v4
        VK_KHR_device_group_creation v1
        VK_KHR_driver_properties v1
        VK_KHR_external_fence v1
        VK_KHR_external_fence_capabilities v1
        VK_KHR_external_memory v1
        VK_KHR_external_memory_capabilities v1
        VK_KHR_external_semaphore v1
        VK_KHR_external_semaphore_capabilities v1
        VK_KHR_get_memory_requirements2 v1
        VK_KHR_get_physical_device_properties2 v2
        VK_KHR_get_surface_capabilities2 v1
        VK_KHR_imageless_framebuffer v1
        VK_KHR_image_format_list v1
        VK_KHR_maintenance1 v2
        VK_KHR_maintenance2 v1
        VK_KHR_maintenance3 v1
        VK_KHR_multiview v1
        VK_KHR_portability_subset v1
        VK_KHR_push_descriptor v2
        VK_KHR_relaxed_block_layout v1
        VK_KHR_sampler_mirror_clamp_to_edge v3
        VK_KHR_sampler_ycbcr_conversion v14
        VK_KHR_shader_draw_parameters v1
        VK_KHR_shader_float16_int8 v1
        VK_KHR_shader_subgroup_extended_types v1
        VK_KHR_storage_buffer_storage_class v1
        VK_KHR_surface v25
        VK_KHR_swapchain v70
        VK_KHR_swapchain_mutable_format v1
        VK_KHR_timeline_semaphore v2
        VK_KHR_uniform_buffer_standard_layout v1
        VK_KHR_variable_pointers v1
        VK_EXT_debug_marker v4
        VK_EXT_debug_report v10
        VK_EXT_debug_utils v2
        VK_EXT_descriptor_indexing v2
        VK_EXT_fragment_shader_interlock v1
        VK_EXT_hdr_metadata v2
        VK_EXT_host_query_reset v1
        VK_EXT_image_robustness v1
        VK_EXT_inline_uniform_block v1
        VK_EXT_memory_budget v1
        VK_EXT_metal_surface v1
        VK_EXT_post_depth_coverage v1
        VK_EXT_private_data v1
        VK_EXT_robustness2 v1
        VK_EXT_scalar_block_layout v1
        VK_EXT_shader_stencil_export v1
        VK_EXT_shader_viewport_index_layer v1
        VK_EXT_subgroup_size_control v2
        VK_EXT_swapchain_colorspace v4
        VK_EXT_texel_buffer_alignment v1
        VK_EXT_texture_compression_astc_hdr v1
        VK_EXT_vertex_attribute_divisor v3
        VK_AMD_gpu_shader_half_float v2
        VK_AMD_negative_viewport_height v1
        VK_AMD_shader_image_load_store_lod v1
        VK_AMD_shader_trinary_minmax v1
        VK_IMG_format_pvrtc v1
        VK_INTEL_shader_integer_functions2 v1
        VK_GOOGLE_display_timing v1
        VK_MVK_macos_surface v3
        VK_MVK_moltenvk v32
        VK_NV_glsl_shader v1
[mvk-info] GPU device:
        model: Apple M1 Pro
        type: Integrated
        vendorID: 0x106b
        deviceID: 0xa140
        pipelineCacheUUID: 00002779-0400-03EF-0000-000000000000
    supports the following Metal Versions, GPU's and Feature Sets:
        Metal Shading Language 2.3
        GPU Family Apple 7
        GPU Family Apple 6
        GPU Family Apple 5
        GPU Family Apple 4
        GPU Family Apple 3
        GPU Family Apple 2
        GPU Family Apple 1
        GPU Family Mac 2
        GPU Family Mac 1
        GPU Family Common 3
        GPU Family Common 2
        GPU Family Common 1
        macOS GPU Family 2 v1
        macOS GPU Family 1 v4
        macOS GPU Family 1 v3
        macOS GPU Family 1 v2
        macOS GPU Family 1 v1
[mvk-info] Created VkInstance for Vulkan version 1.0.0, as requested by app, with the following 4 Vulkan extensions enabled:
        VK_KHR_surface v25
        VK_EXT_debug_utils v2
        VK_EXT_metal_surface v1
        VK_MVK_macos_surface v3
.......................
Available physical devices:
 0: Integrated Apple M1 Pro, 0xa140
.......................
...selected physical device: 0
[mvk-info] Using MTLEvent for Vulkan semaphores.
[mvk-info] Created VkDevice to run on GPU Apple M1 Pro with the following 4 Vulkan extensions enabled:
        VK_KHR_dedicated_allocation v3
        VK_KHR_get_memory_requirements2 v1
        VK_KHR_swapchain v70
        VK_EXT_debug_marker v4
...presentation modes: FIFO IMMEDIATE
...selected presentation mode: IMMEDIATE, image count: 3
[mvk-info] Created 3 swapchain images with initial size (1512, 982).                    <---- !!! 3 chains now

VK_VENDOR: Apple Inc.
VK_RENDERER: Integrated Apple M1 Pro, 0xa140
VK_VERSION: API: 1.1.189, Driver: 0.2.1913

VK_MAX_TEXTURE_SIZE: 2048
VK_MAX_TEXTURE_UNITS: 8

PIXELFORMAT: color(24-bits) Z(24-bit) stencil(8-bits)
 presentation: VK_FORMAT_B8G8R8A8_UNORM
 capture: VK_FORMAT_R8G8B8A8_UNORM
 depth: VK_FORMAT_D32_SFLOAT_S8_UINT
MODE: -2, 1512 x 982 fullscreen hz:120
GAMMA: hardware w/ 0 overbright bits
texturemode: GL_LINEAR_MIPMAP_NEAREST
texture bits: 32
picmip: 1
[mvk-error] VK_ERROR_FORMAT_NOT_SUPPORTED: VkPolygonMode value VK_POLYGON_MODE_POINT is not supported for render pipelines.
Initializing Shaders
----- finished R_Init -----

I'm still getting 400-420 fps in demo four.

Will try to reach out to the molten-vk devs...
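The diff above raises the image-count floor unconditionally. A more defensive variant would also respect the surface limits, since Vulkan reports a `minImageCount` and a `maxImageCount` for each surface, and a `maxImageCount` of 0 means there is no upper limit. The sketch below shows that clamping in isolation. It is not the code in `vk.c`; `surface_limits_t` and `clamp_image_count` are hypothetical names standing in for the two relevant `VkSurfaceCapabilitiesKHR` fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two VkSurfaceCapabilitiesKHR fields used here. */
typedef struct {
    uint32_t minImageCount;
    uint32_t maxImageCount; /* 0 means the surface imposes no upper limit */
} surface_limits_t;

/* Return the requested image count clamped to what the surface allows. */
uint32_t clamp_image_count(uint32_t wanted, const surface_limits_t *caps)
{
    uint32_t count = wanted;

    assert(caps != NULL);

    /* Never go below the minimum the surface requires. */
    if (count < caps->minImageCount)
        count = caps->minImageCount;

    /* Only apply the upper bound when the surface actually reports one. */
    if (caps->maxImageCount != 0 && count > caps->maxImageCount)
        count = caps->maxImageCount;

    return count;
}
```

Requesting 3 images from a surface that reports a minimum of 2 and no maximum yields 3, which matches the "image count: 3" line in the log above. A surface capped at 2 images would still yield 2.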

ph1lm commented 2 years ago

I also tested it on a MacBook Pro 2018 (Core i7, Radeon Pro 555X 4 GB) running macOS Big Sur (11.6.1): OpenGL and Vulkan reach 800 fps and 700 fps respectively in demo four there.

ph1lm commented 2 years ago

https://github.com/KhronosGroup/MoltenVK/issues/1471

ec- commented 2 years ago

Try moltenvk-1.1.6

ph1lm commented 2 years ago

@ec- same results

brew info molten-vk    
molten-vk: stable 1.1.6 (bottled)

export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/molten-vk/1.1.6/lib
export MVK_CONFIG_DISPLAY_WATERMARK=1

./quake3e --args +set com_hunkMegs 512 +seta com_soundmegs 128 +seta com_zonemegs 128

timedemo 1;demo four

400fps vulkan vs 1400fps opengl

rcaridade145 commented 2 years ago

...presentation modes: FIFO IMMEDIATE

You may have better performance with MAILBOX

https://github.com/jpd002/Play-/commit/47af30f7bb72a6bcdab1c2b393dcc1e7de6ec58c#diff-ca1fa89ffbb29849f0a4e198cb9a2e3d59cbc57a0c0a57c634036db04c5eacb5

https://github.com/ec-/Quake3e/blob/master/code/renderervk/vk.c#L453

r_swapInterval 2

Does it help?
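Note that the startup log quoted at the top of this comment lists only FIFO and IMMEDIATE, so a MAILBOX preference would need a fallback on this setup. A minimal sketch of such a selection is shown below. It assumes a preference order of MAILBOX, then IMMEDIATE, then FIFO when an uncapped frame rate is wanted. The enum and function names are hypothetical, real code would use `VkPresentModeKHR` values, and this is not necessarily how `vk.c` orders its choices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum for illustration; real code would use VkPresentModeKHR. */
typedef enum { MODE_FIFO, MODE_IMMEDIATE, MODE_MAILBOX } present_mode_t;

/* Return 1 if mode m appears in the list reported for the surface. */
int mode_supported(const present_mode_t *modes, size_t count, present_mode_t m)
{
    size_t i;

    assert(modes != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (modes[i] == m)
            return 1;
    }
    return 0;
}

/* Pick a presentation mode, falling back to FIFO, which Vulkan requires
   every implementation to support. */
present_mode_t pick_present_mode(const present_mode_t *modes, size_t count, int uncapped)
{
    if (uncapped) {
        if (mode_supported(modes, count, MODE_MAILBOX))
            return MODE_MAILBOX;
        if (mode_supported(modes, count, MODE_IMMEDIATE))
            return MODE_IMMEDIATE;
    }
    return MODE_FIFO;
}
```

With the two modes reported in the log, an uncapped request resolves to IMMEDIATE, which is what the engine already selected.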

ph1lm commented 2 years ago

@rcaridade145 it's even worse - ~250 fps with vulkan

kuncevic commented 2 years ago

Sometimes I have had to restart my M1 machine to fix the performance problem.

ensiform commented 2 years ago

Has anyone reported the performance issues to upstream MoltenVK or Vulkan in these instances? Performance issues that require restarting the machine sound more serious than something in Q3e's code itself.

kuncevic commented 2 years ago

I just noticed, incidentally, that the lag problem actually occurs when I have more than two Bluetooth devices connected to my MacBook. With only a Bluetooth mouse connected, Quake3e plays fine, but the lag appears once more Bluetooth devices are connected, for example headphones. Once they are disconnected, everything is back to normal. So I believe it is not a Vulkan issue in my case 🤔