Novum / vkQuake

Vulkan Quake port based on QuakeSpasm
GNU General Public License v2.0
1.79k stars 219 forks source link

Sigfault when loading The Immortal Lock #704

Closed Perlence closed 1 month ago

Perlence commented 1 month ago

Describe the bug

vkQuake v1.31.0 sigfaults on macOS when starting immortal.bsp from The Immortal Lock.

To reproduce

❯ build/vkquake -basedir ~/Games/Quake -heapsize 6291456 -game immortal +map immortal
...
vkQuake 1.31.0 Server (35659 CRC)

The Immortal Lock
Using protocol fte999
Warning: 2928 models exceeds QS limit of 2048 (max = 4096).
Allocating lightmap compute surface data (109498 KB)
Allocating indirect draw data (104 KB, 5336 draws)
Allocating indirect IBs (45512 KB)
Allocating visibility buffers (425 KB)
fish: Job 1, 'build/vkquake -basedir ~/Games/…' terminated by signal SIGSEGV (Address boundary error)

Full log

Expected behavior

The map loads 😄

Desktop

Additional context

I've also tried it on 38a9771 with the same result.

alexey-lysiuk commented 1 month ago

It's really hard to figure out anything from a stacktrace of vkQuake built with Release configuration of MoltenVK.

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x4)
  * frame #0: 0x00000001002585e2 vkquake`std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned int, unsigned int>, void*>*>, bool> std::__1::__hash_table<std::__1::__hash_value_type<unsigned int, unsigned int>, std::__1::__unordered_map_hasher<unsigned int, std::__1::__hash_value_type<unsigned int, unsigned int>, std::__1::hash<unsigned int>, std::__1::equal_to<unsigned int>, true>, std::__1::__unordered_map_equal<unsigned int, std::__1::__hash_value_type<unsigned int, unsigned int>, std::__1::equal_to<unsigned int>, std::__1::hash<unsigned int>, true>, std::__1::allocator<std::__1::__hash_value_type<unsigned int, unsigned int>>>::__emplace_unique_key_args<unsigned int, std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned int const&>, std::__1::tuple<>>(unsigned int const&, std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned int const&>&&, std::__1::tuple<>&&) + 178
    frame #1: 0x0000000100256b5c vkquake`void MVKDescriptorSet::write<VkWriteDescriptorSet>(VkWriteDescriptorSet const*, unsigned long, void const*) + 76
    frame #2: 0x00000001002568aa vkquake`mvkUpdateDescriptorSets(unsigned int, VkWriteDescriptorSet const*, unsigned int, VkCopyDescriptorSet const*) + 74
    frame #3: 0x00000001002689ea vkquake`vkUpdateDescriptorSets + 58
    frame #4: 0x00000001001d8feb vkquake`GL_UpdateLightmapDescriptorSets + 235
    frame #5: 0x0000000100166b96 vkquake`R_NewMap + 294
    frame #6: 0x00000001001394d1 vkquake`CL_ParseServerMessage + 11761
    frame #7: 0x0000000100135afa vkquake`CL_ReadFromServer + 138
    frame #8: 0x00000001001843c9 vkquake`_Host_Frame + 1705
    frame #9: 0x00000001001afb54 vkquake`main + 372
    frame #10: 0x00007ff81a74c366 dyld`start + 1942

Unless someone wants to build vkQuake with Debug configuration MoltenVK in order to produce more useful stacktrace, I would report this to MoltenVK developers as is.

vsonnier commented 1 month ago

@alexey-lysiuk Thank you very much to step up to this MacOs issue (probably) Indeed it is working for me but I'm on Windows with no access to Mac, so thanks for your help.

Novum commented 1 month ago

I have debugged this before and I'm pretty sure I came to the conclusion that this is a bug in MoltenVK

rtwomey commented 1 month ago

Unless someone wants to build vkQuake with Debug configuration MoltenVK in order to produce more useful stacktrace, I would report this to MoltenVK developers as is.

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xf90013e3f81e889a)
  * frame #0: 0x0000000102da7720 libMoltenVK.dylib`mvkUpdateDescriptorSets(writeCount=8, pDescriptorWrites=0x000000016fd9d2b0, copyCount=0, pDescriptorCopies=0x0000000000000000) at MVKDescriptorSet.mm:1100:66
    frame #1: 0x0000000102dd8d84 libMoltenVK.dylib`vkUpdateDescriptorSets(device=0x000000014f816818, writeCount=8, pDescriptorWrites=0x000000016fd9d2b0, copyCount=0, pDescriptorCopies=0x0000000000000000) at vulkan.mm:1257:2
    frame #2: 0x00000001000ccd80 vkquake`GL_UpdateLightmapDescriptorSets at r_brush.c:1889:3 [opt]
    frame #3: 0x000000010003ba44 vkquake`R_NewMap at gl_rmisc.c:3866:2 [opt]
    frame #4: 0x0000000100010ae0 vkquake`CL_ParseServerMessage [inlined] CL_ParseServerInfo at cl_parse.c:1040:2 [opt]
    frame #5: 0x0000000100010990 vkquake`CL_ParseServerMessage at cl_parse.c:1840:4 [opt]
    frame #6: 0x000000010000d870 vkquake`CL_ReadFromServer at cl_main.c:950:3 [opt]
    frame #7: 0x00000001000552a0 vkquake`_Host_Frame(time=<unavailable>) at host.c:957:3 [opt]
    frame #8: 0x00000001000554c4 vkquake`Host_Frame(time=<unavailable>) at host.c:1003:3 [opt] [artificial]
    frame #9: 0x00000001000797d8 vkquake`main(argc=<unavailable>, argv=<unavailable>) at main_sdl.c:122:4 [opt]
    frame #10: 0x000000018ad920e0 dyld`start + 2360

In case this helps. Latest vkQuake 564c14d and MoltenVK edbdcf05, compiled with debug enabled.

alexey-lysiuk commented 1 month ago

Report this to MoltenVK. Apparently, address=0xf90013e3f81e889a at crash location cannot be valid memory address in user space.

rtwomey commented 1 month ago

Issue opened at MoltenVK.

vsonnier commented 1 month ago

(Disclamer : I know next to nothing to Vulkan, so my suggestion may be stupid.)

As far as I understand the Issue opened at MoltenVK investigation, we may exhaust a pool of descriptors VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE.

Indeed instead of : https://github.com/Novum/vkQuake/blob/d5aae5a68daa779da768ab21b17796dfb6300429/Quake/gl_rmisc.c#L1506

shouldn't we have instead:

pool_sizes[7].descriptorCount = 32 + (1 + MAXLIGHTMAPS * 3 / 4) * MAX_SANITY_LIGHTMAPS;

The reason being, each lighmap (max MAX_SANITY_LIGHTMAPS) has a layout like this: gl_rmisc.c(1361)

lightmap_compute_layout_bindings[1].descriptorCount = 1;
lightmap_compute_layout_bindings[1].descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
lightmap_compute_layout_bindings[1].stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
lightmap_compute_layout_bindings[2].binding = num_descriptors++;
lightmap_compute_layout_bindings[2].descriptorCount = MAXLIGHTMAPS * 3 / 4;
lightmap_compute_layout_bindings[2].descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
lightmap_compute_layout_bindings[2].stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;

If we compare to the similar VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC pattern usage in lightmap, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC pool allocation looks more logical.

vsonnier commented 1 month ago

@alexey-lysiuk @MrGcGamer Could you please try the suggested change above, to see if it still carshes for you ?

Perlence commented 1 month ago

@vsonnier Thanks, the map loads for me with the change you suggested and plays surprisingly well so far.

diff --git a/Quake/gl_rmisc.c b/Quake/gl_rmisc.c
index 1bc0ee2..f9f6d7a 100644
--- a/Quake/gl_rmisc.c
+++ b/Quake/gl_rmisc.c
@@ -1503,7 +1503,7 @@ void R_CreateDescriptorPool ()
        pool_sizes[6].type = VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT;
        pool_sizes[6].descriptorCount = 32;
        pool_sizes[7].type = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
-       pool_sizes[7].descriptorCount = 32;
+       pool_sizes[7].descriptorCount = 32 + (1 + MAXLIGHTMAPS * 3 / 4) * MAX_SANITY_LIGHTMAPS;
        int num_sizes = 8;
        if (vulkan_globals.ray_query)
        {
vsonnier commented 1 month ago

Thanks @Perlence for your feedback. Looks like we really had a bug in vkQuake, that goes unoticed until now...

Perlence commented 1 month ago

Would it be a good idea to make a test build with some sort of strict mode enabled in VK and run it to potentially uncover more hidden issues like this?

MrGcGamer commented 1 month ago

Screenshot 2024-07-28 at 19 23 33 No crash, but still heavy vertex explosions for me

vsonnier commented 1 month ago

No crash, but still heavy vertex explosions for me

Is it with MoltenVK including https://github.com/KhronosGroup/MoltenVK/pull/2291 or the "current" MoltenVK ?

Perlence commented 1 month ago

No crash, but still heavy vertex explosions for me

Hmm, works fine for me:

Screenshot 2024-07-28 at 21 25 59

MrGcGamer commented 1 month ago

just retested with this build https://github.com/KhronosGroup/MoltenVK/actions/runs/10131781145

still getting vertex explosions from interactive things across the map

(replacing the build under /usr/local/Cellar/molten-vk/1.2.10/lib/ should suffice right?)

log.txt

rtwomey commented 1 month ago

still getting vertex explosions from interactive things across the map

By "vertex explosions" do you mean artifacts appearing all over the screen when there is e.g., a health chest somewhere in the scene? If so, I see this as well when I have basedir set to rerelease and use the "remastered" models in graphics options with The Immortal Lock. If I set this back to "classic" (or just use the original game content) it seems to be fine.

image

MrGcGamer commented 1 month ago

yeah, works fine on classic for me as well.

vsonnier commented 1 month ago

By "vertex explosions" do you mean artifacts appearing all over the screen when there is e.g., a health chest somewhere in the scene? If so, I see this as well when I have basedir set to rerelease and use the "remastered" models in graphics options with The Immortal Lock. If I set this back to "classic" (or just use the original game content) it seems to be fine.

Does it work with remastered models if you either :

MrGcGamer commented 1 month ago

By "vertex explosions" do you mean artifacts appearing all over the screen when there is e.g., a health chest somewhere in the scene? If so, I see this as well when I have basedir set to rerelease and use the "remastered" models in graphics options with The Immortal Lock. If I set this back to "classic" (or just use the original game content) it seems to be fine.

Does it work with remastered models if you either :

* Revert commit [4ec5dd6](https://github.com/Novum/vkQuake/commit/4ec5dd684c31287f08f745334326bc8c177a8cba)

* use 1.30.1 release ?

reverting said commit does not fix the issue. Using the 1.30.1 instant crashes when trying to launch the mod with the error:

ERROR-OUT BEGIN

Shutting down SDL sound

QUAKE ERROR: mod_numknown == MAX_MOD_KNOWN

...I also noticed, that the vertex explosions are also present without the mod on 1.30.1 as well as 1.31.0. (See vid)

https://github.com/user-attachments/assets/bcdbfbab-e988-418b-aa02-ddbecfc28b1d

vsonnier commented 1 month ago

reverting said commit does not fix the issue.

Thanks @MrGcGamer, that is useful info : this was the last commit approaching MD5 models related things, otherwise nothing has changed since 1.30.1 on that matter.

On my side I have none of your problems runing the remaster, (Win10) but truth be told I bought Quake Remaster from GOG and only copied the necessary files for Dimension Of The Machine in my regular Quake dir like so:

id1\pak[0..1].pak (Original release)
mg1\pak0.pak
QuakeEX.kpf

and run it with the command : vkQuake.exe --protocol 999 -particules 32000 -game mg1

Now, if I separately install Quake Enhanced (a.k.a remaster) from GOG and copy all the necessary vkQuake executables in its directory, and run vkQuake.exe --protocol 999 -particules 32000 I don't have the problem either.

For reference, here is my configuration : id1_config.zip

Perlence commented 1 month ago

@MrGcGamer @vsonnier Would it make sense to create a new issue to track vertex explosions? The original issue, which was the game crashing when immortal.bsp was started, is pretty much fixed as far as I can tell. I think starting a new issue would make it easier for the maintainers to track.

vsonnier commented 1 month ago

@Perlence Yeah you are right, all the more that I just published 1.31.1 claiming the crash was solved.

Let's continue the adventure in #719.