The official Khronos desktop Vulkan loader documentation currently contains the following note about unknown_ext_chain.c (and, I believe, some other generated files) relying on tail-call optimization being enabled for the loader to behave correctly:
Platforms not listed will use a fallback C Code path that relies on tail-call optimization to work. No guarantees are made about the use of the fallback code paths.
I wanted to let you know about one possible oversight in the current code, and about a potential robustness improvement available with Clang.
First, some files (dev_ext_trampoline.c, phys_dev_ext.c) currently rely on #pragma GCC optimize(3) to ensure tail-call optimization, but this pragma isn't present in unknown_ext_chain.c. Since forcing -O3 is LunarG's currently preferred method for forcing tail-call optimization, it seems like the pragma should be applied to unknown_ext_chain.c as well.
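To illustrate what I mean, here is a minimal sketch of the pattern, not the actual loader source. All names in it (example_dispatch, example_object, example_forward_0, example_driver_fn) are hypothetical stand-ins I made up. Note that the pragma only raises the optimization level for the functions that follow, which lets GCC perform sibling-call optimization; as far as I know it does not guarantee that any particular call becomes a tail call.

```c
#include <assert.h>

/* Only GCC honors this pragma; other compilers may ignore it or warn. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize(3) /* request -O3 for the functions below */
#endif

/* Hypothetical signature of an extension entry point. */
typedef int (*example_ext_fn)(void *object, int value);

/* Hypothetical dispatch table, standing in for the loader's real one. */
struct example_dispatch {
    example_ext_fn ext[2];
};

/* Hypothetical dispatchable object: its first member points at the table. */
struct example_object {
    const struct example_dispatch *disp;
};

/* Stand-in for a driver function the forwarder should jump to. */
static int example_driver_fn(void *object, int value) {
    (void)object;
    return value * 2;
}

/* Forwarder: the call below is in tail position, so an optimizing build
 * can turn it into a jump instead of a call followed by a return. */
int example_forward_0(void *object, int value) {
    const struct example_dispatch *disp = ((struct example_object *)object)->disp;
    return disp->ext[0](object, value);
}

/* Small self-check that exercises the forwarder. */
int example_forward_selfcheck(void) {
    struct example_dispatch table = {{example_driver_fn, 0}};
    struct example_object obj = {&table};
    int result = example_forward_0(&obj, 21);
    assert(result == 42);
    return result;
}
```

The result is the same with or without the pragma here, since the arguments are fully known. The tail call only becomes essential when the forwarder cannot know the real argument list of the function it forwards to.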
Second, I wanted to make sure you were aware of the relatively new [[clang::musttail]] attribute, which can be applied to a return statement to force tail-call optimization at that specific call site regardless of the optimization level. If practical, I think this is probably the most "correct" way to enforce the loader's requirement that tail-call optimization be applied in specific places. It would also make the code easier to understand and maintain, because it directly marks exactly which parts of the loader require tail-call optimization in order to function correctly. I believe this would help avoid future oversights and situations where building parts of the loader relies on undocumented "tribal knowledge".
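A rough sketch of how such a call site could be marked is below. It uses the GNU spelling __attribute__((musttail)), guarded by __has_attribute so that compilers without the attribute fall back to a plain return. The macro EXAMPLE_MUSTTAIL and every type and function name are hypothetical and only for illustration; they are not taken from the loader.

```c
#include <assert.h>

/* Hypothetical helper macro: expands to the musttail attribute where the
 * compiler reports support for it, and to nothing otherwise. */
#if defined(__has_attribute)
#if __has_attribute(musttail)
#define EXAMPLE_MUSTTAIL __attribute__((musttail))
#endif
#endif
#ifndef EXAMPLE_MUSTTAIL
#define EXAMPLE_MUSTTAIL
#endif

/* Hypothetical signature of an extension entry point. */
typedef int (*example_mt_fn)(void *object, int a, int b);

/* Hypothetical dispatch table and dispatchable object. */
struct example_mt_dispatch {
    example_mt_fn ext[2];
};

struct example_mt_object {
    const struct example_mt_dispatch *disp;
};

/* Stand-in for the function the trampoline forwards to. */
static int example_mt_target(void *object, int a, int b) {
    (void)object;
    return a + b;
}

/* Trampoline: caller and callee share the same signature, which musttail
 * requires. Where the attribute is supported, compilation fails if the
 * call cannot be emitted as a tail call, even in an unoptimized build. */
int example_mt_trampoline_0(void *object, int a, int b) {
    const struct example_mt_dispatch *disp = ((struct example_mt_object *)object)->disp;
    EXAMPLE_MUSTTAIL return disp->ext[0](object, a, b);
}

/* Small self-check that exercises the trampoline. */
int example_mt_selfcheck(void) {
    struct example_mt_dispatch table = {{example_mt_target, 0}};
    struct example_mt_object obj = {&table};
    int result = example_mt_trampoline_0(&obj, 2, 3);
    assert(result == 5);
    return result;
}
```

The point of the attribute is that a missed tail call becomes a compile error at the exact line that needs it, rather than a silent behavior change that depends on build flags.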
I hope that some of this information can be helpful.