mamedev / mame

MAME
https://www.mamedev.org/
Other
8.38k stars 2.04k forks source link

`3rdparty/bgfx/src/renderer_vk.cpp: runtime error: index 2 out of bounds for type 'bgfx::vk::Layer[2]'` #10492

Open firewave opened 2 years ago

firewave commented 2 years ago

This happens just starting MAME without any additional options or sets specified on Kali Linux with a Clang 14 UBSAN build.

../../../../../3rdparty/bgfx/src/renderer_vk.cpp:352:8: runtime error: index 2 out of bounds for type 'bgfx::vk::Layer[2]'
    #0 0x7f2e10a6f842 in bgfx::vk::updateExtension(char const*, unsigned int, bool, bgfx::vk::Extension*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:352:35
    #1 0x7f2e10a7144b in bgfx::vk::dumpExtensions(VkPhysicalDevice_T*, bgfx::vk::Extension*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:684:23
    #2 0x7f2e10ab1b8d in bgfx::vk::RendererContextVK::init(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:1165:5
    #3 0x7f2e10a72e6e in bgfx::vk::rendererCreate(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4429:20
    #4 0x7f2e109c9fd8 in bgfx::rendererCreate(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2746:16
    #5 0x7f2e109ce1a6 in bgfx::Context::rendererExecCommands(bgfx::CommandBuffer&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2797:20
    #6 0x7f2e109c38fe in bgfx::Context::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2431:5
    #7 0x7f2e109c37e3 in bgfx::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:1475:38
    #8 0x7f2e109ea0a9 in bgfx::Context::renderThread(bx::Thread*, void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx_p.h:3012:35
    #9 0x7f2e10c1f370 in bx::Thread::entry() /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:325:20
    #10 0x7f2e10c1f224 in bx::ThreadInternal::threadFunc(void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:92:20
    #11 0x7f2dcc377849 in start_thread nptl/./nptl/pthread_create.c:442:8
    #12 0x7f2dcc3fa52f in __clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../3rdparty/bgfx/src/renderer_vk.cpp:352:8 in
../../../../../3rdparty/bgfx/src/renderer_vk.cpp:1203:35: runtime error: index 2 out of bounds for type 'bgfx::vk::Layer[2]'
    #0 0x7f2e10ab25d4 in bgfx::vk::RendererContextVK::init(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:1203:62
    #1 0x7f2e10a72e6e in bgfx::vk::rendererCreate(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4429:20
    #2 0x7f2e109c9fd8 in bgfx::rendererCreate(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2746:16
    #3 0x7f2e109ce1a6 in bgfx::Context::rendererExecCommands(bgfx::CommandBuffer&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2797:20
    #4 0x7f2e109c38fe in bgfx::Context::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2431:5
    #5 0x7f2e109c37e3 in bgfx::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:1475:38
    #6 0x7f2e109ea0a9 in bgfx::Context::renderThread(bx::Thread*, void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx_p.h:3012:35
    #7 0x7f2e10c1f370 in bx::Thread::entry() /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:325:20
    #8 0x7f2e10c1f224 in bx::ThreadInternal::threadFunc(void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:92:20
    #9 0x7f2dcc377849 in start_thread nptl/./nptl/pthread_create.c:442:8
    #10 0x7f2dcc3fa52f in __clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../3rdparty/bgfx/src/renderer_vk.cpp:1203:35 in
../../../../../3rdparty/bgfx/src/renderer_vk.cpp:353:8: runtime error: index 2 out of bounds for type 'bgfx::vk::Layer[2]'
    #0 0x7f2e10a6f8fd in bgfx::vk::updateExtension(char const*, unsigned int, bool, bgfx::vk::Extension*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:353:35
    #1 0x7f2e10a7144b in bgfx::vk::dumpExtensions(VkPhysicalDevice_T*, bgfx::vk::Extension*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:684:23
    #2 0x7f2e10ab4ab9 in bgfx::vk::RendererContextVK::init(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:1423:6
    #3 0x7f2e10a72e6e in bgfx::vk::rendererCreate(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4429:20
    #4 0x7f2e109c9fd8 in bgfx::rendererCreate(bgfx::Init const&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2746:16
    #5 0x7f2e109ce1a6 in bgfx::Context::rendererExecCommands(bgfx::CommandBuffer&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2797:20
    #6 0x7f2e109c38fe in bgfx::Context::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2431:5
    #7 0x7f2e109c37e3 in bgfx::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:1475:38
    #8 0x7f2e109ea0a9 in bgfx::Context::renderThread(bx::Thread*, void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx_p.h:3012:35
    #9 0x7f2e10c1f370 in bx::Thread::entry() /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:325:20
    #10 0x7f2e10c1f224 in bx::ThreadInternal::threadFunc(void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:92:20
    #11 0x7f2dcc377849 in start_thread nptl/./nptl/pthread_create.c:442:8
    #12 0x7f2dcc3fa52f in __clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../3rdparty/bgfx/src/renderer_vk.cpp:353:8 in

There are no issues running any games and also no visual issues.

This also only happens on the first start-up.

firewave commented 2 years ago

There's also the following later on:

../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4060:24: runtime error: load of value 18, which is not a valid value for type 'UniformType::Enum'
    #0 0x7f0b820b4825 in bgfx::vk::RendererContextVK::commit(bgfx::UniformBuffer&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4060:24
    #1 0x7f0b82088a39 in bgfx::vk::RendererContextVK::submit(bgfx::Frame*, bgfx::ClearQuad&, bgfx::TextVideoMemBlitter&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:8575:9
    #2 0x7f0b81faba19 in bgfx::Context::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2438:19
    #3 0x7f0b81fab7e3 in bgfx::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:1475:38
    #4 0x7f0b81fd20a9 in bgfx::Context::renderThread(bx::Thread*, void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx_p.h:3012:35
    #5 0x7f0b82207370 in bx::Thread::entry() /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:325:20
    #6 0x7f0b82207224 in bx::ThreadInternal::threadFunc(void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:92:20
    #7 0x7f0b3d957849 in start_thread nptl/./nptl/pthread_create.c:442:8
    #8 0x7f0b3d9da52f in __clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4060:24 in

../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4096:32: runtime error: load of value 18, which is not a valid value for type 'UniformType::Enum'
    #0 0x7f0b820b487f in bgfx::vk::RendererContextVK::commit(bgfx::UniformBuffer&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4096:32
    #1 0x7f0b82088a39 in bgfx::vk::RendererContextVK::submit(bgfx::Frame*, bgfx::ClearQuad&, bgfx::TextVideoMemBlitter&) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/renderer_vk.cpp:8575:9
    #2 0x7f0b81faba19 in bgfx::Context::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:2438:19
    #3 0x7f0b81fab7e3 in bgfx::renderFrame(int) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx.cpp:1475:38
    #4 0x7f0b81fd20a9 in bgfx::Context::renderThread(bx::Thread*, void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bgfx/src/bgfx_p.h:3012:35
    #5 0x7f0b82207370 in bx::Thread::entry() /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:325:20
    #6 0x7f0b82207224 in bx::ThreadInternal::threadFunc(void*) /mnt/s/GitHub/mame/build/projects/sdl/mame/gmake-linux-clang/../../../../../3rdparty/bx/src/thread.cpp:92:20
    #7 0x7f0b3d957849 in start_thread nptl/./nptl/pthread_create.c:442:8
    #8 0x7f0b3d9da52f in __clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../3rdparty/bgfx/src/renderer_vk.cpp:4096:32 in
galibert commented 2 years ago

Hmmm, we try not to change things in our local copies of 3rd party libs. Can you try to report that to https://github.com/bkaradzic/bgfx ? If there's no reasult we'll have to have a deeper look though.

firewave commented 2 years ago

I did file an upstream ticket as well which is not being referenced for some reason.

Grabbed this from my browser history: https://github.com/bkaradzic/bgfx/issues/2966. "This issue has been deleted." - wtf?

ghost commented 2 years ago

I think it might be time to drop bgfx, this isn't the first time I've heard of issues not being taken seriously

firewave commented 2 years ago

I expected some pushback for reporting an issue based on a year-old revision and there were also some details missing in the report (I had not figured out how to get the debug output yet) but this is ridiculous (I wasn't even aware it is possible to delete issues).

MooglyGuy commented 2 years ago

I think it might be time to drop bgfx, this isn't the first time I've heard of issues not being taken seriously

Sure thing, let me know when you have the time to implement a rendering abstraction layer that can elide the implementation details and shader pipeline details involving Direct3D 9, Direct3D 11, Direct3D 12, Vulkan, OpenGL, and Metal.

MooglyGuy commented 2 years ago

I expected some pushback for reporting an issue based on a year-old revision and there were also some details missing in the report (I had not figured out how to get the debug output yet) but this is ridiculous (I wasn't even aware it is possible to delete issues).

While I don't necessarily agree with @bkaradzic deleting the issue outright, I'm going to point out something here that for whatever reason nobody on the MAME team proper has said yet: You're creating issues not from any robust technical standpoint, but more or less "'Cos this tool says so."

Many of the issues that your runs of UBSAN are turning up are valid: Blowing out array-bounds access is generally not a good thing.

Some of the issues however, aren't valid. Either because they involve an upstream library dependency which might not be reflective of the current state of that library, or because an automated static analysis isn't a stand-in for a coder who has both the context and capability to analyze the validity of the issues being flagged by that tool.

If you yourself don't have the capacity to act on the output of a given tool, then I'm sorry to say, you probably shouldn't be running that tool, and you shouldn't be blithely creating issues for other people based on the output of that tool.

Static analysis tools are powerful and useful when they're in the hands of - let's say - the code lead on a project, because that code lead has both the context and the skill necessary to know when that static analysis tool is correct, and when that static analysis tool is producing results that can safely be disregarded. If you don't have the commensurate context to make the go/no-go call on a given result from that tool, then you shouldn't be using it.

firewave commented 2 years ago

While I don't necessarily agree with @bkaradzic deleting the issue outright, I'm going to point out something here that for whatever reason nobody on the MAME team proper has said yet: You're creating issues not from any robust technical standpoint, but more or less "'Cos this tool says so."

It's called ghosting.

Some of the issues however, aren't valid. Either because they involve an upstream library dependency which might not be reflective of the current state of that library, or because an automated static analysis isn't a stand-in for a coder who has both the context and capability to analyze the validity of the issues being flagged by that tool.

Well, that's a common problem in software development with fixed external dependencies (the request "please try the latest version" on bugs which need to be reported upstream is next to impossible when using a regular Linux distro). bgfx not having point releases is not helping either. But I can't remember (m)any case(s) beside this were an (outdated) external library was involved.

As I mentioned I was still figuring out how to get additional information on this bug in particular.

I also provide the context on how to reproduce issues. As it is usually just "run it without any options" but I don't explicitly state that.

If you yourself don't have the capacity to act on the output of a given tool, then I'm sorry to say, you probably shouldn't be running that tool, and you shouldn't be blithely creating issues for other people based on the output of that tool.

I do provide a short analysis of the actual issue or context if I am able to understand the code. It is mostly not included with the original report though and added as comment afterwards.

See:

10502

https://mametesters.org/view.php?id=6830 https://mametesters.org/view.php?id=6829 https://mametesters.org/view.php?id=8493 https://mametesters.org/view.php?id=8489 https://mametesters.org/view.php?id=8500 https://mametesters.org/view.php?id=8495 https://mametesters.org/view.php?id=5670 https://mametesters.org/view.php?id=5665 https://mametesters.org/view.php?id=6835 https://mametesters.org/view.php?id=6834 https://mametesters.org/view.php?id=7543 etc.

I do refrain from providing fixes myself for code which requires technical knowledge of actual emulation since I do lack that almost completely.

MooglyGuy commented 2 years ago

I do appreciate that you're pointing out these issues when they come to MAME itself, but 3rd-party libraries are something that the team is broadly not responsible for, other than keeping them up-to-date.

I agree that the lack of a manageable versioning system for BGFX is something that can be improved on that end, but it's not the MAME team's cross to bear, and generally the API provided by BGFX and BX is stable enough that updating the library is relatively straightforward.

It does what it says on the tin and has always been fit for purpose, so beyond the notoriously terrible situation with Vulkan drivers on Linux - something that neither the MAME team nor Branimir are responsible for regardless - I'm not sure anyone even imagined there were any BGFX-related issues. I find @DavidHaywood's "throw the baby out with the bathwater" suggestion even more astonishing in that light.

bkaradzic commented 2 years ago

The reason it's deleted is actually explained in issue template:

See: https://github.com/bkaradzic/bgfx/blob/master/.github/ISSUE_TEMPLATE/bug_report.md

name: Bug report (FOR BUGS ONLY! DO NOT SUBMIT QUESTIONS HERE!)
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

<!---
###########################################################################

**BEWARE!**

If you completely disregard this guidance, and submit random question as issue
(bug report), the issue will be deleted. If you submit the same issue after
deletion, you'll be blocked.

**IMPORTANT: READ FIRST!**

Issue tracker is **ONLY** used for reporting bugs.

If you have any questions please **do not** open issue, rather:

 - If you have quick question, ask on Discord: https://discord.gg/9eMbv7J
 - Ask in discussions: https://github.com/bkaradzic/bgfx/discussions
 - Search documentation: https://bkaradzic.github.io/bgfx/

New features should be discussed on:

 - GitHub Discussions https://github.com/bkaradzic/bgfx/discussions
 - Discord Chat https://discord.gg/9eMbv7J

###########################################################################

I get daily random junk questions or stuff as @MooglyGuy explains well:

You're creating issues not from any robust technical standpoint, but more or less "'Cos this tool says so."

Did something about bgfx crashed? Or you ran some random tool, got output and just pasted into issue? I could just leave issue there, not do anything. By deleting it I signal that I'm not going to do anything about it.

If you guys here really care about ubsan of bgfx, it would be beneficial to come to Discord/Discussions and we could discuss merits of having it.

For example thing complains for stuff like this (single block over-allocates, but access outside of defined "fixed" array): https://github.com/bkaradzic/bgfx/blob/6a03a1ec5e5d8630d875a3a1dddb5b4fb76921a7/src/bgfx_p.h#L1481-L1488

Because it thinks that will access this "fixed" block out of bounds: https://github.com/bkaradzic/bgfx/blob/6a03a1ec5e5d8630d875a3a1dddb5b4fb76921a7/src/bgfx_p.h#L1603

But it's not a bug. And it would be great if some sanitizer expert from here could guide us how to annotate code... But that would be contribution to bgfx, not tossing cross project issue at us.

firewave commented 2 years ago

I do appreciate that you're pointing out these issues when they come to MAME itself, but 3rd-party libraries are something that the team is broadly not responsible for, other than keeping them up-to-date.

That's why I immediately filed an upstream issue. This issue looks like an upstream one but it could easily be an configuration or usage issue.

To be fair - MAME ships with the whole source in its tree and AFAIK there's no way to easily point it to a system provided version so there's at least a bit of responsibility on the MAME side.

firewave commented 2 years ago

Coincidently it seems there just came up a bgfx issue on Linux: https://mametesters.org/view.php?id=8505. Probably not related to this finding though.

MooglyGuy commented 2 years ago

You're correct, and it's also not a valid ticket and I'm going to push for it to be closed.

Quite frankly, I'm tired of hearing from people who are apparently smart enough to run Linux, and run MAME, but don't have enough brain cells to rub together to clock that the Vulkan driver situation on Linux has been complete dogshit since day one of Khronos Group ratifying the Vulkan spec.

It's not a MAME issue, and it's not even a BGFX issue. It's incompetence on the part of Nvidia, AMD, or Mesa contributors producing a driver stack that falls over if exposed to a slight breeze. I'm tired of hearing about it to the point that I'm considering making OpenGL the default backend that BGFX uses rather than Vulkan, despite OpenGL being ostensibly deprecated.

With the amount of grief that I've copped from Linux people with bad Vulkan support in their GPU drivers, I can only imagine that @bkaradzic has it ten times worse, and it's a testament to his infinite patience that he hasn't simply given up entirely on GPU coding and taken up woodworking or sheep farming instead.

firewave commented 2 years ago

@bkaradzic

Did something about bgfx crashed? Or you ran some random tool, got output and just pasted into issue? I could just leave issue there, not do anything. By deleting it I signal that I'm not going to do anything about it.

It was more than that. And as mentioned here I was still looking into getting more output.

I also took a look at the code to spot the issue but could not wrap my head around it.

I just took another look and spotted the obvious error:

From 3rdparty/bgfx/src/renderer_vk.cpp

for (uint32_t ii = 0; ii < Extension::Count; ++ii)
{
...
const Extension& extension = s_extension[ii];
const LayerInfo& layerInfo = s_layer[extension.m_layer].m_instance;
...
}
static Extension s_extension[] =
{
    { "VK_EXT_debug_utils",                     1, false, false, BGFX_CONFIG_DEBUG_OBJECT_NAME || BGFX_CONFIG_DEBUG_ANNOTATION, Layer::Count },
...
}
    static Layer s_layer[] =
    {
        { "VK_LAYER_LUNARG_standard_validation", 1, { false, false }, { false, false } },
        { "VK_LAYER_KHRONOS_validation",         1, { false, false }, { false, false } },
    };
    BX_STATIC_ASSERT(Layer::Count == BX_COUNTOF(s_layer) );

So Layer::Count is 2 since s_layer has two entries. That actual value is stored in s_extension to access s_layer which is obviously out-of-bounds.

I get daily random junk questions or stuff as @MooglyGuy explains well:

Sorry to be blunt but it was neither a question nor junk - it was a valid pointer to a tooling-based detection of junk code...albeit based on an outdated codebase - which I even mentioned in my ticket. But I just checked the latest code and the issue is still present.

firewave commented 2 years ago

It's not a MAME issue, and it's not even a BGFX issue. It's incompetence on the part of Nvidia, AMD, or Mesa contributors producing a driver stack that falls over if exposed to a slight breeze. I'm tired of hearing about it to the point that I'm considering making OpenGL the default backend that BGFX uses rather than Vulkan, despite OpenGL being ostensibly deprecated.

In my case it is not using any drivers at all. It is using swrast and lavapipe which are software implementations in mesa and not vendor-specific drivers. And it only happens with bgfx and with certain MAME drivers so it might depend on what is being drawn.

bkaradzic commented 2 years ago

Coincidently it seems there just came up a bgfx issue on Linux: https://mametesters.org/view.php?id=8505.

By reading those AddressSanitizer showing two completely different things in VK driver (failure somewhere in initialization inside driver) vs GL driver (failure somewhere in texture update in driver).

When you see driver involved instead running AddressSanitizer on application, way better approach would be to upgrade/downgrade driver to quickly identify at which version of driver it started happening. Also instead testing on random drivers, you should have a few reference versions of drivers you previously tested on, and know they are not causing issues.

bkaradzic commented 2 years ago

Just this discussion shows why deleting the issue is justified. You don't know what's going on, you passed some things sanitizer reported. I could spend days setting up sanitizer, fixing issue sanitizer reports, and not solve your issue at all, nor work on more important things for bgfx.

Issues like this are kind of DDoS attack on open source project.

MooglyGuy commented 2 years ago

In my case it is not using any drivers at all. It is using swrast and lavapipe which are software implementations in mesa and not vendor-specific drivers. And it only happens with bgfx and with certain MAME drivers so it might depend on what is being drawn.

So go bother the Mesa developers, then.

Edit To Add: I even mentioned Mesa in the part that you quote-replied to. I personally welcome UBSAN results on the MAME codebase itself, which would broadly be the contents of src/, leaving 3rdparty/ out of it.

Developers of upstream libraries are free to investigate potential issues or not, but that's not the responsibility of the MAME team. If there's a UBSAN issue flagged in the current version of a given 3rd-party library that is fixed in a later revisions, then that merits its own Github issue on the MAME project, per outdated 3rd-party library. At the end of the day, though, if your goal is for MAME to improve - it seems like it is, and I support that - then the best way to do it is by focusing on the areas that the MAME team has the most direct control over.

bkaradzic commented 2 years ago

@firewave

So Layer::Count is 2 since s_layer has two entries. That actual value is stored in s_extension to access s_layer which is obviously out-of-bounds.

If you change code to have 3rd entry, does it solves your issue with VK driver?

Fix would be something like:

    // Layer registry
    //
    static Layer s_layer[] =
    {
        { "VK_LAYER_LUNARG_standard_validation", 1, { false, false }, { false, false } },
        { "VK_LAYER_KHRONOS_validation",         1, { false, false }, { false, false } },
        { "",                                    0, { false, false }, { false, false } },
    };
    BX_STATIC_ASSERT(Layer::Count == BX_COUNTOF(s_layer)-1);
bkaradzic commented 2 years ago

That sanitizer issue is actually false positive: https://github.com/bkaradzic/bgfx/blob/90e847c46d8657e5fb3867b2b2f81b5013aab5f8/src/renderer_vk.cpp#L389

While it does get address out of bounds, read is prevented further down:

        Extension& extension = _extensions[ii];
        const LayerInfo& layerInfo = _instanceExt  <<< gets out-of-bounds address
            ? s_layer[extension.m_layer].m_instance
            : s_layer[extension.m_layer].m_device
            ;
        if (!extension.m_supported
        &&   extension.m_initialize
        &&  (extension.m_layer == Layer::Count || layerInfo.m_supported) ) <<< read prevented

Also I applied fix that should quiet ubsan in that particular case: https://github.com/bkaradzic/bgfx/commit/90e847c46d8657e5fb3867b2b2f81b5013aab5f8

firewave commented 2 years ago

If you change code to have 3rd entry, does it solves your issue with VK driver?

Yes, it does.

Regarding the other UBSAN errors about the enum value - those can be disregarded. I checked how you use the enum (bitfield) and it is fine. I came across exactly the same a few years ago in my own code. That is the kind of sanitizer noise you correctly referred to.

While it does get address out of bounds, read is prevented further down:

I would only consider that fine as long as the memory behind that array is addressable (it was "only" an UBSAN error after all - so it got to be). But it is quite possible there is something in the language spec which makes this code absolutely valid. It is an interesting case. I will file an upstream ticket/question about this.

This code in the other case where it is access (line 1237 i the latest version) looks weird though:

                    const bool layerEnabled = false
                        || extension.m_layer == Layer::Count
                        || (layerInfo.m_supported && layerInfo.m_initialize)
                        ;

The false does nothing. Guess it's a left-over.

bkaradzic commented 2 years ago

@firewave false there is so that next line can start with ||. It's bgfx code style, stuff like false\n|| or true\n&& is everywhere...

firewave commented 2 years ago

FYI if you hard-code the index even the compiler will raise a warning:

#include <cstdint>

struct LayerInfo
{
    bool m_supported;
    bool m_initialize;
};

struct Layer
{
    const char* m_name;
    uint32_t m_minVersion;
    LayerInfo m_instance;
    LayerInfo m_device;
};

static Layer s_layer[] =
{
    { "VK_LAYER_LUNARG_standard_validation", 1, { false, false }, { false, false } },
    { "VK_LAYER_KHRONOS_validation",         1, { false, false }, { false, false } },
};

static bool f(int i)
{
    const Layer& l = s_layer[2];
    return i == 2 || (l.m_instance.m_supported && l.m_instance.m_initialize);
}

int main()
{
    return f(2) ? 1 : 0;
}
test.cpp:25:19: warning: array index 2 is past the end of the array (which contains 2 elements) [-Warray-bounds]
        const Layer& l = s_layer[2];
                         ^       ~
test.cpp:17:1: note: array 's_layer' declared here
static Layer s_layer[] =

And if I make the index big enough (e.g. 1000) it will even crash. So it is obviously not a false positive.

bkaradzic commented 2 years ago

And if I make the index big enough (e.g. 1000) it will even crash. So it is obviously not a false positive.

It's is false positive.

2 is in code expressed as Layer::Count, it's regulated by enum, and there is no code that casts 1000 to Layer::Enum

Code looks like:

#include <stdint.h>
#include <stdio.h>

struct LayerInfo
{
    bool m_supported;
    bool m_initialize;
};

struct Layer
{
    enum Enum
    {
        VK_LAYER_LUNARG_standard_validation,
        VK_LAYER_KHRONOS_validation,

        Count
    };

    const char* m_name;
    uint32_t m_minVersion;
    LayerInfo m_instance;
    LayerInfo m_device;
};

static Layer s_layer[] =
{
    { "VK_LAYER_LUNARG_standard_validation", 1, { false, false }, { false, false } },
    { "VK_LAYER_KHRONOS_validation",         1, { false, false }, { false, false } },
};

static bool f(Layer::Enum _layer)
{
    const Layer& l = s_layer[_layer];
    return _layer == Layer::Count /* if this is true, || will be ignored */
        || (l.m_instance.m_supported && l.m_instance.m_initialize)
        ;
}

int main()
{
    printf("%d\n", f(Layer::Count));
    return 0;
}
bkaradzic commented 2 years ago

Yes, it does.

To be clear... This fix above, fixes crash in driver you're seeing? Or it just quiet sanitizer warnings?

bkaradzic commented 2 years ago

Ping? Is this resolved now?

firewave commented 1 year ago

To be clear... This fix above, fixes crash in driver you're seeing? Or it just quiet sanitizer warnings?

Sorry for the late reply.

It does fix the sanitizer warning only which was happening always. That's what this issue was about. The issue does need to remain open until bgfx has been updated in this repo though.

The crashes happen with specific drivers only and have different causes. They are also tracked in different tickets and are not the within scope here. We just got off-topic.