Closed iagoCL closed 3 days ago
Change tested on:
VK_EXT_headless_surface
(which is not currently supported natively on NVIDIA GPUs): https://gitlab.freedesktop.org/mesa/vulkan-wsi-layerI have tested 3 different setups, 1. Display connected+GDM3 started, 2. Display connected+GDM3 shutdown and 3. Display disconnected.
Display connected+GDM3 started
The change appears to be functional on my side, I am able to run multiple samples with the --headless
option without encountering errors.
Display connected+GDM3 shutdown The samples will fail with the following error message:
[error] Error Message: vk::Device::acquireNextImageKHR: ErrorInitializationFailed
[error] Failed when running application Swapchain Images
A bit of investigation led me to a failure in the wsi-layer where the usage of vkImportSemaphoreFdKHR when found available fails, leading to the error we see above.
I made a local modification to the layer to bypass that usage and instead use the layer fallback path instead in the following file [...]/vulkan-wsi-layer/wsi/swapchain_base.cpp:L434
. With that modification, I am able to run successfully the samples in this configuration.
Display disconnected I have observed the same behavior as the one in 2. and required the same local modification to the layer to make it function.
My current setup is using Swiftshader https://github.com/google/swiftshader "a high-performance CPU-based implementation of the Vulka 1.3 graphics AP" the headless flag seems to work correctly in all tested examples
I can reproduce SaschaWillems crash. Needs more investigation, but seems to be valid use of the API, and only reproducible if you enable ICDs for Nvidia and SwiftShader. As a workaround you can avoid the crash explicitily selecting the valid GPU using the --gpu flag I also agreed with SaschaWillems to also add some small improvements to the code and extra documentation to better explain VK_EXT_headless_surface, its intention and how it is diferent from rendering without swapchain. I have been a bit busy this last month, but hopefully I will submit a new PR with some improvements soon.
Change the name of the flag and add some comments to better explain it. I also added an explanation in the example in the readme. Based on some feedback I also added a warning when running headless with multiple GPU candidates. Tested on Windows and Linux using Swiftshader.
Looks like this triggers a CI failure due to copyright headers not being up-to-date. Can you fix that?
I did not modified those files, so unsure what trigered the copyright failure.
I just rebased against main, hopefully that fixes the issue
I think I said I was going to test this at the last meeting, but it turns out that we don't actually support VK_EXT_headless_surface, so I 'm afraid I can't.
It seems to work when explicitly selecting swiftshader. I don't see any visual output, which I guess is expected.
But I'm still getting a crash when not explicitly selecting swiftshader. I do get the warning first that tells me to explicitly select a gpu, but if wont' do that, I get the same error as above. Is there no way to avoid running with that flag on GPUs that don't have support for it? Or at least catch that error and exit gracefully. For me it's stuck in an endless loop of exceptions.
And does this really fix #314?
But I'm still getting a crash when not explicitly selecting swiftshader. I do get the warning first that tells me to explicitly select a gpu, but if wont' do that, I get the same error as above. Is there no way to avoid running with that flag on GPUs that don't have support for it? Or at least catch that error and exit gracefully. For me it's stuck in an endless loop of exceptions.
If we have multiple GPUs, the correct way to check which one supports headless is by calling vkGetPhysicalDeviceSurfaceSupportKHR
as the current code id doing. This is causing a crash on Nvidia.
I tried to catch and handle this crash more gratefully, but the crash is in the driver, so exceptions did not work, and I did not found a correct way to detect it from application side.
The alternative is to exit after printing the warning, and forcing user to manually select GPU.
And does this really fix #314?
Not sure.
It mentions instancing
and render_passes
. When I mentioned that issue I just tested those issues, and both work correctly with headless after this change.
Now that I read the problem more carefully, I am confused.
At first glance the current Vulkan Samples will only work on devices that have a present queuee. We check this on get_suitable_gpu
(with or without headless), so it should be imposible for get_queue_by_present
to fail
So the crash is fixed, but we still have the get_device().get_queue_by_present(0).wait_idle()
on submit_frame
. We should remove the wait idle, but that is independent of headless, so maybe it will be better to update #314 or create a new issue for that.
Per discussion, let's merge this with 2 (w/ Steve's approval)
Description
The main objective of this PR is to fix the
--headless
flag. Currently it does not work in the framework, due to not requesting a surface and swap chain.This PR reworks
HeadlessWindow
to make it more similar to the other window classes. This will allow us to enableVK_EXT_headless_surface
inHeadlessWindow::get_required_surface_extensions
like the other windows types. Now we can remove theheadless
paramater fromInstance::Instance
andVulkanSample<>::create_instance
simplifying the framework.To keep the samples
hello_world
andhpp_hello_world
as simple as possible, I did not add support for headless. Instead, an assert is run to ensure the are not enabling with headless in those samples.General Checklist:
Please ensure the following points are checked:
[x] This PR describes the scope and expected impact of the changes I am making
Note: The Samples CI runs a number of checks including:
If this PR contains framework changes:
batch
command line argument to make sure all samples still work properlyFixes #830 Fixes #314