KhronosGroup / Vulkan-Loader

Vulkan Loader
https://vulkan.lunarg.com/doc/sdk/latest/linux/LoaderInterfaceArchitecture.html
Other
511 stars 278 forks source link

vkEnumerateInstanceExtensionProperties is incorrectly implemented #1487

Open DaKvasNV opened 5 months ago

DaKvasNV commented 5 months ago

Environment:

Describe the Issue vkEnumerateInstanceExtensionProperties is a "pre-instance function" (https://vulkan.lunarg.com/doc/view/1.3.280.0/windows/loader_and_layer_interface.html#user-content-pre-instance-functions), and therefore requires special handling via Vk...Chain in order to be properly layered. Because this is not properly implemented, VVL's EnumerateInstanceExtensionProperties() function does not appear to ever get executed.

For example, calling: vkEnumerateInstanceExtensionProperties(NULL, NULL, NULL) will not report the VUID-vkEnumerateInstanceExtensionProperties-pPropertyCount-parameter validation error.

It's also worth noting that if an app calls vkEnumerateInstanceExtensionProperties() prior to vkCreateInstance(), the VkLayer_khronos_validation.dll does not get loaded, as would be expected if the function was correctly layered.

Lastly, the chassis.cpp::EnumerateInstanceExtensionProperties() implementation does not correctly handle the case when pLayerName == NULL

This set of issues likely applies to the other "pre-instance functions".

Expected behavior

  1. Calls to vkEnumerateInstanceExtensionProperties() prior to vkCreateInstance() would load VkLayer_khronos_validation.dll and run validation checks
  2. Calling vkEnumerateInstanceExtensionProperties(NULL, NULL, NULL) should report VUID-vkEnumerateInstanceExtensionProperties-pPropertyCount-parameter validation error.
  3. Calling vkEnumerateInstanceExtensionProperties(NULL, &count, NULL) should result in VVL calling down the chain so that extensions provided by the Vulkan implementation or by implicitly enabled layers are returned (per vkEnumerateInstanceExtensionProperties's spec)

Valid Usage ID VUID-vkEnumerateInstanceExtensionProperties-* don't get reported

Additional context It's worth noting that according to the vulkan spec, VVL is not actually allowed layer the pre-instance functions: In order to intercept the pre-instance functions, several conditions must be met: ... The layer must be implicit. I'm not sure how VVL is expected to reconcile that, but thought it would be worth mentioning.

spencer-lunarg commented 5 months ago

For example, calling: vkEnumerateInstanceExtensionProperties(NULL, NULL, NULL) will not report the VUID-vkEnumerateInstanceExtensionProperties-pPropertyCount-parameter validation error.

So the issue here is we have not setup the anything and we have no Logging System created yet because we do that at vkCreateInstance time ... I would personally expect the Loader to catch this and return an error as it would be useful outside of just the VVL layer as well (@charles-lunarg thoughts)

spencer-lunarg commented 5 months ago

vkEnumerateInstanceExtensionProperties(NULL, NULL, NULL)

So this will crash in terminator_EnumerateInstanceExtensionProperties and never make it to the Layer, so there is nothing VVL could do about this

Another issue is the Validation Layers are following the Loader-Layer Interface for

image


I don't disagree that it would be helpful for some error message to occur, but I would expect that to happen in the loader as it has all the information. There is also nothing stopping another Loader implementation from doing the same

DaKvasNV commented 5 months ago

vkEnumerateInstanceExtensionProperties(NULL, NULL, NULL)

I was really just using the VUIDs as an example. The main problem is that the VVL implementation is not properly implementing the "pre-instance functions" mechanism for this API, so the implementation ends up being, in my opinion, useless, because it never gets called into by the layer chain. I think it would be better to not layer this API at all if the Chain mechanism isn't used.

vkEnumerateInstanceExtensionProperties must handle the case where pLayerName is itself.
It must return VK_ERROR_LAYER_NOT_PRESENT otherwise, including when pLayerName is NULL.

This seems like a contradictory requirement for the loader layer interface. Because to me it seems like the implementation shouldn't CallDown beyond its own implementation, given that it's only returning it's own information or VK_ERROR_LAYER_NOT_PRESENT. Yet further in the document, the InterceptFunctionName example is using vkEnumerateInstanceExtensionProperties to showcase how to CallDown.

So I do think the loader doc could use some clarification on what is supposed to happen with this function, but either way I don't think what VVL is currently doing is going to be correct.

charles-lunarg commented 5 months ago

I feel that an important detail was previously assumed by layer/loader maintainers but was never stated explicitly here or in the documentation is that "the validation layer doesn't validate pre-instance functions". This was a deliberate design consequence of making pre-instance interceptions of functions a implicit-only layer behavior. I am very much convinced that making the validation layer intercept pre-instance functions for the sake of validation was never discussed because:

  1. Very few VUID's to cover
  2. When the app doesn't follow them, they (almost without fail) crash - making it extremely clear something is wrong
  3. Making validation into an implicit layer means its no longer an optional layer, IE it would always be loaded if present. That means validation needs to only validate when its 'active', adding a lot of conditional branching to the implementation.

I am confident that the validation layer will not be adding validation for pre-instance functions. So much so that I was the person who asked spencer to move the issue over to the loader repo, where such validation can occur.

In an effort to make a proper issue over this, here is the list of VUID's that should be in the loader.

Some of them are trivial (and may already be implemented) as they are NULL pointer checks. That said, a non-null pointer can be invalid and there is no consistent way for the loader to validate that. Similarly, pLayerName must be a utf-8 null terminated string, and the loader has no reasonable way to checking that a string is utf-8 (it can do a null terminator check though).

I am currently out of town so am not able to work through the source code, see which VUID's are missing, and add them + tests. I should note that the loader has made the design choice to abort whenever a fatal condition is met (like a nullptr deref). It does print out the appropriate VUID error message to stderr beforehand, so that the developer may hopefully see it.

About the loader doc having contradictory statements, like:

vkEnumerateInstanceExtensionProperties must handle the case where pLayerName is itself. It must return VK_ERROR_LAYER_NOT_PRESENT otherwise, including when pLayerName is NULL.

I will need to review the git history of the document to see if there is a reason behind this statement. Some of it comes from working group/specification, but some may be an out of date/innacurate (ie, later changes invalidate the statement but were never revised). Again, I am not able to give this my full intention. (but did want to make sure that I have seen this issue and reassure that it is going to be looked into)

DaKvasNV commented 5 months ago

I feel that an important detail was previously assumed by layer/loader maintainers but was never stated explicitly here or in the documentation is that "the validation layer doesn't validate pre-instance functions"

I was not aware of this, but that would be fine for my use-case. However, I don't think implementing this is particularly meaningful:

vkEnumerateInstanceExtensionProperties must handle the case where pLayerName is itself. It must return VK_ERROR_LAYER_NOT_PRESENT otherwise, including when pLayerName is NULL.

I don't really know when the first condition could be useful, and the second condition breaks my use-case. Instead, I think it would be better for VVL to simply ignored vkEnumerateInstanceExtensionProperties.

The use-case I'm hitting is that we're trying to call vkEnumerateInstanceExtensionProperties from within an internal-layer call to vkCreateInstance, so that we can make decisions on whether our internal-layer can operate. After the fix to https://github.com/KhronosGroup/Vulkan-Loader/issues/1477, this flow now works, iff VVL is not used. If VVL is enabled, then it intercepts the vkEnumerateInstanceExtensionProperties call, and returns VK_ERROR_LAYER_NOT_PRESENT.