Open smcv opened 11 months ago
A couple of questions:
Would the Vulkan-Loader make use of this list at all? I.e., is there any action the loader could/should take in response to the presence of this list? I assume no, but it's worth asking.
How do the various frameworks "know" where the manifest files are located? This may be a redundant question, as the framework may be the one giving the Vulkan-Loader the manifest & library. I ask because the loader looks for drivers in certain system paths based on environment variables, predefined system locations, and how it was compiled. In other words, where the loader looks for things is not an easy question to answer, so it leaks into the frameworks.
I am happy for the ICD file to be co-opted in this way with additional information, but to an extent, there is nothing I can do to prevent this anyway. I have begun work on creating JSON schemas for layer & driver manifests, which should make it easier to define exactly what is and isn't a driver manifest. I ask - would this be something that should be included in the schema?
A clarification about driver manifests is that 'relative paths' are relative to the manifest file, so any use of relative paths that isn't relative to the manifest makes for a more confusing manifest. Not super important, but it's the only 'quirk' I can think of happening.
How exactly would this new field be tested? wants_libraries could become out of date, wrong, etc. In which case you are back to square one again.
Duplicating my comment on the GLVND instance of this issue, since I think this is the better venue to discuss the overall proposal:
Not strongly opposed to this, but given it's a spin-off of the Chrome/pressure-vessel discussion, I want to point out that the combination of this + the Vulkan version won't be sufficient to address what that proposal does. Besides GLX, it doesn't cover CUDA, DLSS, optix, etc. Perhaps these hints combined with a "the other stuff" json file would address the whole problem space, but I'm a little worried about decentralizing the data. E.g., if someone intents API dispatcher
How exactly would this new field be tested? wants_libraries could become out of date, wrong, etc. In which case you are back to square one again.
I think you could have tests that run CTS or some other set of test applications, create a very bare container that maps in only the files specified in the JSON, run the same tests in the container, and assert that the results are the same (and pass) in both.
How do the various frameworks "know" where the manifest files are located? This may be a redundant question, as the framework may be the one giving the Vulkan-Loader the manifest & library. I ask because the loader looks for drivers in certain system paths based on environment variables, predefined system locations, and how it was compiled. In other words, where the loader looks for things is not an easy question to answer, so it leaks into the frameworks.
Yes, I think this is a good question, and why a top-level json file with its own ordained locations in the filesystem, as proposed in the references, may be an easier lift for container maintainers.
A standard JSON structure in a single standard location might be the easiest, since that way container managers wouldn't need to separately scan manifests from API-specific directories.
As for distinguishing different sets of files, how much granularity do we need? Is something as broad as "graphics" and "compute" sufficient? Would we want to be able to select files based on specific APIs or features (e.g., egl, glx, Vulkan, DLSS, etc.)?
If we only need a couple broad categories like "graphics", then just having separate file lists in the JSON file, or even separate JSON files might be good enough. Any files that are required for both would just be listed in both, and whatever parses the JSON files would be responsible for filtering out duplicates. Something like:
{
  "graphics": {
    "libraries": [
      "libEGL_nvidia.so.0",
      "libGLX_nvidia.so.0"
    ],
    "data": [
      "/usr/share/glvnd/egl_vendor.d/10_nvidia.json",
      "/usr/share/vulkan/icd.d/nvidia_icd.json"
    ]
  },
  "compute": {
    "libraries": [
      "libnvidia-opencl.so.1"
    ],
    "data": [
      "/etc/OpenCL/vendors/nvidia.icd"
    ]
  }
}
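To illustrate the "filter out duplicates" step, here is a minimal Python sketch of what a consumer of such a file might do. It assumes each category is a JSON object with "libraries" and "data" lists. The function name collect_files and the library libexample_shared.so.0 are made up for illustration (the latter stands in for a file needed by both categories); nothing here is an existing API.

```python
import json

# Hypothetical file list in the per-category layout. "libexample_shared.so.0"
# is a made-up name standing in for a library needed by both categories.
FILE_LIST = json.loads("""
{
  "graphics": {
    "libraries": ["libEGL_nvidia.so.0", "libGLX_nvidia.so.0", "libexample_shared.so.0"],
    "data": ["/usr/share/vulkan/icd.d/nvidia_icd.json"]
  },
  "compute": {
    "libraries": ["libnvidia-opencl.so.1", "libexample_shared.so.0"],
    "data": ["/etc/OpenCL/vendors/nvidia.icd"]
  }
}
""")


def collect_files(file_list, categories):
    """Merge the requested categories, dropping duplicates but keeping order."""
    merged = {"libraries": [], "data": []}
    for category in categories:
        entry = file_list.get(category, {})  # unknown categories contribute nothing
        for kind in merged:
            for name in entry.get(kind, []):
                if name not in merged[kind]:
                    merged[kind].append(name)
    return merged


if __name__ == "__main__":
    print(collect_files(FILE_LIST, ["graphics", "compute"]))
```

A container manager that wants both categories would then map each returned file exactly once, even though the shared library is listed twice in the input.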
If we expect to have a lot of categories, though, then having duplicate filenames like that could get pretty unwieldy. In that case, it might be easier to do it the other way around, with a single list of files and then a set of feature tags for each file:
{
  "libraries": [
    {
      "name": "libEGL_nvidia.so.0",
      "tags": ["egl"]
    },
    {
      "name": "libGLX_nvidia.so.0",
      "tags": ["egl", "vulkan"]
    },
    {
      "name": "libnvidia-opencl.so.1",
      "tags": ["compute"]
    }
  ],
  "data": [
    {
      "name": "/usr/share/glvnd/egl_vendor.d/10_nvidia.json",
      "tags": ["egl"]
    },
    {
      "name": "/usr/share/vulkan/icd.d/nvidia_icd.json",
      "tags": ["vulkan"]
    },
    {
      "name": "/etc/OpenCL/vendors/nvidia.icd",
      "tags": ["compute"]
    }
  ]
}
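With the tagged layout, selecting files for a set of features becomes a simple filter, and no de-duplication is needed because each file is listed once. A rough Python sketch, assuming exactly the structure shown above (select_by_tags is a hypothetical helper name, not part of any existing tool):

```python
import json

# The tagged example from the comment above, embedded so the sketch is self-contained.
MANIFEST = json.loads("""
{
  "libraries": [
    {"name": "libEGL_nvidia.so.0", "tags": ["egl"]},
    {"name": "libGLX_nvidia.so.0", "tags": ["egl", "vulkan"]},
    {"name": "libnvidia-opencl.so.1", "tags": ["compute"]}
  ],
  "data": [
    {"name": "/usr/share/glvnd/egl_vendor.d/10_nvidia.json", "tags": ["egl"]},
    {"name": "/usr/share/vulkan/icd.d/nvidia_icd.json", "tags": ["vulkan"]},
    {"name": "/etc/OpenCL/vendors/nvidia.icd", "tags": ["compute"]}
  ]
}
""")


def select_by_tags(manifest, wanted_tags):
    """Return the names of all entries that carry at least one wanted tag."""
    wanted = set(wanted_tags)
    return {
        kind: [
            entry["name"]
            for entry in manifest.get(kind, [])
            if wanted & set(entry.get("tags", []))
        ]
        for kind in ("libraries", "data")
    }


if __name__ == "__main__":
    print(select_by_tags(MANIFEST, ["vulkan"]))
```

Asking for several tags at once (for example egl and vulkan) still returns each file only once, since a file matches if any of its tags is wanted.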
I can certainly see the argument for making this discovery be something that happens "above" Vulkan/EGL/OpenXR/etc., so that new dispatchers that are "the same shape" as Vulkan can take part in this mechanism even if they have nothing to do with Vulkan specifically.
Would the Vulkan-Loader make use of this list at all? I.e., is there any action the loader could/should take in response to the presence of this list? I assume no, but it's worth asking.
My intention was: no.
How do the various frameworks "know" where the manifest files are located? This may be a redundant question, as the framework may be the one giving the Vulkan-Loader the manifest & library. I ask because the loader looks for drivers in certain system paths based on environment variables, predefined system locations, and how it was compiled. In other words, where the loader looks for things is not an easy question to answer, so it leaks into the frameworks.
In at least pressure-vessel, we already need to know (and duplicate the knowledge of) how and where Vulkan, EGL, etc. loaders look for manifest files, because we already need to be able to:
So this would not be any additional burden for us. Similarly, I would expect that Chromium needs to find and parse the manifests, so that it can find the actual shared libraries, so that it can ensure that they get mirrored into its sandboxed namespace.
A clarification about driver manifests is that 'relative paths' are relative to the manifest file
Yes, that's why I suggested each item in wants_libraries should be interpreted in a way that is consistent with the library_path.
There is a difference between plain basenames that don't contain / (libGLX_nvidia.so.0) and relative paths that do contain / (./libGLX_nvidia.so.0). Plain basenames (or SONAMEs) are looked up in a system-specific search path (on Linux, it involves /etc/ld.so.cache, $LD_LIBRARY_PATH, ELF headers and some system-specific quirks, which are another thing that I have to "just know"). Relative paths are interpreted as being relative to the manifest (JSON file). This is quite similar to how Unix shells search PATH for commands if the command does not contain /, but interpret commands that do contain / as being relative to the current working directory.
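As a rough illustration of that rule, a tool reading the manifest could classify each entry along these lines. This is only a sketch: resolve_library_ref is a hypothetical helper, the absolute-path branch is my assumption about how such entries would be treated, and the actual dynamic-linker search for bare names is deliberately left out because it is system-specific.

```python
import posixpath


def resolve_library_ref(ref, manifest_path):
    """Classify one library reference taken from a manifest.

    Returns ("search", name) for a bare name that should go through the
    system's library search, or ("path", normalized_path) for anything
    that contains a slash.
    """
    if "/" not in ref:
        # Bare basename or SONAME: leave it to the system-specific search.
        return ("search", ref)
    if posixpath.isabs(ref):
        # Assumption: absolute paths are used as-is.
        return ("path", posixpath.normpath(ref))
    # Relative path: interpreted relative to the manifest's directory,
    # not the current working directory.
    manifest_dir = posixpath.dirname(manifest_path)
    return ("path", posixpath.normpath(posixpath.join(manifest_dir, ref)))


if __name__ == "__main__":
    manifest = "/usr/share/vulkan/icd.d/nvidia_icd.json"
    for ref in ("libGLX_nvidia.so.0", "./libGLX_nvidia.so.0"):
        print(ref, "->", resolve_library_ref(ref, manifest))
```

The only input that matters for the relative case is the location of the JSON file itself, which is what keeps the manifest unambiguous.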
any use of relative paths that isn't relative to the manifest makes for a more confusing manifest
Oh, I agree completely - that's why I suggested reusing the same interpretation as library_path.
My stance is that if the changes do not affect how the loader interprets the JSON file, then I have very little to say about the changes. Adding additional fields to the manifest is not against the file description.
I'm happy to allow any/all discussion about additional fields in the manifests to occur here, I just wanted to clarify that I don't have a strong stake in these discussions beyond not wanting to break backward compatibility.
If there is a strong desire to use a new format, that wouldn't be a decision I get to make unilaterally, as it would have to go through the Vulkan Working Group (specifically the SI subgroup) before any decisions are made. (Not that anyone in this discussion isn't aware of that fact, again I'm just clarifying my position).
What enhancement are you suggesting for the Vulkan Loader? Please describe in detail.
Some frameworks need to use Linux namespaces to run Vulkan programs in a container or sandbox:
It's not always straightforward to know what is considered to be part of the driver. When enumerating Vulkan drivers and layers, we know that we need the library_path. However, the library_path can have dependencies, either by ordinary dynamic linking (ELF DT_NEEDED on Linux, which we can discover programmatically by parsing ELF headers) or dynamically at runtime (dlopen() on Linux, which we cannot discover programmatically - currently the only way to know what is needed is to load the driver and let it run its arbitrary code).
For Mesa, it's enough to load the Vulkan driver via its library_path and then follow the DT_NEEDED tree; but the Nvidia proprietary driver uses dlopen() to load parts of itself, so following the DT_NEEDED tree is not necessarily sufficient. As a result, the Nvidia team have been in contact with Chrome and pressure-vessel developers about providing and parsing a manifest that would tell those tools what other libraries are needed.
It occurs to me that for Vulkan and other driver-loaders that mimic its structure (like GLVND EGL) we already have a perfectly good manifest that describes the driver, so it might make sense to put library information into the Vulkan driver's JSON manifest instead of inventing a separate file?
A straw-man example:
The spec for wants_libraries could perhaps be something like this:
Is this specific to a single platform?
I'm personally only interested in this for Linux, but it seems equally applicable to other Unix platforms like *BSD and Hurd, and it doesn't seem as though there's any reason this couldn't be generalized to macOS and Windows too.
Additional context
cc @cubanismo - does this seem like a reasonable solution?