Open jeremyg-lunarg opened 2 months ago
I think I see what is happening. Here's the memory export to fd 75:
{
"index": 4255,
"function": {
"name": "vkGetMemoryFdKHR",
"thread": 1,
"return": "VK_SUCCESS",
"args": {
"device": 7,
"pGetFdInfo": {
"sType": "VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR",
"memory": 585,
"handleType": "VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT",
"pNext": null
},
"pFd": 75
}
}
},
Then the import happens to fd 76:
{
"index": 4261,
"function": {
"name": "vkAllocateMemory",
"thread": 1,
"return": "VK_SUCCESS",
"args": {
"device": 38,
"pAllocateInfo": {
"sType": "VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO",
"allocationSize": 1048576,
"memoryTypeIndex": 0,
"pNext": {
"sType": "VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR",
"handleType": "VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT",
"fd": 76,
"pNext": {
"sType": "VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO",
"image": 586,
"buffer": 0,
"pNext": null
}
}
},
"pAllocator": null,
"pMemory": 587
}
}
},
I think the problem is in the replay where fd 75 is valid but fd 76 is not. The chromium code includes this:
descriptor.memoryFD = dup(memory_fd_.get());
That is most likely what makes fd 76 point to the same external memory as fd 75.
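To illustrate the `dup()` behavior described above, here is a minimal sketch in plain POSIX C. It is not Chromium or gfxreconstruct code. A pipe stands in for the exported memory object, and the helper names (`fd_is_open`, `same_file`, `dup_survives_close`) are made up for this example. It shows that the duplicate gets a new fd number, refers to the same underlying object, and stays usable after the original is closed.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if fd is an open descriptor in this process, 0 otherwise. */
int fd_is_open(int fd) {
    return fcntl(fd, F_GETFD) != -1;
}

/* Returns 1 if both descriptors refer to the same file object
 * (same device and inode), 0 if not, -1 if fstat() fails. */
int same_file(int fd_a, int fd_b) {
    struct stat a, b;
    if (fstat(fd_a, &a) != 0 || fstat(fd_b, &b) != 0) return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Assumption: a pipe stands in for the exported external memory.
 * Returns 1 if the duplicate aliases the original and is still usable
 * after the original descriptor has been closed, -1 on setup failure. */
int dup_survives_close(void) {
    int p[2];
    if (pipe(p) != 0) return -1;

    int exported = p[1];            /* plays the role of fd 75 */
    int duplicate = dup(exported);  /* plays the role of fd 76 */
    if (duplicate < 0) { close(p[0]); close(p[1]); return -1; }

    int aliased = same_file(exported, duplicate) == 1;

    close(exported);                /* the original goes away ... */
    char out = 'x', in = 0;
    int usable = fd_is_open(duplicate) && !fd_is_open(exported)
              && write(duplicate, &out, 1) == 1   /* ... the duplicate still works */
              && read(p[0], &in, 1) == 1 && in == out;

    close(duplicate);
    close(p[0]);
    return aliased && usable;
}
```

In a process that never made the `dup()` call, the second fd number would simply not be open, which matches the symptom seen on replay.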
I'm guessing that gfxreconstruct doesn't call dup(), and it doesn't really need to unless it wants to keep some control of the fd (which chromium apparently does). So it seems like getting this application to work would require recording dup() and probably some other system calls to know when this is happening.
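As a rough sketch of what recording dup() could buy the replayer, the table below maps fd numbers seen at capture time to fds that are valid in the replay process. This is purely hypothetical: `FdMap`, `fd_map_add`, `fd_map_lookup`, and `fd_map_replay_dup` are names invented here, not gfxreconstruct APIs, and a real solution would also have to track close() and any other calls that create or move descriptors.

```c
#include <stddef.h>
#include <unistd.h>

#define FD_MAP_CAPACITY 64

/* Hypothetical table: fd number seen at capture -> fd valid during replay. */
typedef struct {
    int captured[FD_MAP_CAPACITY];
    int replayed[FD_MAP_CAPACITY];
    size_t count;
} FdMap;

/* Returns the replay-side fd for a captured fd, or -1 if unknown. */
int fd_map_lookup(const FdMap *m, int captured_fd) {
    for (size_t i = 0; i < m->count; ++i)
        if (m->captured[i] == captured_fd) return m->replayed[i];
    return -1;
}

/* Adds or updates a mapping (fd numbers get reused over a capture).
 * Returns 0 on success, -1 if the table is full. */
int fd_map_add(FdMap *m, int captured_fd, int replay_fd) {
    for (size_t i = 0; i < m->count; ++i) {
        if (m->captured[i] == captured_fd) {
            m->replayed[i] = replay_fd;
            return 0;
        }
    }
    if (m->count == FD_MAP_CAPACITY) return -1;
    m->captured[m->count] = captured_fd;
    m->replayed[m->count] = replay_fd;
    m->count++;
    return 0;
}

/* Replays a recorded "captured_new = dup(captured_old)": duplicates the
 * replay-side fd and records it under the captured number.
 * Returns the new replay-side fd, or -1 on failure. */
int fd_map_replay_dup(FdMap *m, int captured_old, int captured_new) {
    int replay_old = fd_map_lookup(m, captured_old);
    if (replay_old < 0) return -1;

    int replay_new = dup(replay_old);
    if (replay_new < 0) return -1;

    if (fd_map_add(m, captured_new, replay_new) != 0) {
        close(replay_new);
        return -1;
    }
    return replay_new;
}
```

With something like this, the fd returned by vkGetMemoryFdKHR at index 4255 would be stored under 75, the recorded dup() would add an entry for 76, and the import at index 4261 would look up 76 instead of passing the captured number through unchanged.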
Describe the replay bug: This is a replay of WebGPU content running on Linux with the RADV driver. WebGPU renders into a swapchain image created by chromium. It is then passed to a compositor in chromium. Both components use Vulkan, each running with its own VkInstance and VkDevice.
It looks like at some point vkGetMemoryFdKHR() returns a -1 file descriptor, but I could be getting confused looking at the output.
Note: chromium might be doing graphics stuff in multiple processes. Looking at the gfxr-convert output, I think everything is in 1 process and captured, but I'm not 100% sure.
Verify before submission:
Build Environment: Please include the SHA and PR or branch name used in capture and also used to build the replayer.
1.3.290 SDK
To Reproduce: Steps to reproduce the behavior:
.gfxr file attached to the issue.
Screenshots: Does not run long enough for screenshots.
System environment: Capture and replay on the same system running Ubuntu 24.04 with the RADV driver
Title configuration:
life branch of https://github.com/jeremyg-lunarg/webgpu-electron
With npm and node.js installed:
npm install
npm run start
Additional information (optional):