Closed hakolao closed 2 years ago
cc @gurchetansingh, you might know about this having worked on the PR
Can you simply your test app further?
DeviceMemoryBuilder::new(device.clone(), memory_type.id(), size).export_info(handle_type).build()?;
could also exhibit the problem
I'll try to get a moment to debug further soon!
Thank you @hakolao! This is mostly like pointer thrashing.
gdb --args ./your_app
break ptr_chain_iter
could be chance to improve ur gdb skills.
Findings with clion's debugger (played with gdb too, but clion's UI won)
I think that as we reinterpret the BaseStructure
inside push_next
wrong. Its data was sometimes wrong (E.g. size of memory in place of s_type
and also it would be safer to #[repr(C)]
the BaseOutStructure
.
Also, it would be nice to copy paste the comments from ash when shamelessly
copying internal functions from there :D.
After a while of attempts the PR below seemed to work.
Our exported & cuda imported memory also worked with this fix :)
hey all - I'm running into a segfault that I believe to be related to this issue. I've tried my best to whittle a reproduction down to the minimal amount of code here.
I am running this against the latest master by using cargo's [patch.crates-io]
override feature, using a local copy of this repo that is up to date with master, including the change from 7db41e5
.
the last bit of output before the segfault:
VUID-VkMemoryAllocateInfo-pNext-pNext(ERROR / SPEC): msgNum: -633955574 - Validation Error: [ VUID-VkMemoryAllocateInfo-pNext-pNext ] Object 0: handle = 0x7fcdac848418, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0xda369b0a | vkAllocateMemory: pAllocateInfo->pNext chain includes a structure with unknown VkStructureType (-1409041408); Allowed structures are [VkDedicatedAllocationMemoryAllocateInfoNV, VkExportMemoryAllocateInfo, VkExportMemoryAllocateInfoNV, VkExportMemoryWin32HandleInfoKHR, VkExportMemoryWin32HandleInfoNV, VkImportAndroidHardwareBufferInfoANDROID, VkImportMemoryFdInfoKHR, VkImportMemoryHostPointerInfoEXT, VkImportMemoryWin32HandleInfoKHR, VkImportMemoryWin32HandleInfoNV, VkMemoryAllocateFlagsInfo, VkMemoryDedicatedAllocateInfo, VkMemoryOpaqueCaptureAddressAllocateInfo, VkMemoryPriorityAllocateInfoEXT]. This error is based on the Valid Usage documentation for version 170 of the Vulkan header. It is possible that you are using a struct from a private extension or an extension that was added to a later version of the Vulkan header, in which case the use of pAllocateInfo->pNext is undefined and may not work correctly with validation enabled The Vulkan spec states: Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkDedicatedAllocationMemoryAllocateInfoNV, VkExportMemoryAllocateInfo, VkExportMemoryAllocateInfoNV, VkExportMemoryWin32HandleInfoKHR, VkExportMemoryWin32HandleInfoNV, VkImportAndroidHardwareBufferInfoANDROID, VkImportMemoryFdInfoKHR, VkImportMemoryHostPointerInfoEXT, VkImportMemoryWin32HandleInfoKHR, VkImportMemoryWin32HandleInfoNV, VkMemoryAllocateFlagsInfo, VkMemoryDedicatedAllocateInfo, VkMemoryOpaqueCaptureAddressAllocateInfo, or VkMemoryPriorityAllocateInfoEXT (https://vulkan.lunarg.com/doc/view/1.2.170.0/mac/1.2-extensions/vkspec.html#VUID-VkMemoryAllocateInfo-pNext-pNext)
Objects: 1
[0] 0x7fcdac848418, type: 3, name: NULL
zsh: segmentation fault cargo run --release
the only thing jumps out to me is the part about the sType
of some pNext
pointer:
vkAllocateMemory: pAllocateInfo->pNext chain includes a structure with unknown VkStructureType (-1409041408);
I do not have much experience debugging memory issues (including using gdb
) yet, so I'm not sure what to look for. I also was pretty stumped when the teapot
example on my local clone of this repo runs fine, since it is also using AttachmentImage::transient
, which appears to be the call that causes the segfault. All this is to say that I don't have much confidence there even is an issue here, or if this is somehow isolated to my machine. I believe that cloning the repo linked above from another machine would help answer that quickly. I can see if I can find one to use in the meantime.
os | macos v10.15.7 |
cargo --version |
cargo 1.51.0 (43b129a20 2021-03-16) |
If there is anything I can do to help or more information I can provide, I would be happy to help. I really appreciate all the work going into this project.
Did you try top of tree with the fix?
I could've sworn I was. I've changed the override to use a git
resolution:
[patch.crates-io]
vulkano = { git = 'https://github.com/vulkano-rs/vulkano', rev = 'fb187e888469ac2255855ffa8b7022258a688222' }
instead of using { path = '...' }
. now I'm only seeing the segfault when running with --release
(cargo run --release
).
@dadleyy I tried vulkano-seg-1545
, but did not have any segmentation fault.
smithay_client_toolkit::window::concept_frame] No font could be found
Maybe it's an architecture specific issue? Also I changed this for it to compile:
[dependencies]
-vulkano = { version = "0.22" }
-vulkano-win = { version = "0.22" }
+vulkano = { path = "../vulkano/vulkano" }
+vulkano-win = { path = "../vulkano/vulkano-win" }
winit = { version = "0.24.0" }
env_logger = { version = "0.8.3" }
log = { version = "^0.4.1" }
-
-[patch.crates-io]
-vulkano = { git = 'https://github.com/vulkano-rs/vulkano', rev = 'fb187e888469ac2255855ffa8b7022258a688222' }
@gurchetansingh what architecture/environment are you running on? I was only experiencing this when running --release
builds, did you try that?
smithay_client_toolkit::window::concept_frame] No font could be found
I'm confused - did you mean to include this in your comment? was that output from running vulkano-seg-1545?
I'm still experiencing this issue, but I have been slowly poking away at what I might think is the issue. I just finished implementing some changes that appear to fix this issue on this branch of my fork, though I'm not 100% certain why it would be a solution.
I'm thinking this has to do with the memory pointed to when using the push_next
+ ptr_chain_iter
being invalidated somehow - which explains why the sType
ends up as garbage like -1409041408
: if we push_next(&mut T)
and then the location of T
changes at some point, the pNext
pointer on our MemoryAllocateInfo
would be invalid right?
I'm definitely still learning rust so this is all just me spitballing.
I also attempted this with the --release
just now, and it segfaulted :S. Looks like the fix wasn't enough indeed.
Works without --release.
A better fix attempt... #1580. Hopefully this would fix it once and for all. Rust can sometimes make ptr play so hard...
It kind of baffles me why this would work in non-release mode and fail in release mode.
I'm seeing this on head today, except with AttachmentImage::new_with_exportable_fd
-- and again, it crashes with a --release
build but not a debug build.
The above issue I think is caused by this, and separate to the original issue as reported, sorry: https://github.com/vulkano-rs/vulkano/pull/1734/files#r771974391
The dropped temporary is fixed now, so if that was the problem here, it should be fixed too. I'll reopen if it's still present.
After updating to 0.22 I get segfault when creating a
DeviceLocalBuffer::raw_with_exportable_fd
.460.32.03
main.rs
file that demonstrates the issue: TODOIssue
Segfault after #1510. Segfaulting at:
ptr_chain_iter
It seems to be related to the allocation size somehow. I don't have much clue what the changes are for in above linked PR that broke the functionality for me, but just reporting this is happening. 34mb size seems to be enough to trigger this.
I'll use the previous versions meanwhile.
main.rs: