Open immortalsalomon opened 8 months ago
Thanks very much for the bug report! I believe you've found the root cause for this lingering issue that triggers this validation error that I hadn't been able to reproduce very easily. I'll take a look and investigate!
Truly happy to help. You have done an extraordinary job :).
I found another trigger that generates the same validation error in Vulkan. The cause is the value of actual_alignment and the value of properties.nearest_multiple. I am testing my code on two graphics cards. One is an NVidia where the actual_alignment is 16 Bytes and the other is an intel integrated graphics card where the actual_alignment is 64 Bytes. With the former the error is not thrown but with the latter it is. This is because vulkan's vkGetBufferMemoryRequirements() method returns the size value which is a near multiple of 64 and not 32.
1° => 63797472 = conform_size(0, 63797448, 64, 32); 2° => 63797504 = conform_size(0, 63797448, 64, 64);
I think there are two possible solutions. The quickest is to change the value of nearest_multiple from 32 to 64 Bytes. The other would be to obtain the actual_alignment during initialisation and adjust the value of properties.nearest_multiple.
Was closing this intentional? Seems like it's still active from the comments so far.
Ahh, thanks so much for the detailed report! I've spent some time working on a fix and adding more test coverage. For the device specific alignment and vkGetBufferMemoryRequirements, I agree we should be querying the value during initialization. I'll add that as well.
@immortalsalomon I created a new pull request #8087 which should resolve these issues. Please give this a try!
@immortalsalomon I created a new pull request https://github.com/halide/Halide/pull/8087 which should resolve these issues. Please give this a try!
I tried the fixes and they work for the NVidia graphics card (16bytes alignment). For the intel graphics card (64bytes alignment) the fixes created another problem. below is the log:
Hmm, strange! I'll try and set thing up on my laptop with an integrated Intel GPU and see if I can diagnose what's going on. Which graphics card are you using?
Hmm, strange! I'll try and set thing up on my laptop with an integrated Intel GPU and see if I can diagnose what's going on. Which graphics card are you using?
Sorry I was in a bit of a hurry yesterday. Unfortunately I don't have time at the moment to check the problem myself. I found a temporary solution by setting the value of the nearest_multiple parameter to 64Bytes.
The specifications of my computer: CPU: 12th Gen Intel(R) Core(TM) i7-12850HX RAM: 32GB Integrated GPU: Intel(R) UHD Graphics Dedicated GPU: NVIDIA RTX A2000 8GB Laptop GPU
Thank you for investigating. :)
@immortalsalomon Thanks for your patience. Please give https://github.com/halide/Halide/pull/8130 a try when you have a chance!
Hi all, I think I have found a bug in the region_allocator used by Vulkan.
In short, the bug is created by a call to the can_split(const BlockRegion *block_region, size_t size) method which, instead of checking the size returned by the conform_size(size_t offset, size_t size, size_t alignment, size_t nearest_multiple) method, uses the size set in the MemoryRequest. This can cause a problem if the free region returned by the method find_block_region has the same size as the value returned by the conform_size method. Since the request->size value is used this case is not recognised and leads to the creation of a memory region with a theoretical size of zero. When a new memory region is created, the conform_size method is used to determine its size. The method will not return 0 but the nearest multiple of the properties.nearest_multiple value starting from the actual_alignment value passed to the method. This happens do to this check.
This creates the problem that if all regions in a block are free and the coalesce_block_regions method is called, the size of the resulting region exceeds the size of the block.
This also leads to the following error:
The fix I made is as follows. I have not tested it yet. Original:
Fixed: