Open AntarticCoder opened 1 year ago
If it's any help, consolidating some information
In Vulkan I have examples of AABB accels here, triangle geometry accels here, and instance accels here
These code snippets cover building and rebuilding trees, refitting, and compaction for each of the types, and include things like alignment for the scratch space of the acceleration structure build, etc...
I'm missing some features like storing and loading acceleration structures from memory, but that could be added if this effort could benefit from more reference material.
It appears that these types roughly translate to MTLAccelerationStructureBoundingBoxGeometryDescriptor, MTLAccelerationStructureTriangleGeometryDescriptor, and MTLInstanceAccelerationStructureDescriptor
https://developer.apple.com/documentation/metal/mtlaccelerationstructure
One place that might be a good starting point is populating the structures of data for acceleration structure features and properties. In my code I do that here
In VkPhysicalDeviceAccelerationStructureFeaturesKHR, we could probably just return true for the "accelerationStructure" field, and false for all the other fields.
In VkPhysicalDeviceAccelerationStructurePropertiesKHR, we'd need to somehow figure out various limits imposed by Metal ray tracing (max geometry count, instance and primitive count, minimum accel scratch offset alignment, etc)... I'm not sure how these are queried in Metal RT tbh
VkAccelerationStructureGeometryDataKHR and MTLAccelerationStructureTriangleGeometryDescriptor seem to be almost identical, with a few minor differences.
One I noticed is in the index type for the geometry descriptor: Vulkan allows you to pass in no indices at all, in addition to the standard uint16 and uint32, whereas Metal does not seem to have a "none" option in its index type enum.
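The index type difference could be handled with a small translation step. The following is only a sketch under stated assumptions: `VkIndexKind`, `MtlIndexKind`, and `toMetalIndexKind` are hypothetical stand-ins (not the real Vulkan or Metal enums), and it assumes the non-indexed case can be expressed on the Metal side by simply not setting an index buffer on the descriptor.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Vulkan and Metal index type enums.
enum class VkIndexKind { Uint16, Uint32, None };
enum class MtlIndexKind { UInt16, UInt32 };

// Returns true and writes `out` when the geometry is indexed.
// Returns false for the "no indices" case; the caller would then skip
// setting an index buffer on the Metal descriptor (assumption).
inline bool toMetalIndexKind(VkIndexKind kind, MtlIndexKind& out) {
    switch (kind) {
        case VkIndexKind::Uint16: out = MtlIndexKind::UInt16; return true;
        case VkIndexKind::Uint32: out = MtlIndexKind::UInt32; return true;
        case VkIndexKind::None:   return false;
    }
    return false;
}
```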
@natevm I'm not sure if I'm looking in the wrong place however this link to the metal documentation seems to tell us the max count for some of these properties in standard and extended mode, iiuc.
https://developer.apple.com/documentation/metal/mtlaccelerationstructureusage/3750490-extendedlimits
Nice find. Yep, those seem like what I had in mind.
So, we know the following,
// Provided by VK_KHR_acceleration_structure
typedef struct VkPhysicalDeviceAccelerationStructureFeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 accelerationStructure; // true
VkBool32 accelerationStructureCaptureReplay; // false (for now)
VkBool32 accelerationStructureIndirectBuild; // false (for now)
VkBool32 accelerationStructureHostCommands; // false (for now)
VkBool32 descriptorBindingAccelerationStructureUpdateAfterBind; // false (for now)
} VkPhysicalDeviceAccelerationStructureFeaturesKHR;
// Provided by VK_KHR_acceleration_structure
typedef struct VkPhysicalDeviceAccelerationStructurePropertiesKHR {
VkStructureType sType;
void* pNext;
uint64_t maxGeometryCount; // "Geometries in primitive acceleration structure", (2^24 / 2^30)
uint64_t maxInstanceCount; // "Instances in instance acceleration structure", (2^24 / 2^30)
uint64_t maxPrimitiveCount; // "Primitives in primitive acceleration structure", (2^28 / 2^30)
uint32_t maxPerStageDescriptorAccelerationStructures; // ???
uint32_t maxPerStageDescriptorUpdateAfterBindAccelerationStructures; // ???
uint32_t maxDescriptorSetAccelerationStructures; // ???
uint32_t maxDescriptorSetUpdateAfterBindAccelerationStructures; // ???
uint32_t minAccelerationStructureScratchOffsetAlignment; // ???
} VkPhysicalDeviceAccelerationStructurePropertiesKHR;
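As a rough sketch of how the known values could be filled in, the snippet below uses hypothetical stand-in structs (`AccelFeatures`, `AccelCountLimits`) rather than the real Vulkan types so that it compiles on its own. The numbers are the standard / extended limits quoted above. The fields marked ??? are deliberately left out because nothing in this thread pins them down.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the feature booleans discussed above.
struct AccelFeatures {
    bool accelerationStructure;
    bool captureReplay;
    bool indirectBuild;
    bool hostCommands;
    bool updateAfterBind;
};

// Hypothetical stand-in for the three count limits.
struct AccelCountLimits {
    uint64_t maxGeometryCount;
    uint64_t maxInstanceCount;
    uint64_t maxPrimitiveCount;
};

// Only the core feature is reported for now.
inline AccelFeatures initialAccelFeatures() {
    return { true, false, false, false, false };
}

// Standard vs. extended limits as listed in the comments above.
inline AccelCountLimits metalAccelCountLimits(bool extendedLimits) {
    const uint64_t one = 1;
    if (extendedLimits) {
        return { one << 30, one << 30, one << 30 };
    }
    return { one << 24, one << 24, one << 28 };
}
```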
Here there is a mention of an alignment derived from "the platform's buffer offset alignment". What I don't fully understand is how Metal handles the idea of "scratch" memory for acceleration structure builds.
@rcaridade145 Thanks, I think MTLAccelerationStructureSizes.accelerationStructureSize could be used for the vkGetAccelerationStructureBuildSizesKHR function, which provides the expected acceleration structure size.
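A minimal sketch of that translation follows. `MetalSizes` and `VulkanBuildSizes` are hypothetical stand-ins that mirror the field names of MTLAccelerationStructureSizes and VkAccelerationStructureBuildSizesInfoKHR so the snippet is self-contained. Mapping Metal's refit scratch size onto Vulkan's update scratch size is an assumption, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in mirroring the fields of MTLAccelerationStructureSizes.
struct MetalSizes {
    uint64_t accelerationStructureSize;
    uint64_t buildScratchBufferSize;
    uint64_t refitScratchBufferSize;
};

// Stand-in mirroring the size fields of
// VkAccelerationStructureBuildSizesInfoKHR.
struct VulkanBuildSizes {
    uint64_t accelerationStructureSize;
    uint64_t updateScratchSize;
    uint64_t buildScratchSize;
};

inline VulkanBuildSizes toVulkanBuildSizes(const MetalSizes& m) {
    VulkanBuildSizes out{};
    out.accelerationStructureSize = m.accelerationStructureSize;
    out.buildScratchSize = m.buildScratchBufferSize;
    // Assumption: a Metal "refit" plays the role of a Vulkan "update".
    out.updateScratchSize = m.refitScratchBufferSize;
    return out;
}
```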
@natevm https://developer.apple.com/documentation/metal/mtlaccelerationstructuresizes/3553967-accelerationstructuresize and https://developer.apple.com/videos/play/wwdc2023/10128/?time=564 are of interest to you?
ah yeah, the "buildScratchBufferSize" in that first link was one of the things I was wondering about. Still not sure what the "minAccelerationStructureScratchOffsetAlignment" should be for that buffer, @rcaridade145 do you know what minimum offset alignment rules there might be?
Just a small note about Metal BLAS: Documentation about MTL::AccelerationStructureTriangleGeometryDescriptor::setIndexBufferOffset says:
Specify an offset that is a multiple of the index data type size and a multiple of the platform’s buffer offset alignment.
In the feature table https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf, buffer offset alignment ranges from 4 bytes to 32 bytes (Mac2).
In Vulkan, primitiveOffset must be a multiple of the component size.
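To make the mismatch concrete, here is a small sketch of the two checks. The helper names are hypothetical, and the platform buffer offset alignment is passed in as a parameter because how to obtain it is an open question at this point in the thread.

```cpp
#include <cassert>
#include <cstdint>

inline bool isPowerOfTwo(uint64_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Rounds offset up to the next multiple of a power-of-two alignment.
inline uint64_t alignUp(uint64_t offset, uint64_t alignment) {
    assert(isPowerOfTwo(alignment));
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Metal rule quoted above: the offset must be a multiple of both the
// index size and the platform's buffer offset alignment.
inline bool isValidMetalIndexBufferOffset(uint64_t offset,
                                          uint64_t indexSize,
                                          uint64_t bufferOffsetAlignment) {
    return offset % indexSize == 0 && offset % bufferOffsetAlignment == 0;
}
```

For example, an offset of 8 with 4-byte indices satisfies the Vulkan rule but fails the Metal rule when the platform alignment is 32, so such offsets would need special handling.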
Do you know if there is a programmatic way to query this buffer offset alignment?
Oh, I wish I knew, but there doesn't seem to be one.
Not really. All the info I could find was:
https://github.com/MetalKit/metal/blob/master/raytracing/Renderer.swift
It seems to use alignedUniformsSize.
https://gist.github.com/ctreffs/1cf72cd0d5e23d77fe55a011ea01a153
Is it possible to get a scratch buffer from its device address? Looking at the Metal API documentation, there's basically nothing on device addresses, except for a single property on the MTLBuffer. I know NVIDIA used to pass in a VkBuffer directly, but now we have to use device addresses.
Will this help @AntarticCoder https://developer.apple.com/documentation/metal/mtlbuffer/1515716-contents ?
I believe I saw this during my research, but I probably didn't read the docs properly. I'll try it out later. Thanks @rcaridade145
The problem here is that, afaik, the scratch buffer is handled by Metal itself, so perhaps you can only use the contents function with a custom buffer?
@AntarticCoder @rcaridade145 The contents function will just give you a CPU pointer to the data of a shared buffer. That's not useful here unless you want to copy all the data around on the CPU every time (which would also involve a GPU sync).
What you have to do is basically maintain a map that maps BDA VAs to their original buffer objects. Keep in mind that this VA map has to be extremely fast and should minimize locking as much as possible. An example for that can be found in vkd3d-Proton: https://github.com/HansKristian-Work/vkd3d-proton/blob/master/libs/vkd3d/va_map.c
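The idea of such a map can be sketched as follows. This is a deliberately simple illustration and not how vkd3d-proton does it: it uses an ordered `std::map` with no locking at all, whereas a real implementation would need the fast, low-contention structure described above. `DeviceAddressMap` and the integer `bufferId` (standing in for a buffer object) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct BufferRange {
    uint64_t size;
    int bufferId;   // stand-in for a pointer to the owning buffer object
};

class DeviceAddressMap {
public:
    void insert(uint64_t base, uint64_t size, int bufferId) {
        ranges_[base] = BufferRange{ size, bufferId };
    }

    void erase(uint64_t base) { ranges_.erase(base); }

    // Finds the buffer whose [base, base + size) range contains address
    // and reports the offset of the address within that buffer.
    bool lookup(uint64_t address, int& bufferId, uint64_t& offset) const {
        auto it = ranges_.upper_bound(address);  // first base > address
        if (it == ranges_.begin()) return false;
        --it;                                    // last base <= address
        if (address - it->first >= it->second.size) return false;
        bufferId = it->second.bufferId;
        offset = address - it->first;
        return true;
    }

private:
    std::map<uint64_t, BufferRange> ranges_;  // keyed by base address
};
```

A scratch address passed to a build command would then be resolved to its buffer plus an offset, which is the form the Metal side needs.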
@K0bin This looks quite interesting, I'll see if I can get an efficient map working later.
do you know if there is a programmatic way to query this buffer offset alignment?
Check MVKPhysicalDeviceMetalFeatures::mtlBufferAlignment.
With iPhone 15 now having native hardware ray tracing support, I am guessing M3 is soon to follow suit. @AntarticCoder what's the status on this PR? Any blocking issues we should know about?
@natevm The only real blocking issue is how acceleration structures are handled in GPU memory, because we have copy commands and non-command copies. The solution seems to be MTLHeaps, according to a commenter on the PR. As for the status, I've been a bit busy with personal matters, but I've definitely wanted to get back into this. I could probably continue working next week. Thanks
@AntarticCoder totally understand. I’ll check out the MTLHeaps proposal on the PR.
I don’t suppose you have a discord where we could stay in touch, do you? Over there my username’s @natemorrical. We have a little Vulkan raytracing research group there that acts a bit like a slack space. If not, no worries, but figured I’d ask just in case :)
@natevm I just sent a friend request. My username is Noble 6 the Penguin. 😀
At the moment, MoltenVK does not support raytracing (#427), and to support VK_KHR_ray_tracing_pipeline and VK_KHR_ray_query, we need to implement acceleration structures. PR #1954 (issue #1953) implemented VK_KHR_deferred_host_operations, which finishes off the dependencies for VK_KHR_acceleration_structure. The only thing left to do is to actually implement it. This issue will provide a place to discuss the design decisions for acceleration structures. I'm also planning on trying to implement this myself.