Open monorimet opened 1 year ago
Attaching the vulkaninfo for this machine (I left out the report for the RTX 4090 so it's less confusing) vulkaninfo.txt
This is not the HAL allocator using the wrong memory type but the buffer having the wrong declared usage. For what purpose are you allocating a 26GB buffer?
Dispatch storage iiuc. This is loading in dispatches for the Vicuña model.
The only buffers we should be allocating host local/device visible are staging buffers and the only ones host visible/device local are external buffers (results from invocations, today) - all others (constants, variables, and transient memory) should be device local only. You can compile to the stream dialect and see if there's anything that stands out with your resource (iree-compile --compile-to=stream). If you can share a dump of that here I can see if I can spot anything obvious.
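For reference, the placement rule described in this comment can be written down as a small lookup. This is only a sketch: the MemoryType enum, the EXPECTED table, and is_expected_placement are hypothetical names made up for illustration, not the IREE runtime API, and the bits are treated as independent flags, which may not match how the real HAL encodes them.

```python
from enum import Flag, auto


class MemoryType(Flag):
    """Hypothetical stand-in for HAL memory type bits (not the IREE API)."""
    HOST_LOCAL = auto()
    HOST_VISIBLE = auto()
    DEVICE_LOCAL = auto()
    DEVICE_VISIBLE = auto()


# Expected placement per resource kind, following the comment above:
# staging -> host local/device visible, external -> host visible/device local,
# everything else -> device local only.
EXPECTED = {
    "staging": MemoryType.HOST_LOCAL | MemoryType.DEVICE_VISIBLE,
    "external": MemoryType.HOST_VISIBLE | MemoryType.DEVICE_LOCAL,
    "constant": MemoryType.DEVICE_LOCAL,
    "variable": MemoryType.DEVICE_LOCAL,
    "transient": MemoryType.DEVICE_LOCAL,
}


def is_expected_placement(kind: str, requested: MemoryType) -> bool:
    """Return True if the requested memory type matches the rule for this kind."""
    return EXPECTED[kind] == requested
```

Under this rule a constant buffer requested as DEVICE_VISIBLE only (what the issue reports for the 26GB allocation) would be flagged as unexpected, while the same buffer requested as DEVICE_LOCAL would pass.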
What happened?
With a device whose VRAM and memory heaps should be plenty sufficient for this allocation:
The resource allocation issue seems to be happening because the HAL allocator gets the wrong memory type (DEVICE_VISIBLE over DEVICE_LOCAL) when trying to set up the buffer allocation.
Here I show that the 26GB buffer can allocate successfully if the DEVICE_LOCAL memory type is explicitly specified:
I put a few return_on_error prints in iree/hal/drivers/vulkan/vma_allocator.cc to see which memory type is specified for the failing allocation. It is always DEVICE_VISIBLE for the W7900, in the stack below ireert.VmModule.from_flatbuffer(), where we load the .vmfb into device memory in SHARK.
Can this be explicitly changed at the Python level, or will it require refining how vma_allocator populates the memory type? Can we specify the memory type or have it inferred from a compile-time option? I tried getting the right case to happen in StreamToHAL by compiling the .vmfb with
--iree-execution-model=async-external
but didn't see an effect on the inferred memory type.
Steps to reproduce your issue
No response
What component(s) does this issue relate to?
Runtime
Version information
5b89a14
Additional context
No response