Open Abhishek-Varma opened 7 months ago
Issue A is because you are passing the input wrong: instead of --input=2.0, it should be --input=f16=2.0.
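For reference, a corrected invocation might look like the sketch below (the module path and entry function are hypothetical; the point is the typed f16= prefix on --input):

```python
import subprocess

# Hypothetical invocation: --input must carry an explicit element type
# ("f16=2.0"), not a bare scalar ("2.0").
subprocess.run([
    "iree-run-module",
    "--device=rocm",
    "--module=unet.vmfb",    # hypothetical module path
    "--function=forward",    # hypothetical entry function
    "--input=f16=2.0",
], check=True)
```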
Hi @nithinsubbiah, any updates on this issue? I'm also getting the same error.
Found ROCm device arch : gfx90a
Saved vmfb in /app/phaneesh/SHARK/falcon_180b_layer_0_20_int4_rocm.vmfb.
Saved falcon vmfb at /app/phaneesh/SHARK/falcon_180b_layer_0_20_int4_rocm.vmfb
Loading module /app/phaneesh/SHARK/falcon_180b_layer_0_20_int4_rocm.vmfb...
Traceback (most recent call last):
  File "/app/phaneesh/SHARK/apps/language_models/src/pipelines/falcon_pipeline.py", line 1097, in <module>
    falcon = ShardedFalcon(
             ^^^^^^^^^^^^^^
  File "/app/phaneesh/SHARK/apps/language_models/src/pipelines/falcon_pipeline.py", line 145, in __init__
    self.shark_model = self.compile()
                       ^^^^^^^^^^^^^^
  File "/app/phaneesh/SHARK/apps/language_models/src/pipelines/falcon_pipeline.py", line 386, in compile
    shark_module, device_idx = self.compile_layer(
                               ^^^^^^^^^^^^^^^^^^^
  File "/app/phaneesh/SHARK/apps/language_models/src/pipelines/falcon_pipeline.py", line 295, in compile_layer
    shark_module.load_module(path)
  File "/app/phaneesh/SHARK/shark/shark_inference.py", line 232, in load_module
    params = load_flatbuffer(
             ^^^^^^^^^^^^^^^^
  File "/app/phaneesh/SHARK/shark/iree_utils/compile_utils.py", line 519, in load_flatbuffer
    vmfb, config, temp_file_to_unlink = load_vmfb_using_mmap(
                                        ^^^^^^^^^^^^^^^^^^^^^
  File "/app/phaneesh/SHARK/shark/iree_utils/compile_utils.py", line 450, in load_vmfb_using_mmap
    ctx.add_vm_module(mmaped_vmfb)
  File "/app/phaneesh/SHARK/shark.venv/lib/python3.11/site-packages/iree/runtime/system_api.py", line 271, in add_vm_module
    self.add_vm_modules((vm_module,))
  File "/app/phaneesh/SHARK/shark.venv/lib/python3.11/site-packages/iree/runtime/system_api.py", line 268, in add_vm_modules
    self._vm_context.register_modules(vm_modules)
RuntimeError: Error registering modules: c/experimental/rocm/status_util.c:31: INTERNAL; rocm driver error 'hipErrorOutOfMemory' (2): out of memory; failed to allocate buffer of length 4025827328; while invoking native function hal.allocator.allocate; while calling import;
[ 1] native hal.allocator.allocate:0 -
[ 0] bytecode module@1:4294 -
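For context, the SHARK load path in this traceback boils down to an mmap-based module load followed by registration. A rough sketch using the public iree.runtime API (the driver name and vmfb path are taken from the log above; everything else is an assumption):

```python
from iree import runtime as ireert

# Sketch of the load path in the traceback: mmap the vmfb, then register it
# with a ROCm-backed context. Registering runs the module's initializer,
# which allocates device buffers -- that is where hipErrorOutOfMemory surfaces.
config = ireert.Config("rocm")
ctx = ireert.SystemContext(config=config)
vm_module = ireert.VmModule.mmap(
    config.vm_instance,
    "/app/phaneesh/SHARK/falcon_180b_layer_0_20_int4_rocm.vmfb",
)
ctx.add_vm_module(vm_module)  # fails here if the device cannot fit the buffers
```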
Hi Vivek, I'm still working on this. The error seems to be because the device is actually running out of memory, rather than a ROCm HAL driver error, which would mean we need to shard/quantize the model. I'll investigate further and let you know.
Could you please share the IR that fails?
Yeah, you're correct. The error was actually because of OOM. Some other processes were running that I didn't know of. Anyway, thanks!
@Abhishek-Varma I am able to run SDXL on 6010 (gfx90) successfully. I looked at the trace and dispatches and couldn't find anything offending. It's likely that multiple processes were running at the same time, causing the OOM error. Please let me know if it works for you.
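For anyone hitting the same thing: the failed allocation above is 4025827328 bytes, roughly 3.75 GiB, so a competing process can easily push the device over the edge. A quick way to check for other VRAM consumers before loading the vmfb (assuming rocm-smi is installed) is:

```python
import subprocess

# Show total/used VRAM and the PIDs currently holding GPU memory, so a
# competing process is visible before the ~3.75 GiB allocation is attempted.
subprocess.run(["rocm-smi", "--showmeminfo", "vram"], check=True)
subprocess.run(["rocm-smi", "--showpids"], check=True)
```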
What happened?

There are 2 issues to resolve here for the input IR: unet_1_77_1024_1024_fp16_stable-diffusion-xl-base-1.mlir

Compiling on ROCM Linux, gfx90 (--iree-flow-break-dispatch=forward_dispatch323): [compilation command elided on this page]. Run using the following: [run command elided on this page].
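Since both commands are elided here, a rough sketch of what the compile step plausibly looked like (only --iree-flow-break-dispatch=forward_dispatch323 is quoted from the issue; the backend flags and the exact gfx chip are assumptions):

```python
import subprocess

# Hypothetical reconstruction of the elided compile step; the break-dispatch
# flag is quoted from the issue, the rest are common ROCm settings.
subprocess.run([
    "iree-compile",
    "unet_1_77_1024_1024_fp16_stable-diffusion-xl-base-1.mlir",
    "--iree-hal-target-backends=rocm",
    "--iree-rocm-target-chip=gfx90a",  # assumption: the exact chip is truncated above
    "--iree-flow-break-dispatch=forward_dispatch323",
    "-o", "unet.vmfb",
], check=True)
```

The run step would then use iree-run-module against the produced vmfb, along the lines of the invocation sketched near the top of the thread.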
Section A
Causes the following issue when using iree-run-module :- [error output around forward_dispatch323 elided on this page]

Section B
Causes the following issue when using a Python script to invoke the forward function :- [error output elided on this page]; otherwise NO ISSUE (same as Section A).

So, effectively, there'd be two issues that need to be resolved, as captured entirely by Section A itself.

Steps to reproduce your issue
For the Section A issues above, the compilation command and the run command given earlier would be needed.

For the Section B issue above, the following script would be required :- [script elided on this page; a sketch follows below].

In case the elided IR is needed: unet_elided.mlir
I'm also attaching dispatch 323 here: dispatch_323.mlir
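The Section B script itself is elided on this page; a minimal sketch of invoking forward from Python via the iree.runtime API (driver name, vmfb path, input shape, and dtype are all assumptions) would look something like:

```python
import numpy as np
from iree import runtime as ireert

# Hypothetical stand-in for the attached Section B script: load the compiled
# vmfb and invoke the forward entry point on the ROCm device.
config = ireert.Config("rocm")
ctx = ireert.SystemContext(config=config)
with open("unet.vmfb", "rb") as f:  # path is an assumption
    vm_module = ireert.VmModule.from_flatbuffer(config.vm_instance, f.read())
ctx.add_vm_module(vm_module)

forward = ctx.modules.module["forward"]
# Placeholder input; the real script's shapes/dtypes are elided above.
sample = np.zeros((1, 4, 128, 128), dtype=np.float16)
print(forward(sample).to_host())
```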
What component(s) does this issue relate to?
No response
Version information
I'm using the following SRT version:
Additional context
No response