nod-ai / SHARK-Studio

SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
Apache License 2.0

🚀 Request to add RX 500 Series (AMD_RGCNv3) to 👻IREE Target-Triple #532

Open averad opened 1 year ago

averad commented 1 year ago

🚀 Enhancement Request - Add RX 500 Series (AMD_RGCNv3) to 👻IREE

User Story

When attempting to run the SHARK Runtime Windows build, the following error occurs:

Optimized kernel for your target device is not added yet.
            Contact SHARK Admin on discord[https://discord.com/invite/RUqY2h2s9u]
            or pull up an issue.
Target : deviceName        = Radeon RX 580 Series

Enhancement Request

Please add the RX 500 series cards to the IREE target triples so they can be used without building IREE manually.
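To illustrate the kind of lookup being requested, here is a minimal sketch that maps a Vulkan device name to a target triple. It is not SHARK's or IREE's actual code: the table `KNOWN_TARGETS` and the function `guess_target_triple` are hypothetical names, and the `rgcn3` architecture string is taken only from the `rgcn3-unknown-windows` triple I pass on the command line further down this thread.

```python
import platform
import re

# Hypothetical table: (device-name pattern, architecture part of the triple).
# The single entry is an assumption based on the triple used in this thread.
KNOWN_TARGETS = [
    # Matches "Radeon RX 580", "Radeon RX 570", ... but not "Radeon RX 5700".
    (re.compile(r"Radeon RX 5[0-9]0\b"), "rgcn3"),
]


def guess_target_triple(device_name, os_name=None):
    """Return '<arch>-unknown-<os>' for a known device name, else None."""
    if os_name is None:
        # platform.system() returns e.g. "Windows" or "Linux".
        os_name = platform.system().lower()
    for pattern, arch in KNOWN_TARGETS:
        if pattern.search(device_name):
            return f"{arch}-unknown-{os_name}"
    return None


if __name__ == "__main__":
    print(guess_target_triple("Radeon RX 580 Series", os_name="windows"))
```

With an entry like this, the device name reported in the error above would resolve to a triple instead of falling through to the "not added yet" message. Whether `rgcn3` is the right architecture name for these cards is exactly what still needs testing.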

Vulkan Info Output:

vulkan_info_out.txt

Untested Code Change

https://github.com/iree-org/iree/compare/main...averad:iree:rx500_series-TargetTripple

I am working on testing the above code.

averad commented 1 year ago

Opened upstream Enhancement Request #11417

averad commented 1 year ago

Not sure if this is helpful, but this is the error I am getting now after compiling IREE with the new target triple included:

python .\shark\examples\shark_inference\stable_diffusion\main.py --precision="fp16" --prompt="tajmahal, snow, sunflowers, oil on canvas" --device="vulkan" --iree-vulkan-target-triple=rgcn3-unknown-windows

(shark.venv) D:\SHARK>python .\shark\examples\shark_inference\stable_diffusion\main.py --precision="fp16" --prompt="tajmahal, snow, sunflowers, oil on canvas" --device="vulkan" --iree-vulkan-target-triple=rgcn3-unknown-windows
shark_tank local cache is located at C:\Users\avera\.local/shark_tank/ . You may change this by setting the --local_tank_cache= flag
The models are present in the C:\Users\avera\.local/shark_tank/. If you want a fresh
                download, consider deleting the directory.
Copying gs://shark_tank/stable_diffusion/unet_1dec_fp16_torch/hash.npy...
/ [1/1 files][  640.0 B/  640.0 B] 100% Done
Operation completed over 1 objects/640.0 B.
No vmfb found. Compiling and saving to D:\SHARK\unet_1dec_fp16_vulkan.vmfb
Using target triple from command line args
Traceback (most recent call last):
  File "D:\SHARK\shark\examples\shark_inference\stable_diffusion\main.py", line 59, in <module>
    unet = get_unet()
  File "D:\SHARK\shark\examples\shark_inference\stable_diffusion\opt_params.py", line 40, in get_unet
    return get_shark_model(bucket, model_name, iree_flags)
  File "D:\SHARK\shark\examples\shark_inference\stable_diffusion\utils.py", line 50, in get_shark_model
    return _compile_module(shark_module, model_name, extra_args)
  File "D:\SHARK\shark\examples\shark_inference\stable_diffusion\utils.py", line 31, in _compile_module
    path = shark_module.save_module(
  File "D:\SHARK\shark\shark_inference.py", line 188, in save_module
    return export_iree_module_to_vmfb(
  File "D:\SHARK\shark\iree_utils\compile_utils.py", line 303, in export_iree_module_to_vmfb
    flatbuffer_blob = compile_module_to_flatbuffer(
  File "D:\SHARK\shark\iree_utils\compile_utils.py", line 246, in compile_module_to_flatbuffer
    flatbuffer_blob = ireec.compile_str(
  File "D:\SHARK\shark.venv\lib\site-packages\iree\compiler\tools\core.py", line 278, in compile_str
    result = invoke_immediate(cl, immediate_input=input_bytes)
  File "D:\SHARK\shark.venv\lib\site-packages\iree\compiler\tools\binaries.py", line 196, in invoke_immediate
    raise CompilerToolError(process)
iree.compiler.tools.binaries.CompilerToolError: Error invoking IREE compiler tool iree-compile.exe
Diagnostics:
xyz.str:809:11: error: failed to legalize operation 'memref.load'
xyz.str:26:3: note: called from
xyz.str:809:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:809:11: error: failed to serialize executables
xyz.str:26:3: note: called from
xyz.str:843:11: error: 'spirv.Store' op mismatch in result type and pointer type
xyz.str:26:3: note: called from
xyz.str:843:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:843:11: error: failed to serialize executables
xyz.str:26:3: note: called from
xyz.str:855:11: error: failed to legalize operation 'memref.load'
xyz.str:26:3: note: called from
xyz.str:855:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:855:11: error: failed to serialize executables
xyz.str:26:3: note: called from
xyz.str:869:11: error: failed to legalize operation 'memref.load'
xyz.str:26:3: note: called from
xyz.str:869:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:869:11: error: failed to serialize executables
xyz.str:26:3: note: called from
xyz.str:880:11: error: failed to legalize operation 'memref.load'
xyz.str:26:3: note: called from
xyz.str:880:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:880:11: error: failed to serialize executables
xyz.str:26:3: note: called from
xyz.str:881:11: error: failed to legalize operation 'memref.load'
xyz.str:26:3: note: called from
xyz.str:881:11: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:881:11: error: failed to serialize executables
xyz.str:26:3: note: called from
xyz.str:886:15: error: failed to legalize operation 'memref.load'
xyz.str:26:3: note: called from
xyz.str:886:15: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers]>, api=Vulkan, AMD:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 65536, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], subgroup_size = 64, cooperative_matrix_properties_nv = []>>}>
xyz.str:26:3: note: called from
xyz.str:886:15: error: failed to serialize executables
xyz.str:26:3: note: called from

Invoked with:
 iree-compile.exe D:\SHARK\shark.venv\lib\site-packages\iree\compiler\tools\..\_mlir_libs\iree-compile.exe - --iree-input-type=none --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --iree-llvm-embedded-linker-path=D:\SHARK\shark.venv\lib\site-packages\iree\compiler\tools\..\_mlir_libs\iree-lld.exe --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvm-target-cpu-features=host --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rgcn3-unknown-windows --iree-flow-enable-padding-linalg-ops --iree-flow-linalg-ops-padding-size=32 --iree-flow-enable-conv-img2col-transform

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.
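Since the diagnostics above repeat the same few messages many times, a small helper can reduce them to the distinct errors, and another can prepare an environment with `IREE_SAVE_TEMPS` set as the compiler suggests. This is a rough sketch, not part of SHARK or IREE: the function names `summarize_errors` and `env_with_save_temps` are made up for illustration, and the regular expression assumes diagnostics have the `file:line:col: error: message` shape seen in this log.

```python
import os
import re
from collections import Counter

# Matches lines such as:
#   xyz.str:809:11: error: failed to legalize operation 'memref.load'
DIAG_RE = re.compile(r"^(?P<loc>\S+:\d+:\d+): error: (?P<msg>.*)$")


def summarize_errors(log_text, max_len=80):
    """Count distinct error messages, truncating very long ones to max_len."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DIAG_RE.match(line.strip())
        if match:
            counts[match.group("msg")[:max_len]] += 1
    return counts


def env_with_save_temps(save_dir, base_env=None):
    """Return a copy of the environment with IREE_SAVE_TEMPS pointing at save_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["IREE_SAVE_TEMPS"] = str(save_dir)
    return env


if __name__ == "__main__":
    sample = (
        "xyz.str:809:11: error: failed to legalize operation 'memref.load'\n"
        "xyz.str:26:3: note: called from\n"
        "xyz.str:843:11: error: 'spirv.Store' op mismatch in result type and pointer type\n"
    )
    for message, count in summarize_errors(sample).items():
        print(count, message)
```

Applied to the full log above, this should collapse the output to four distinct messages: six `memref.load` legalization failures, one `spirv.Store` type mismatch, and the translation and serialization failures that follow each of them. The dictionary returned by `env_with_save_temps` can be passed as `env=` to `subprocess.run` when rerunning the command.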

markanini commented 1 year ago

May I request the RX 4xx series too? It is a closely related series.

vleeuwenmenno commented 1 year ago

Is this fixed yet? It seems to run, but with anything v3 I only get black images.