Closed antiagainst closed 3 months ago
Okay, all tests are passing, so this is ready for review now.
This appears to have broken some benchmark builds: https://github.com/iree-org/iree/actions/runs/9572945480/job/26393626504#step:7:706
```
FAILED: e2e_test_artifacts/iree_module_MobileBertSquad_int8_tflite___arm-valhall-vulkan_android31-vulkan_spirv__default-flags_compile-stats_/module.vmfb /work/build-e2e-test-artifacts/e2e_test_artifacts/iree_module_MobileBertSquad_int8_tflite___arm-valhall-vulkan_android31-vulkan_spirv__default-flags_compile-stats_/module.vmfb
cd /work/build-e2e-test-artifacts/tests/e2e/test_artifacts && /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=vulkan-spirv --iree-input-type=tosa --iree-vulkan-target-triple=valhall-unknown-android31 --iree-vm-emit-polyglot-zip=true --iree-llvmcpu-debug-symbols=false --iree-scheduling-dump-statistics-format=json --iree-scheduling-dump-statistics-file=/work/build-e2e-test-artifacts/e2e_test_artifacts/iree_module_MobileBertSquad_int8_tflite___arm-valhall-vulkan_android31-vulkan_spirv__default-flags_compile-stats_/scheduling_stats.json /work/build-e2e-test-artifacts/e2e_test_artifacts/iree_MobileBertSquad_int8_tflite_.mlir -o /work/build-e2e-test-artifacts/e2e_test_artifacts/iree_module_MobileBertSquad_int8_tflite___arm-valhall-vulkan_android31-vulkan_spirv__default-flags_compile-stats_/module.vmfb --iree-hal-executable-object-search-path=\"/work/build-e2e-test-artifacts\"
failed to translate executables
failed to translate executables
failed to translate executables
failed to translate executables
failed to translate executables
failed to translate executables
failed to translate executables
failed to translate executables
failed to translate executables
<unknown>:0: error: loc(callsite("model/bert_span_labeler/mobile_bert_encoder/mobile_bert_embedding/embedding_norm/add" at "main")): 'spirv.VectorShuffle' op result #0 must be vector of bool or 8/16/32/64-bit integer or 16/32/64-bit float values of length 2/3/4/8/16, but got 'i32'
<unknown>:0: note: loc("main"): called from
<unknown>:0: error: loc(callsite("model/bert_span_labeler/mobile_bert_encoder/mobile_bert_embedding/embedding_norm/add" at "main")): failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan-spirv", "vulkan-spirv-fb", {iree.gpu.target = #iree_gpu.target<arch = "valhall1", features = "spirv:v1.4,cap:Shader", wgp = <compute = fp32|fp16|int64|int32|int16|int8, storage = b64|b32|b16|b8, subgroup = shuffle|arithmetic, dot = dp4xi8toi32, mma = [], subgroup_size_choices = [16, 0], max_workgroup_sizes = [512, 512, 512], max_thread_count_per_workgroup = 512, max_workgroup_memory_bytes = 32768>>, spirv.target_env = #spirv.target_env<#spirv.vce<v1.4, [Shader, Float16, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StoragePushConstant16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, StoragePushConstant8, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformArithmetic, DotProduct, DotProductInput4x8BitPacked, DotProductInputAll, DotProductInput4x8Bit], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_integer_dot_product]>, #spirv.resource_limits<max_compute_shared_memory_size = 32768, max_compute_workgroup_invocations = 512, max_compute_workgroup_size = [512 : i32, 512 : i32, 512 : i32], subgroup_size = 16, min_subgroup_size = 0, max_subgroup_size = 16, cooperative_matrix_properties_khr = []>>}>
```
This commit switches the SPIR-V side to use the common `#iree_gpu.target` attribute to describe GPU characteristics. With it we can now remove the ad-hoc Vulkan attributes and dialects, and unify how GPUs are described across the various GPU compiler backends in IREE.
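As a sketch, such a target attribute looks roughly like the one in the error output above (the `ext:` entry is illustrative and added here only to show the extension prefix; the exact printed syntax may differ):

```mlir
// Sketch of an #iree_gpu.target for an ARM Valhall GPU.
// The `features` field carries the SPIR-V version ("spirv:"),
// capabilities ("cap:"), and extensions ("ext:").
#iree_gpu.target<
  arch = "valhall1",
  features = "spirv:v1.4,cap:Shader,ext:SPV_KHR_16bit_storage",
  wgp = <compute = fp32|fp16|int64|int32|int16|int8,
         storage = b64|b32|b16|b8,
         subgroup = shuffle|arithmetic,
         dot = dp4xi8toi32,
         mma = [],
         subgroup_size_choices = [16, 0],
         max_workgroup_sizes = [512, 512, 512],
         max_thread_count_per_workgroup = 512,
         max_workgroup_memory_bytes = 32768>>
```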
**SPIR-V**

SPIR-V has some additional requirements that we need to account for:

- We have many vendors and APIs to handle there, so this commit adds various AMD/ARM/NVIDIA/Qualcomm targets for development purposes so that we can specify them with a shorthand.
- In order to be extensible, we leverage the `features` field in `#iree_gpu.target` to specify additional capabilities with the `cap:` prefix and extensions with the `ext:` prefix. We also use the `features` field to specify which SPIR-V version to target, in the `spirv:v1.x` format.
- Right now the `SPIRVConvertGPUTarget` pass is invoked immediately before configuration. This is to stage the changes; as a next step we need to move it to immediately before the `ConvertToSPIRV` pass.
**Vulkan**

- `--iree-vulkan-target-env` is dropped, given that we removed the whole Vulkan dialect and can no longer control compilation with a `#vk.target_env` attribute.
- The default `--iree-vulkan-target-triple` now becomes `vp_android_baseline_2022`, which is a good lowest common denominator that guarantees the generated SPIR-V is widely accepted. Unlike before, we no longer take SwiftShader into consideration for testing purposes.
- `--iree-vulkan-target-triple` should be renamed given that it's no longer a triple; that will happen later, together with other GPU backends (i.e., CUDA/HIP), to be consistent.
- In order to support cooperative matrix conversion, we added `WMMA_F16_16x16x16_F16`. For NVIDIA GPUs we are abusing it right now without considering the concrete explicit layout; that is fine given that in Vulkan cooperative matrices are opaque anyway. But this needs to be fixed if we are targeting WMMA in CUDA.
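For example, the target flags appear on the `iree-compile` command line as in the failing benchmark invocation above (input and output file names here are placeholders):

```shell
# Select a specific Vulkan GPU via the (to-be-renamed) target flag;
# omitting it falls back to the vp_android_baseline_2022 profile.
iree-compile --iree-hal-target-backends=vulkan-spirv \
  --iree-vulkan-target-triple=valhall-unknown-android31 \
  input.mlir -o module.vmfb
```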
**Metal / WebGPU**

We now construct a `#iree_gpu.target` to specify the target to drive SPIR-V CodeGen.

Progress towards https://github.com/iree-org/iree/issues/16341
ci-extra: test_nvidia_gpu,test_nvidia_a100,test_amd_mi250, build_test_all_macos_arm64,build_and_test_android