Open PhaneeshB opened 1 year ago
From some info on maxComputeWorkGroupCount
link
In an attempt to debug the validation error [ VUID-vkCmdDispatch-groupCountY-00387 ], I tried adding the capability maxComputeWorkGroupSize = dense<[1024, 1024, 1024]>: vector<3xi32>, to the target env string. With that change, iree-compile crashes with the following stack dump (possibly because maxComputeWorkGroupSize is not an expected entry in the target env string):
Stack dump:
0. Program arguments: ../IREE/iree-build/tools/iree-compile --iree-input-type=tm_tensor --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host "--iree-vulkan-target-env=#vk.target_env<v1.3, r(120), [VK_KHR_16bit_storage, VK_KHR_8bit_storage, VK_KHR_shader_float16_int8, VK_KHR_spirv_1_4, VK_KHR_storage_buffer_storage_class, VK_KHR_variable_pointers, VK_EXT_subgroup_size_control, VK_NV_cooperative_matrix], NVIDIA:DiscreteGPU, #vk.caps< maxComputeSharedMemorySize = 49152, maxComputeWorkGroupInvocations = 1024, maxComputeWorkGroupSize = dense<[1024, 1024, 1024]>: vector<3xi32>, maxComputeWorkGroupCount = dense<[2147483647, 65535, 65535]>: vector<3xi32>, subgroupSize = 32, subgroupFeatures = 255: i32, minSubgroupSize = 32, maxSubgroupSize = 32, shaderFloat16 = unit, shaderFloat64 = unit, shaderInt8 = unit, shaderInt16 = unit, shaderInt64 = unit, storageBuffer16BitAccess = unit, storagePushConstant16 = unit, uniformAndStorageBuffer16BitAccess = unit, storageBuffer8BitAccess = unit, storagePushConstant8 = unit, uniformAndStorageBuffer8BitAccess = unit, variablePointers = unit, variablePointersStorageBuffer = unit, cooperativeMatrixPropertiesNV = [#vk.coop_matrix_props<mSize = 8, nSize = 8, kSize = 32, aType = i8, bType = i8, cType = i32, resultType = i32, scope = #vk.scope<Subgroup>>, #vk.coop_matrix_props<mSize = 16, nSize = 16, kSize = 16, aType = f16, bType = f16, cType = f16, resultType = f16, scope = #vk.scope<Subgroup>>, #vk.coop_matrix_props<mSize = 16, nSize = 16, kSize = 16, aType = f16, bType = f16, cType = f32, resultType = f32, scope = #vk.scope<Subgroup>>], shaderIntegerDotProduct = unit >>" --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-vm-bytecode-module-strip-source-map=true --iree-util-zero-fill-elided-attrs --iree-vm-target-truncate-unsupported-floats 
--iree-codegen-check-ir-before-llvm-conversion=false --iree-opt-const-expr-hoisting=False --iree-flow-dump-dispatch-graph=1 --iree-codegen-linalg-max-constant-fold-elements=9223372036854775807 /home/phaneesh/SHARK/llama2_7b_int8_new_cutfv_upgrade_sub.mlir -o ./vmfbs/cut_at42_upsub_maxworkgroup_llama2_7b_int8.vmfb
#0 0x00007f902ea3d6f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:602:13
#1 0x00007f902ea3bb00 llvm::sys::RunSignalHandlers() /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/lib/Support/Signals.cpp:105:18
#2 0x00007f902ea3dd8a SignalHandler(int) /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
#3 0x00007f9029642520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
#4 0x00007f9029696a7c pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x96a7c)
#5 0x00007f9029642476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
#6 0x00007f90296287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
#7 0x00007f902962871b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
#8 0x00007f9029639e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#9 0x00007f902ea84cb4 mlir::NamedAttribute::NamedAttribute(mlir::StringAttr, mlir::Attribute) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/lib/IR/Attributes.cpp:46:3
#10 0x00007f902fbd355c mlir::NamedAttribute& llvm::SmallVectorImpl<mlir::NamedAttribute>::emplace_back<mlir::StringAttr, mlir::spirv::TargetEnvAttr&>(mlir::StringAttr&&, mlir::spirv::TargetEnvAttr&) /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:0:0
#11 0x00007f902fbd355c mlir::iree_compiler::IREE::HAL::VulkanSPIRVTargetBackend::getExecutableTarget(mlir::MLIRContext*, mlir::spirv::TargetEnvAttr) const /home/phaneesh/IREE/iree/compiler/src/iree/compiler/Dialect/HAL/Target/VulkanSPIRV/VulkanSPIRVTarget.cpp:307:17
#12 0x00007f902fbd33a8 mlir::iree_compiler::IREE::HAL::VulkanSPIRVTargetBackend::getExecutableTargets(mlir::MLIRContext*) const /home/phaneesh/IREE/iree/compiler/src/iree/compiler/Dialect/HAL/Target/VulkanSPIRV/VulkanSPIRVTarget.cpp:295:27
#13 0x00007f902fbd240f llvm::SmallVectorBase<unsigned int>::size() const /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32
#14 0x00007f902fbd240f mlir::NamedAttribute& llvm::SmallVectorImpl<mlir::NamedAttribute>::emplace_back<mlir::StringAttr, mlir::ArrayAttr>(mlir::StringAttr&&, mlir::ArrayAttr&&) /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:942:9
#15 0x00007f902fbd240f mlir::iree_compiler::IREE::HAL::VulkanSPIRVTargetBackend::getDefaultDeviceTarget(mlir::MLIRContext*) const /home/phaneesh/IREE/iree/compiler/src/iree/compiler/Dialect/HAL/Target/VulkanSPIRV/VulkanSPIRVTarget.cpp:110:17
#16 0x00007f902f9e5a2d mlir::iree_compiler::IREE::HAL::AssignTargetDevicesPass::runOnOperation() /home/phaneesh/IREE/iree/compiler/src/iree/compiler/Dialect/HAL/Transforms/AssignTargetDevices.cpp:101:26
#17 0x00007f902ebc95a5 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:17
#18 0x00007f902ebc95a5 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#19 0x00007f902ebc95a5 llvm::function_ref<void ()>::operator()() const /home/phaneesh/IREE/iree/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#20 0x00007f902ebc95a5 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#21 0x00007f902ebc95a5 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:479:21
#22 0x00007f902ebc9d28 mlir::LogicalResult::failed() const /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#23 0x00007f902ebc9d28 mlir::failed(mlir::LogicalResult) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#24 0x00007f902ebc9d28 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:551:9
#25 0x00007f902ebcc09b mlir::PassManager::run(mlir::Operation*) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:0
#26 0x00007f902e999d12 mlir::LogicalResult::failed() const /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#27 0x00007f902e999d12 mlir::failed(mlir::LogicalResult) /home/phaneesh/IREE/iree/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#28 0x00007f902e999d12 mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /home/phaneesh/IREE/iree/compiler/src/iree/compiler/API/Internal/Embed.cpp:788:7
#29 0x00007f902e999d12 ireeCompilerInvocationPipeline /home/phaneesh/IREE/iree/compiler/src/iree/compiler/API/Internal/Embed.cpp:1216:23
#30 0x00007f902eb95131 mlir::iree_compiler::runIreecMain(int, char**)::$_4::operator()(iree_compiler_source_t*) const /home/phaneesh/IREE/iree/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:215:11
#31 0x00007f902eb94af1 mlir::iree_compiler::runIreecMain(int, char**) /home/phaneesh/IREE/iree/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:0:10
#32 0x00007f9029629d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#33 0x00007f9029629e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#34 0x0000558f3cd756c5 _start (../IREE/iree-build/tools/iree-compile+0x16c5)
This is an issue with our distribution using tensor dimensions without any indirection. There are some things we can do if this is a fundamental limitation (for example, if the workgroup count x/y/z can't cover all of the workgroups we need), but I suspect this is the issue we've had before, where we're just taking dim 1 and shoving it into workgroup count y.
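For illustration, here is a rough sketch of what that kind of indirection could look like. It is not IREE's actual distribution logic; the helper names (exceeds_limits, refold_workgroup_count, delinearize) are made up for this example, and the limits are simply the maxComputeWorkGroupCount values from the target env string above. The idea: VUID-vkCmdDispatch-groupCountY-00387 fires when groupCountY exceeds maxComputeWorkGroupCount[1], so instead of mapping a tensor dimension straight onto y, the grid is linearized, refolded to fit the limits, and the logical IDs are recovered inside the kernel.

```python
import math

# Limits copied from the target env string in this issue
# (maxComputeWorkGroupCount); other devices report different values.
MAX_WORKGROUP_COUNT = (2147483647, 65535, 65535)


def exceeds_limits(count, limits=MAX_WORKGROUP_COUNT):
    """Return the indices of dimensions whose count exceeds the device limit."""
    return [i for i, (c, m) in enumerate(zip(count, limits)) if c > m]


def refold_workgroup_count(count, limits=MAX_WORKGROUP_COUNT):
    """Hypothetical helper: linearize the grid, then split it so each dim fits.

    The refolded grid may contain a few extra workgroups, so a kernel using
    this scheme would need to bounds-check its linear ID against the true total.
    """
    total = math.prod(count)
    x = min(total, limits[0])
    rest = -(-total // x)  # ceiling division
    y = min(rest, limits[1])
    z = -(-rest // y)
    if z > limits[2]:
        raise ValueError("grid does not fit in the device limits")
    return (x, y, z)


def delinearize(linear_id, count):
    """Recover the logical (x, y, z) workgroup ID from a linear ID."""
    x = linear_id % count[0]
    y = (linear_id // count[0]) % count[1]
    z = linear_id // (count[0] * count[1])
    return (x, y, z)


logical = (32, 100000, 1)  # dim 1 placed directly in y: too large for y
bad_dims = exceeds_limits(logical)
dispatched = refold_workgroup_count(logical)
recovered = delinearize(32 * 5 + 7, logical)
```

With the limits above, the logical grid (32, 100000, 1) violates only the y limit, and refolding moves all of it into x, which has far more headroom on this device. Whether this is the right fix here depends on how the dispatch region was formed, which this sketch does not model.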
What happened?
Context: this is a minimal repro from the compilation of the llama2 model (7B params, int8) on Vulkan on an Nvidia A100 40G GPU (Ubuntu 22.04). Compilation proceeds to creation of the vmfb without any errors, but the result of executing the vmfb is zeros and NaNs. This repro is the first occurrence of zeros in the MLIR.
The function:
The result in %42 is what comes out to be zeros. I've validated that %2150 has non-zero values and %2151 != %in_1652.
On compiling with a debug build, some Vulkan validation errors are shown as follows (although compilation to vmfb is successful):
llama2-debug-a100-vulkaninfo.txt
Steps to reproduce your issue
mlir
compile command :
run command:
What component(s) does this issue relate to?
No response
Version information
No response
Additional context
No response