'linalg.depthwise_conv_2d_nchw_chw' op should not remain after vectorization

powderluv commented 1 year ago

What happened?

Recently we have been seeing an error like:

E         iree.compiler.tools.binaries.CompilerToolError: Error invoking IREE compiler tool iree-compile
E         Diagnostics:
E         /data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/torch/nn/modules/conv.py:459:0: error: 'linalg.depthwise_conv_2d_nchw_chw' op should not remain after vectorization
E         /data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/torch/nn/modules/conv.py:459:0: error: Failures have been detected while processing an MLIR pass pipeline
E         /data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/torch/nn/modules/conv.py:459:0: note: Pipeline failed while executing [`mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_1, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_10, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_12, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_16, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_17, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_19, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_21, 
`mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_22, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_25, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_28, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_29, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `mlir::iree_compiler::IREE::HAL::TranslateExecutablesPass` on 'hal.executable' operation: @forward_dispatch_30, `mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `Canonicalizer` on 'builtin.module' operation, `SPIRVLowerExecutableTarget` on 'hal.executable.variant' operation: @vulkan_spirv_fb, `FoldMemRefAliasOps` on 'func.func' operation: @forward_dispatch_16_batch_matmul_1x256x32x32_f32, 
`FoldMemRefAliasOps` on 'func.func' operation: @forward_dispatch_22_batch_matmul_1x16384x32x32_f32, `ConvertToSPIRV` on 'builtin.module' operation, `OptimizeVectorTransfer` on 'func.func' operation: @forward_dispatch_1_conv_2d_nchw_fchw_1x32x128x128x3x7x7_f32, `OptimizeVectorTransfer` on 'func.func' operation: @forward_dispatch_21_batch_matmul_1x16384x32x256_f32, `CSE` on 'builtin.module' operation, `SPIRVVectorizeLoadStore` on 'builtin.module' operation, `FoldMemRefAliasOps` on 'func.func' operation: @forward_dispatch_25_batch_matmul_1x16384x128x32_f32, `SPIRVVectorize` on 'func.func' operation: @forward_dispatch_28_depthwise_conv_2d_nchw_chw_1x128x128x128x3x3_f32, `CSE` on 'builtin.module' operation, `GPUTensorAlloc` on 'func.func' operation: @forward_dispatch_30_batch_matmul_1x16384x32x128_f32]: reproducer generated at `./nvidia_mit-b0_torch_True_vulkanuy57uyhd/core-reproducer.mlir`
E         /data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/torch/nn/modules/conv.py:459:0: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"vulkan", "vulkan-spirv-fb", {spirv.target_env = #spirv.target_env<#spirv.vce<v1.6, [Shader, Float64, Float16, Int64, Int16, Int8, StorageBuffer16BitAccess, StorageUniform16, StoragePushConstant16, StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, StoragePushConstant8, GroupNonUniform, GroupNonUniformVote, GroupNonUniformArithmetic, GroupNonUniformBallot, GroupNonUniformShuffle, GroupNonUniformShuffleRelative, GroupNonUniformClustered, GroupNonUniformQuad, VariablePointers, VariablePointersStorageBuffer, DotProduct, DotProductInputAll, DotProductInput4x8BitPacked, DotProductInput4x8Bit, CooperativeMatrixNV], [SPV_KHR_16bit_storage, SPV_KHR_8bit_storage, SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers, SPV_NV_cooperative_matrix]>, api=Vulkan, NVIDIA:DiscreteGPU, #spirv.resource_limits<max_compute_shared_memory_size = 49152, max_compute_workgroup_invocations = 1024, max_compute_workgroup_size = [1024, 1024, 1024], min_subgroup_size = 32, max_subgroup_size = 32, cooperative_matrix_properties_nv = [#spirv.coop_matrix_props<m_size = 8, n_size = 8, k_size = 32, a_type = i8, b_type = i8, c_type = i32, result_type = i32, scope = <Subgroup>>, #spirv.coop_matrix_props<m_size = 16, n_size = 16, k_size = 16, a_type = f16, b_type = f16, c_type = f16, result_type = f16, scope = <Subgroup>>, #spirv.coop_matrix_props<m_size = 16, n_size = 16, k_size = 16, a_type = f16, b_type = f16, c_type = f32, result_type = f32, scope = <Subgroup>>]>>}>
E         /data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/torch/nn/modules/conv.py:459:0: error: failed to serialize executables
E         
E         
E         Invoked with:
E          iree-compile /data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=auto --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --iree-llvmcpu-embedded-linker-path=/data/anush/actions-runner/_work/SHARK/SHARK/shark.venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --mlir-pass-pipeline-crash-reproducer=./nvidia_mit-b0_torch_True_vulkanuy57uyhd/core-reproducer.mlir --iree-llvmcpu-target-cpu-features=host --iree-vulkan-target-env=#vk.target_env<v1.3, r(120), [VK_KHR_16bit_storage, VK_KHR_8bit_storage, VK_KHR_shader_float16_int8, VK_KHR_spirv_1_4, VK_KHR_storage_buffer_storage_class, VK_KHR_variable_pointers, VK_EXT_subgroup_size_control, VK_NV_cooperative_matrix], NVIDIA:DiscreteGPU, #vk.caps< maxComputeSharedMemorySize = 49152, maxComputeWorkGroupInvocations = 1024, maxComputeWorkGroupSize = dense<[1024, 1024, 1024]>: vector<3xi32>, subgroupSize = 32, subgroupFeatures = 255: i32, minSubgroupSize = 32, maxSubgroupSize = 32, shaderFloat16 = unit, shaderFloat64 = unit, shaderInt8 = unit, shaderInt16 = unit, shaderInt64 = unit, storageBuffer16BitAccess = unit, storagePushConstant16 = unit, uniformAndStorageBuffer16BitAccess = unit, storageBuffer8BitAccess = unit, storagePushConstant8 = unit, uniformAndStorageBuffer8BitAccess = unit, variablePointers = unit, variablePointersStorageBuffer = unit, cooperativeMatrixPropertiesNV = [#vk.coop_matrix_props<mSize = 8, nSize = 8, kSize = 32, aType = i8, bType = i8, cType = i32, resultType = i32, scope = #vk.scope<Subgroup>>, #vk.coop_matrix_props<mSize = 16, nSize = 16, kSize = 16, aType = f16, bType = f16, cType = f16, resultType = f16, scope = #vk.scope<Subgroup>>, #vk.coop_matrix_props<mSize = 16, nSize = 16, kSize = 16, aType = f16, bType = f16, cType = f32, resultType = f32, scope = 
#vk.scope<Subgroup>>], shaderIntegerDotProduct = unit >> --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-vm-bytecode-module-strip-source-map=true --iree-util-zero-fill-elided-attrs
E         
E         Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.

Will triage more but just wanted to file the issue in case it is known.

CI failures: https://github.com/nod-ai/SHARK/actions/runs/5171513432/jobs/9315382588
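The pipeline note in the log above lists every pass that was running when the failure occurred, which makes the failing dispatch hard to spot. A minimal sketch, assuming the note keeps the "`Pass` on 'op kind' operation: @symbol" layout seen in this log, that picks out the entries whose symbol name contains the op named in the error. The helper name `dispatches_mentioning` is hypothetical and not part of IREE.

```python
import re

# Matches entries such as:
#   `SPIRVVectorize` on 'func.func' operation: @forward_dispatch_28_...
# Entries without a symbol (e.g. `CSE` on 'builtin.module' operation) are skipped.
ENTRY_RE = re.compile(
    r"`(?P<pass_name>[^`]+)` on '(?P<op_kind>[^']+)' operation: @(?P<symbol>\w+)"
)


def dispatches_mentioning(note: str, op_name: str) -> list[tuple[str, str]]:
    """Return (pass name, symbol) pairs whose symbol contains the op's short name."""
    short_name = op_name.split(".")[-1]  # drop the dialect prefix, e.g. "linalg."
    return [
        (m.group("pass_name"), m.group("symbol"))
        for m in ENTRY_RE.finditer(note)
        if short_name in m.group("symbol")
    ]


# Excerpt copied from the pipeline note in the log above.
note = (
    "`FoldMemRefAliasOps` on 'func.func' operation: "
    "@forward_dispatch_25_batch_matmul_1x16384x128x32_f32, "
    "`SPIRVVectorize` on 'func.func' operation: "
    "@forward_dispatch_28_depthwise_conv_2d_nchw_chw_1x128x128x128x3x3_f32, "
    "`CSE` on 'builtin.module' operation"
)
hits = dispatches_mentioning(note, "linalg.depthwise_conv_2d_nchw_chw")
```

On this excerpt the only match is the `SPIRVVectorize` entry for dispatch 28, which is consistent with the dispatch identified as failing later in the thread.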

What component(s) does this issue relate to?

No response

Version information

No response

Additional context

No response

antiagainst commented 1 year ago

This check is newly enforced. I'd assume we can vectorize depthwise_conv_2d_nchw_chw without problems, given that vectorization works on the convolution op interface. Can we have the problematic dispatch so we can take a look?

allieculp commented 1 year ago

@powderluv Can you provide the dispatch? Any additional info so far?

monorimet commented 1 year ago

I have the .mlir for dispatch 28 (the failing dispatch) here: gs://shark-public/ean/reproducers/nvidia_mit-b0/module_forward_dispatch_28_vulkan_spirv_fb.mlir

The error can be reproduced with the following CLI input:

iree-compile --iree-hal-target-backends=vulkan module_forward_dispatch_28_vulkan_spirv_fb.mlir -o mit-b0_dispatch_28.vmfb
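
For scripted triage, the same invocation can be wrapped in Python. This is a minimal sketch that assumes `iree-compile` is on `PATH` and that the dispatch file has already been downloaded to the working directory. The helper names `build_repro_command` and `run_repro` are hypothetical; only the flags shown in the command above are used.

```python
import shutil
import subprocess


def build_repro_command(dispatch_mlir: str, output_vmfb: str) -> list[str]:
    """Build the argv for the reproduction command quoted above."""
    return [
        "iree-compile",
        "--iree-hal-target-backends=vulkan",
        dispatch_mlir,
        "-o",
        output_vmfb,
    ]


def run_repro(dispatch_mlir: str, output_vmfb: str):
    """Run the command if iree-compile is installed; return None otherwise."""
    if shutil.which("iree-compile") is None:
        return None  # tool not available in this environment
    return subprocess.run(
        build_repro_command(dispatch_mlir, output_vmfb),
        capture_output=True,  # keep stderr so the diagnostic can be inspected
        text=True,
    )


cmd = build_repro_command(
    "module_forward_dispatch_28_vulkan_spirv_fb.mlir",
    "mit-b0_dispatch_28.vmfb",
)
```

Calling `run_repro(...)` returns a `CompletedProcess`; on an affected build, its `stderr` would be expected to contain the "should not remain after vectorization" diagnostic.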

powderluv commented 1 year ago

We should probably add one of these models to IREE's CI so that we have coverage there and failures like this don't escape to the SHARK tests.

qcolombet commented 1 year ago

@antiagainst could you assign a priority here so that we can consider this issue screened?

antiagainst commented 1 year ago

The cause is that we are missing a pattern that converts a 2-D convolution to 1-D for vectorization in this specific format. See https://github.com/llvm/llvm-project/blob/2b7ded2/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp#L1746-L1750. We just need to add a pattern similar to the existing one for the nhwc-hwc format.

I won't be able to get to this soon. @powderluv is there somebody on your team who can add it? It should be fairly straightforward given we have an example there.
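
To illustrate the 2-D to 1-D conversion mentioned above, here is a minimal pure-Python sketch. It is not the MLIR pattern itself. It assumes stride 1 and dilation 1, and it assumes the 2-D convolution has been reduced to a unit-height window (H = 1, KH = 1), in which case dropping the height dimension yields an equivalent 1-D depthwise convolution per channel. All function names are hypothetical.

```python
def depthwise_conv_1d_ncw_cw(inp, filt):
    """inp[n][c][w], filt[c][kw] -> out[n][c][ow]; stride 1, dilation 1."""
    kw = len(filt[0])
    ow = len(inp[0][0]) - kw + 1
    return [
        [
            [sum(inp[n][c][x + j] * filt[c][j] for j in range(kw)) for x in range(ow)]
            for c in range(len(inp[n]))
        ]
        for n in range(len(inp))
    ]


def depthwise_conv_2d_nchw_chw(inp, filt):
    """inp[n][c][h][w], filt[c][kh][kw] -> out[n][c][oh][ow]; stride 1, dilation 1."""
    kh, kw = len(filt[0]), len(filt[0][0])
    oh = len(inp[0][0]) - kh + 1
    ow = len(inp[0][0][0]) - kw + 1
    return [
        [
            [
                [
                    sum(
                        inp[n][c][y + i][x + j] * filt[c][i][j]
                        for i in range(kh)
                        for j in range(kw)
                    )
                    for x in range(ow)
                ]
                for y in range(oh)
            ]
            for c in range(len(inp[n]))
        ]
        for n in range(len(inp))
    ]


def downscale_unit_height(inp, filt):
    """Compute the 2-D result through the 1-D op; valid only for H == 1 and KH == 1."""
    assert all(len(channel) == 1 for image in inp for channel in image)
    assert all(len(kernel) == 1 for kernel in filt)
    inp_1d = [[channel[0] for channel in image] for image in inp]  # drop H
    filt_1d = [kernel[0] for kernel in filt]  # drop KH
    out_1d = depthwise_conv_1d_ncw_cw(inp_1d, filt_1d)
    return [[[row] for row in image] for image in out_1d]  # re-insert OH == 1


# N=1, C=2, H=1, W=4 input and C=2, KH=1, KW=3 filter.
inp = [[[[1, 2, 3, 4]], [[5, 6, 7, 8]]]]
filt = [[[1, 0, -1]], [[2, 1, 0]]]
via_2d = depthwise_conv_2d_nchw_chw(inp, filt)
via_1d = downscale_unit_height(inp, filt)
```

Both paths produce the same output for the unit-height case, which is why a 1-D vectorization path can handle the 2-D op once such a rewrite exists for the given layout.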

allieculp commented 1 year ago

@powderluv Have to set this as a P2 on our side - if you have anyone to pick it up please feel free to assign!
