Closed: AmosLewis closed this issue 2 days ago
The sharktank llama model (https://github.com/nod-ai/sharktank/issues/22) is failing with this same error. Stack with more context:
(sharktank) λ D:\dev\projects\iree-build\tools\iree-compile D:/tmp/open_llama_3b_v2/open-llama-3b-v2-f16.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu-features=host -o D:/tmp/open_llama_3b_v2/open-llama-3b-v2-f16_cpu.vmfb
Assertion failed: input.size() == permutation.size() && "expected input rank to equal permutation rank", file D:\dev\projects\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Utils/IndexingUtils.h, line 204
Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
Exception Code: 0x80000003
#0 0x00007ff601268e95 HandleAbort D:\dev\projects\iree\third_party\llvm-project\llvm\lib\Support\Windows\Signals.inc:425:0
#1 0x00007ffe7d561881 (C:\WINDOWS\System32\ucrtbase.dll+0x71881)
#2 0x00007ffe7d562851 (C:\WINDOWS\System32\ucrtbase.dll+0x72851)
#3 0x00007ffe7d56426e (C:\WINDOWS\System32\ucrtbase.dll+0x7426e)
#4 0x00007ffe7d564165 (C:\WINDOWS\System32\ucrtbase.dll+0x74165)
#5 0x00007ffe7d5644f1 (C:\WINDOWS\System32\ucrtbase.dll+0x744f1)
#6 0x00007ff6051bf70f mlir::applyPermutation<class llvm::SmallVector<__int64, 2>>(class llvm::ArrayRef<class llvm::SmallVector<__int64, 2>>, class llvm::ArrayRef<__int64>) D:\dev\projects\iree\third_party\llvm-project\mlir\include\mlir\Dialect\Utils\IndexingUtils.h:205:0
#7 0x00007ff6051b9c11 mlir::applyPermutation<class llvm::SmallVector<__int64, 2>>(class llvm::SmallVectorImpl<class llvm::SmallVector<__int64, 2>> const &, class llvm::ArrayRef<__int64>) D:\dev\projects\iree\third_party\llvm-project\mlir\include\mlir\Dialect\Utils\IndexingUtils.h:214:0
#8 0x00007ff6051b1c2b mlir::applyPermutationToVector<class llvm::SmallVector<__int64, 2>, 1>(class llvm::SmallVector<class llvm::SmallVector<__int64, 2>, 1> &, class llvm::ArrayRef<__int64>) D:\dev\projects\iree\third_party\llvm-project\mlir\include\mlir\Dialect\Utils\IndexingUtils.h:225:0
#9 0x00007ff606741eab `anonymous namespace'::applyPermutationAndReindexReassoc D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Dialect\Linalg\Transforms\DataLayoutPropagation.cpp:610:0
#10 0x00007ff60674268d `anonymous namespace'::bubbleUpPackOpThroughCollapseShape D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Dialect\Linalg\Transforms\DataLayoutPropagation.cpp:687:0
#11 0x00007ff606743bef `anonymous namespace'::BubbleUpPackOpThroughReshapeOp::matchAndRewrite D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Dialect\Linalg\Transforms\DataLayoutPropagation.cpp:849:0
#12 0x00007ff60409bbe4 mlir::detail::OpOrInterfaceRewritePatternBase<class mlir::tensor::PackOp>::matchAndRewrite(class mlir::Operation *, class mlir::PatternRewriter &) const D:\dev\projects\iree\third_party\llvm-project\mlir\include\mlir\IR\PatternMatch.h:332:0
#13 0x00007ff60560e8eb <lambda_033eed04a8a10a7b33015298d48d216a>::operator() D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Rewrite\PatternApplicator.cpp:212:0
#14 0x00007ff60560c275 mlir::PatternApplicator::matchAndRewrite(class mlir::Operation *, class mlir::PatternRewriter &, class llvm::function_ref<(class mlir::Pattern const &)>, class llvm::function_ref<(class mlir::Pattern const &)>, class llvm::function_ref<(class mlir::Pattern const &)>) D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Rewrite\PatternApplicator.cpp:233:0
#15 0x00007ff60448f91e `anonymous namespace'::GreedyPatternRewriteDriver::processWorklist D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Transforms\Utils\GreedyPatternRewriteDriver.cpp:617:0
#16 0x00007ff6044920e2 llvm::function_ref<void __cdecl(void)>::callback_fn<<lambda_56efa1fe2231a48e07ce9bd5369059af> > D:\dev\projects\iree\third_party\llvm-project\llvm\include\llvm\ADT\STLFunctionalExtras.h:45:0
#17 0x00007ff6044914ae `anonymous namespace'::RegionPatternRewriteDriver::simplify D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Transforms\Utils\GreedyPatternRewriteDriver.cpp:872:0
#18 0x00007ff60448d38e mlir::applyPatternsAndFoldGreedily(class mlir::Region &, class mlir::FrozenRewritePatternSet const &, class mlir::GreedyRewriteConfig, bool *) D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Transforms\Utils\GreedyPatternRewriteDriver.cpp:920:0
#19 0x00007ff6051e8d1d mlir::iree_compiler::GlobalOptimization::`anonymous namespace'::DataLayoutPropagationPass::runOnOperation D:\dev\projects\iree\compiler\src\iree\compiler\GlobalOptimization\DataLayoutPropagation.cpp:31:0
#20 0x00007ff60163ead0 llvm::function_ref<void __cdecl(void)>::callback_fn<<lambda_e8f8990a45bf3495636c03506b9db479> > D:\dev\projects\iree\third_party\llvm-project\llvm\include\llvm\ADT\STLFunctionalExtras.h:45:0
#21 0x00007ff601638637 mlir::detail::OpToOpPassAdaptor::run(class mlir::Pass *, class mlir::Operation *, class mlir::AnalysisManager, bool, unsigned int) D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Pass\Pass.cpp:533:0
#22 0x00007ff60163883d mlir::detail::OpToOpPassAdaptor::runPipeline(class mlir::OpPassManager &, class mlir::Operation *, class mlir::AnalysisManager, bool, unsigned int, class mlir::PassInstrumentor *, struct mlir::PassInstrumentation::PipelineParentInfo const *) D:\dev\projects\iree\third_party\llvm-project\mlir\lib\Pass\Pass.cpp:593:0
#23 0x00007ff60163fa5b <lambda_060c7f84c4de8022f660b122ba4cdde9>::operator() D:\dev\projects\iree\third_party\llvm-project\mlir\include\mlir\IR\Threading.h:62:0
#24 0x00007ff6016403db std::_Packaged_state<(void)>::_Call_immediate(void) C:\Program Files (x86)\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.31.31103\include\future:593:0
#25 0x00007ff60164054f std::_Deferred_async_state<void>::_Run_deferred_function(class std::unique_lock<class std::mutex> &) C:\Program Files (x86)\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.31.31103\include\future:659:0
#26 0x00007ff60163b02d std::_Associated_state<int>::_Wait(void) C:\Program Files (x86)\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.31.31103\include\future:223:0
#27 0x00007ff6030c6473 llvm::StdThreadPool::processTasks(class llvm::ThreadPoolTaskGroup *) D:\dev\projects\iree\third_party\llvm-project\llvm\lib\Support\ThreadPool.cpp:103:0
#28 0x00007ff6030c79d3 llvm::thread::ThreadProxy<std::tuple<<lambda_a09d07335cf06810bfcca9f8daa525ff> > > D:\dev\projects\iree\third_party\llvm-project\llvm\include\llvm\Support\thread.h:65:0
#29 0x00007ffe7d511bb2 (C:\WINDOWS\System32\ucrtbase.dll+0x21bb2)
#30 0x00007ffe7e987344 (C:\WINDOWS\System32\KERNEL32.DLL+0x17344)
#31 0x00007ffe7f9bcc91 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x4cc91)
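For readers unfamiliar with the failing assertion, here is a minimal Python sketch of the precondition that `mlir::applyPermutation` checks. It is an illustration, not the MLIR source: the function name `apply_permutation` and the sample data are hypothetical, and the `result[i] = values[permutation[i]]` convention is my understanding of the upstream helper. In the trace above the permuted values are reassociation groups, so one way the assertion can fire is when the number of groups differs from the length of the permutation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def apply_permutation(values: Sequence[T], permutation: Sequence[int]) -> List[T]:
    """Return values reordered so that result[i] == values[permutation[i]]."""
    # Same precondition as the assertion in the crash: one entry per value.
    if len(values) != len(permutation):
        raise ValueError("expected input rank to equal permutation rank")
    return [values[p] for p in permutation]


# Hypothetical reassociation for a collapse of 3 source dims into 2 result dims.
reassociation = [[0, 1], [2]]

# A permutation with one entry per reassociation group is accepted.
ok = apply_permutation(reassociation, [1, 0])


def rank_mismatch_detected() -> bool:
    """Permuting 2 groups with a 3-entry permutation trips the check."""
    try:
        apply_permutation(reassociation, [2, 0, 1])
    except ValueError:
        return True
    return False
```

The second call models a permutation sized for the uncollapsed rank being applied to the collapsed reassociation list. Whether that is exactly what happens in `applyPermutationAndReindexReassoc` is an assumption.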
Possible culprit: https://github.com/llvm/llvm-project/pull/93529
Confirmed - my issue does not occur with that PR reverted. Going to dig a bit deeper then comment there.
Putting a note here:
https://github.com/llvm/llvm-project/pull/96697 exposes control over each (producer, consumer) pair through the controlFn API. https://github.com/iree-org/iree/pull/17740 contains the IREE fix for the upstream change. I have a local patch that re-enables the pack->expand_shape propagation in IREE.
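To make the idea of a per-pair control function concrete, here is a small Python model. Everything in it is hypothetical: `Op`, `ControlFn`, `skip_pack_through_reshape`, and `select_pairs` are illustrative names, and the policy shown is an assumption, not the actual upstream or IREE code. It only shows how a callback that sees both the producer and the consumer can veto individual propagations.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Op:
    """Stand-in for an operation, identified only by its name."""
    name: str


# A control function decides, per (producer, consumer) pair, whether
# layout propagation should be attempted.
ControlFn = Callable[[Op, Op], bool]


def skip_pack_through_reshape(producer: Op, consumer: Op) -> bool:
    """Hypothetical policy: do not bubble a pack up through a reshape."""
    reshapes = {"tensor.collapse_shape", "tensor.expand_shape"}
    return not (consumer.name == "tensor.pack" and producer.name in reshapes)


def select_pairs(pairs: List[Tuple[Op, Op]], control_fn: ControlFn) -> List[Tuple[Op, Op]]:
    """Keep only the pairs the control function allows."""
    return [pair for pair in pairs if control_fn(*pair)]


pairs = [
    (Op("tensor.collapse_shape"), Op("tensor.pack")),
    (Op("linalg.generic"), Op("tensor.pack")),
]
allowed = select_pairs(pairs, skip_pack_through_reshape)
```

With this policy only the `linalg.generic` -> `tensor.pack` pair remains eligible. Re-enabling pack->expand_shape propagation would amount to relaxing the reshape set in the policy.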
The mit-b0 issue also goes away after reverting https://github.com/llvm/llvm-project/pull/93529:
cd iree/third_party/llvm-project/
git revert a945f55d3e6af6be6648fb92a20c80e88e3fc2b2
cd SHARK-TestSuite/e2eshark
python ./run.py --torchmlirbuild ../../torch-mlir/build --tolerance 0.001 0.001 --cachedir ./huggingface_cache --ireebuild ../../iree-build -f pytorch -g models --mode onnx --report --tests pytorch/models/mit-b0 --torchtolinalg
Status report for run: test-run using mode:onnx todtype:default backend:llvm-cpu
| tests | model-run | onnx-import | torch-mlir | iree-compile | inference |
|:----------------------|:------------|:--------------|:-------------|:---------------|:------------|
| pytorch/models/mit-b0 | passed | passed | passed | passed | mismatch |
Fixed by https://github.com/llvm/llvm-project/pull/96732; IREE needs an LLVM bump to pick it up.
What happened?
The mit-b0 PyTorch-to-ONNX model fails: https://github.com/nod-ai/SHARK-TestSuite/issues/270
pytorch/models/mobilebert-uncased, pytorch/models/t5-base, and pytorch/models/t5-large have the same issue.
gdb backtrace output:
Steps to reproduce your issue
input ir: mit-b0.default.pytorch.linalg.elide.mlir
/home/chi/src/iree-build/tools/iree-compile --iree-input-demote-i64-to-i32 --iree-hal-target-backends=llvm-cpu mit-b0.default.pytorch.linalg.elide.mlir > mit-b0.default.vmfb 2>iree-compile.log
What component(s) does this issue relate to?
Compiler
Version information
iree candidate-20240624.934
Additional context
No response