iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Assert fails in OutlineMemoizeRegionsPass #18988

Open sogartar opened 3 weeks ago

sogartar commented 3 weeks ago

What happened?

The program

func.func @trace_args(%arg0: tensor<2xi32>) {
    flow.tensor.trace "debug_sink_test" = [
        %arg0: tensor<2xi32>
    ]
    return
}

fails to compile with an assertion failure:

iree-compile: repo/third_party/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:169: const T &llvm::ArrayRef<mlir::iree_compiler::IREE::Util::GlobalOpInterface>::front() const [T = mlir::iree_compiler::IREE::Util::GlobalOpInterface]: Assertion `!empty()' failed.
Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
Stack dump:
0.      Program arguments: iree-compile /home/bpetkant/ws/iree/experiments/trace-tensor/crash/program.mlir -o /home/bpetkant/ws/iree/experiments/trace-tensor/crash/program.vmfb
 #0 0x00007fd2af57b5bd llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x00007fd2af57baab PrintStackTraceSignalHandler(void*) /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x00007fd2af579b16 llvm::sys::RunSignalHandlers() /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/lib/Support/Signals.cpp:105:5
 #3 0x00007fd2af57c265 SignalHandler(int) /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007fd2a20b3520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007fd2a21079fc __pthread_kill_implementation ./nptl/./nptl/pthread_kill.c:44:76
 #6 0x00007fd2a21079fc __pthread_kill_internal ./nptl/./nptl/pthread_kill.c:78:10
 #7 0x00007fd2a21079fc pthread_kill ./nptl/./nptl/pthread_kill.c:89:10
 #8 0x00007fd2a20b3476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007fd2a20997f3 abort ./stdlib/./stdlib/abort.c:81:7
#10 0x00007fd2a209971b _nl_load_domain ./intl/./intl/loadmsgcat.c:1177:9
#11 0x00007fd2a20aae96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#12 0x00007fd2b2b4defc llvm::ArrayRef<mlir::iree_compiler::IREE::Util::GlobalOpInterface>::front() const /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:0:7
#13 0x00007fd2b2b458ac mlir::iree_compiler::IREE::HAL::(anonymous namespace)::recursivelyEmitDeviceTree(mlir::Location, mlir::TypeRange, mlir::Value, llvm::ArrayRef<mlir::iree_compiler::IREE::Util::GlobalOpInterface>, std::function<llvm::SmallVector<mlir::Value, 6u> (mlir::iree_compiler::IREE::Util::GlobalOpInterface, mlir::OpBuilder&)>, std::function<llvm::SmallVector<mlir::Value, 6u> (mlir::OpBuilder&)>, mlir::OpBuilder&) /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Dialect/HAL/Transforms/OutlineMemoizeRegions.cpp:385:49
#14 0x00007fd2b2b44768 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::createDeviceResultSelectTree(mlir::Location, mlir::TypeRange, mlir::Value, llvm::MapVector<mlir::iree_compiler::IREE::Util::GlobalOpInterface, llvm::SmallVector<mlir::iree_compiler::IREE::Util::GlobalOpInterface, 3u>, llvm::DenseMap<mlir::iree_compiler::IREE::Util::GlobalOpInterface, unsigned int, llvm::DenseMapInfo<mlir::iree_compiler::IREE::Util::GlobalOpInterface, void>, llvm::detail::DenseMapPair<mlir::iree_compiler::IREE::Util::GlobalOpInterface, unsigned int> >, llvm::SmallVector<std::pair<mlir::iree_compiler::IREE::Util::GlobalOpInterface, llvm::SmallVector<mlir::iree_compiler::IREE::Util::GlobalOpInterface, 3u> >, 0u> >&, mlir::OpBuilder&) /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Dialect/HAL/Transforms/OutlineMemoizeRegions.cpp:398:3
#15 0x00007fd2b2b41fb2 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::createLookupFunc(mlir::iree_compiler::IREE::HAL::DeviceMemoizeOp, mlir::iree_compiler::IREE::HAL::(anonymous namespace)::MemoizeAnalysis&, mlir::iree_compiler::IREE::Util::FuncOp, llvm::MapVector<mlir::iree_compiler::IREE::Util::GlobalOpInterface, llvm::SmallVector<mlir::iree_compiler::IREE::Util::GlobalOpInterface, 3u>, llvm::DenseMap<mlir::iree_compiler::IREE::Util::GlobalOpInterface, unsigned int, llvm::DenseMapInfo<mlir::iree_compiler::IREE::Util::GlobalOpInterface, void>, llvm::detail::DenseMapPair<mlir::iree_compiler::IREE::Util::GlobalOpInterface, unsigned int> >, llvm::SmallVector<std::pair<mlir::iree_compiler::IREE::Util::GlobalOpInterface, llvm::SmallVector<mlir::iree_compiler::IREE::Util::GlobalOpInterface, 3u> >, 0u> >&, mlir::SymbolTable&, mlir::OpBuilder&) /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Dialect/HAL/Transforms/OutlineMemoizeRegions.cpp:486:54
#16 0x00007fd2b2b3fd65 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::memoizeRegionOp(mlir::iree_compiler::IREE::HAL::DeviceMemoizeOp, mlir::iree_compiler::IREE::HAL::DeviceAnalysis&, mlir::SymbolTable&) /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Dialect/HAL/Transforms/OutlineMemoizeRegions.cpp:564:7
#17 0x00007fd2b2b3f836 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::OutlineMemoizeRegionsPass::runOnOperation() /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Dialect/HAL/Transforms/OutlineMemoizeRegions.cpp:593:25
#18 0x00007fd2afa587fb mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /home/bpetkant/ws/iree/repo/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:17
#19 0x00007fd2afa58795 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:5
#20 0x00007fd2af48f169 llvm::function_ref<void ()>::operator()() const /home/bpetkant/ws/iree/repo/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:5
#21 0x00007fd2afa5b605 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /home/bpetkant/ws/iree/repo/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:281:3
#22 0x00007fd2afa53fa3 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /home/bpetkant/ws/iree/repo/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:532:17
#23 0x00007fd2afa54524 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /home/bpetkant/ws/iree/repo/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:592:16
#24 0x00007fd2afa55f69 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /home/bpetkant/ws/iree/repo/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:905:10
#25 0x00007fd2afa55e92 mlir::PassManager::run(mlir::Operation*) /home/bpetkant/ws/iree/repo/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:885:60
#26 0x00007fd2af3d3c5a mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1008:27
#27 0x00007fd2af3d3533 ireeCompilerInvocationPipeline /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1447:3
#28 0x00007fd2af94886e mlir::iree_compiler::runIreecMain(int, char**)::$_2::operator()(iree_compiler_source_t*) const /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:254:11
#29 0x00007fd2af947cbe mlir::iree_compiler::runIreecMain(int, char**) /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:355:9
#30 0x00007fd2af420d9b ireeCompilerRunMain /home/bpetkant/ws/iree/repo/compiler/src/iree/compiler/API/Internal/IREECompileToolEntryPoint.cpp:12:3
#31 0x000055ee7f9b27b2 main /home/bpetkant/ws/iree/repo/tools/iree-compile-main.cc:9:35
#32 0x00007fd2a209ad90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#33 0x00007fd2a209ae40 call_init ./csu/../csu/libc-start.c:128:20
#34 0x00007fd2a209ae40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#35 0x000055ee7f9b26c5 _start (/home/bpetkant/ws/iree/build/Debug/tools/iree-compile+0x16c5)

Steps to reproduce your issue

Download trace-tensor-assert.zip, unzip it, and run compile.sh.

What component(s) does this issue relate to?

No response

Version information

bb542eee65fa0a498963df1f2ee2f205a3dd8bd0

Additional context

With the flag --iree-hal-target-device=llvm-cpu added, the error does not occur.

mvvsmk commented 3 weeks ago

Hey @sogartar I would like to work on this. I did some preliminary analysis for this bug, and here is what I found.

In the function linked below, the solver returns a globalPVS that is in a valid state but has a set size of 0. As a result, the function returns a SmallVector of size 0. Later in the pipeline, when recursivelyEmitDeviceTree recurses, it calls remainingDeviceGlobalOps.front() on the resulting empty ArrayRef, which trips the `!empty()` assertion.

https://github.com/iree-org/iree/blob/9c85e30df30d6efcf68a7a1b594e89322bd6085d/compiler/src/iree/compiler/Dialect/HAL/Analysis/DeviceAnalysis.cpp#L97-L111
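The failure mode described above can be illustrated outside IREE with a small stand-in. This is only a sketch, not the code in OutlineMemoizeRegions.cpp: `emitDeviceTree` is a hypothetical function, and plain strings stand in for the device globals. Only the pattern (take the front element, recurse on the rest) mirrors the description, and the guard at the top is the piece that is missing when the list is empty.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a recursive "select tree" builder.
// Returns a nested select expression over the devices starting at `first`,
// or std::nullopt when there are no devices left to select from.
std::optional<std::string> emitDeviceTree(
    const std::vector<std::string> &devices, std::size_t first = 0) {
  // Guard: taking the front of an empty range is invalid. With llvm::ArrayRef
  // this is exactly where the `!empty()` assertion fires.
  if (first >= devices.size()) {
    return std::nullopt;
  }
  const std::string &head = devices[first];
  // Base case: the last device needs no further selection.
  if (first + 1 == devices.size()) {
    return head;
  }
  // Recursive case: the tail is non-empty here, so the result has a value.
  std::optional<std::string> tail = emitDeviceTree(devices, first + 1);
  return "select(" + head + ", " + *tail + ")";
}
```

With an empty device list the caller gets `std::nullopt` and can report a proper error instead of aborting.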

As for why it works with --iree-hal-target-device=llvm-cpu:

Please let me know if I missed something or if you have any suggestions on how I could solve it. :)

sogartar commented 3 weeks ago

@mvvsmk I have not explored this. I just opened the issue so the problem does not get lost.

What does it mean if no target device is passed? Presumably there is some default. If so, by the time this pass executes, something should already have populated the default wherever it is needed. If there is no default, we should check for that earlier and return a simpler error message.
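An early check of that kind might look roughly like the following. This is a sketch under stated assumptions: `validateTargetDevices` and its message text are hypothetical and not part of IREE, and a real pass would presumably emit an MLIR diagnostic rather than return a string.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical validation step that would run before any pass that needs a
// device. Returns an error message when no target device was specified, or
// std::nullopt when the list is usable.
std::optional<std::string> validateTargetDevices(
    const std::vector<std::string> &targetDevices) {
  if (targetDevices.empty()) {
    return std::string(
        "no target device specified; pass --iree-hal-target-device=<device>");
  }
  return std::nullopt;
}
```

Running such a check up front would turn the assertion crash into a message that tells the user which flag is missing.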

benvanik commented 3 weeks ago

There's no default - a device must be specified.

When posting reproducers please post the MLIR files and include the commands in the issue description.