intel / graph-compiler

MLIR-based toolkit targeting intel heterogeneous hardware
Apache License 2.0

Avoid use DPS interface #356

Open WangJialei-A opened 1 month ago

WangJialei-A commented 1 month ago

Track #355

This change can fix the issue. @Menooker Do you know the reason why this fix works?

Menooker commented 1 month ago

getDPSInits and getOutputs look similar to me in their implementations. Not sure why this would make a difference. ;(

ciyongch commented 1 month ago

Does this issue only happen with clang build?

WangJialei-A commented 1 month ago

> Does this issue only happen with clang build?

Yes.

LongshengDu commented 1 month ago

> getDPSInits and getOutputs look similar to me in their implementations. Not sure why this would make a difference. ;(

Yes, really odd. I checked the .td file, and they seem to have similar interfaces.

ciyongch commented 1 month ago

Let's try the memory sanitizer to see if there's any out-of-bounds access issue before merging it?

kurapov-peter commented 1 month ago

> Let's try the memory sanitizer to see if there's any out-of-bounds access issue before merging it?

Agree, this might mask a real issue.

ciyongch commented 1 month ago

I did some debugging with memory sanitizer and valgrind but haven't found the root cause yet. Here are the trace logs.

1) memory sanitizer

$ bin/gc-opt --split-input-file --deep-tile-contraction-op /home/ciyong/workspace/graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir | /home/ciyong/workspace/llvm-project/llvm_install/bin/FileCheck /home/ciyong/workspace/graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir
Uninitialized bytes in MemcmpInterceptorCommon at offset 0 inside [0x707000000880, 7)
==1914387==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x56124d9dfc0c in memcmp /home/ciyong/workspace/llvm-project/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:878:33
    #1 0x56124d9dfc0c in memcmp /home/ciyong/workspace/llvm-project/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:873:1
    #2 0x5612500e2606 in std::char_traits<char>::compare(char const*, char const*, unsigned long) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/char_traits.h:389:25
    #3 0x5612500e2606 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::compare(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/basic_string.h:2879:32
    #4 0x5612500e2606 in bool std::operator<<char, std::char_traits<char>, std::allocator<char>>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/basic_string.h:6343:27
    #5 0x5612500e2606 in std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/stl_function.h:400:20
    #6 0x5612500e2606 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>>>::_M_get_insert_unique_pos(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/stl_tree.h:2071:35
    #7 0x5612500e2606 in std::pair<std::_Rb_tree_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>>, bool> std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>>>::_M_emplace_unique<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>>(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<mlir::TypeID, std::function<mlir::Dialect* (mlir::MLIRContext*)>>>&&) /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/stl_tree.h:2389:43
    #8 0x5612500e2951 in _ZNSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairIN4mlir6TypeIDESt8functionIFPNS7_7DialectEPNS7_11MLIRContextEEEESt4lessIS5_ESaIS6_IKS5_SG_EEE6insertIS6_IS5_SG_EEENSt9enable_ifIXsrSt16is_constructibleISK_JT_EE5valueES6_ISt17_Rb_tree_iteratorISK_EbEE4typeEOSR_ /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/stl_map.h:817:33
    #9 0x5612500e2951 in mlir::DialectRegistry::insert(mlir::TypeID, llvm::StringRef, std::function<mlir::Dialect* (mlir::MLIRContext*)> const&) /home/ciyong/workspace/llvm-project/mlir/lib/IR/Dialect.cpp:228:34
    #10 0x56124da2a15e in main (/nfs/ciyong/graph-compiler/build_clang/bin/gc-opt+0x1d3d15e)
    #11 0x7f22c4f85ca2 in __libc_start_main (/lib64/libc.so.6+0x3aca2) (BuildId: 72a85c154e9e2d2e7ad0bb314a6d1c17719aae5c)
    #12 0x56124d9a448d in _start (/nfs/ciyong/graph-compiler/build_clang/bin/gc-opt+0x1cb748d)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ciyong/workspace/llvm-project/mlir/lib/IR/Dialect.cpp:228:34 in mlir::DialectRegistry::insert(mlir::TypeID, llvm::StringRef, std::function<mlir::Dialect* (mlir::MLIRContext*)> const&)
Exiting
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/ciyong/workspace/llvm-project/llvm_install/bin/FileCheck 

2) valgrind

valgrind --tool=memcheck --leak-check=full --track-origins=yes  bin/gc-opt --split-input-file --deep-tile-contraction-op /home/ciyong/workspace/graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir | /home/ciyong/workspace/llvm-project/llvm_install/bin/FileCheck /home/ciyong/workspace/graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir
==1984333== Memcheck, a memory error detector
==1984333== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1984333== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==1984333== Command: bin/gc-opt --split-input-file --deep-tile-contraction-op /home/ciyong/workspace/graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir
==1984333== 
==1984333== Invalid read of size 8
==1984333==    at 0x2EDF440: mlir::gc::(anonymous namespace)::DeepTileMatmul::innerBodyGeneration(mlir::RewriterBase&, mlir::linalg::LinalgOp, llvm::ArrayRef<llvm::ArrayRef<long> >, llvm::ArrayRef<llvm::SmallVector<mlir::gc::DimType, 12u> >, mlir::gc::(anonymous namespace)::DeepTileMatmul::innerBodyGenerationOption&) const (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x2ED7D67: mlir::gc::(anonymous namespace)::DeepTileMatmul::matchAndRewrite(mlir::linalg::LinalgOp, mlir::PatternRewriter&) const (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x30E67B7: mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>)::{lambda()#1}::operator()() const (PatternApplicator.cpp:212)
==1984333==    by 0x30E6D1A: callback_fn<mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, mlir::function_ref<bool(const mlir::Pattern&)>, mlir::function_ref<void(const mlir::Pattern&)>, mlir::function_ref<llvm::LogicalResult(const mlir::Pattern&)>)::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x30E6D1A: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x30E6D1A: executeAction<mlir::ApplyPatternAction, const mlir::Pattern&> (MLIRContext.h:275)
==1984333==    by 0x30E6D1A: mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>) (PatternApplicator.cpp:195)
==1984333==    by 0x30ABC48: (anonymous namespace)::GreedyPatternRewriteDriver::processWorklist() (GreedyPatternRewriteDriver.cpp:615)
==1984333==    by 0x30AF735: operator() (GreedyPatternRewriteDriver.cpp:874)
==1984333==    by 0x30AF735: callback_fn<(anonymous namespace)::RegionPatternRewriteDriver::simplify(bool*) &&::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x30AF735: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x30AF735: executeAction<(anonymous namespace)::GreedyPatternRewriteIteration, long int&> (MLIRContext.h:275)
==1984333==    by 0x30AF735: simplify (GreedyPatternRewriteDriver.cpp:872)
==1984333==    by 0x30AF735: mlir::applyPatternsAndFoldGreedily(mlir::Region&, mlir::FrozenRewritePatternSet const&, mlir::GreedyRewriteConfig, bool*) (GreedyPatternRewriteDriver.cpp:919)
==1984333==    by 0x2ED739A: mlir::gc::(anonymous namespace)::DeepTileContractionOp::runOnOperation() (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x3114F25: operator() (Pass.cpp:526)
==1984333==    by 0x3114F25: callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x3114F25: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x3114F25: executeAction<mlir::PassExecutionAction, mlir::Pass&> (MLIRContext.h:275)
==1984333==    by 0x3114F25: mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (Pass.cpp:520)
==1984333==    by 0x31154B0: mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (Pass.cpp:592)
==1984333==    by 0x3115922: mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::{lambda(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)#1}::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const (Pass.cpp:812)
==1984333==    by 0x31140B4: failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::<lambda(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)>&> (Threading.h:46)
==1984333==    by 0x31140B4: failableParallelForEach<std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo>&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::<lambda(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)>&> (Threading.h:92)
==1984333==    by 0x31140B4: mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) (Pass.cpp:822)
==1984333==    by 0x3114B5A: operator() (Pass.cpp:524)
==1984333==    by 0x3114B5A: callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x3114B5A: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x3114B5A: executeAction<mlir::PassExecutionAction, mlir::Pass&> (MLIRContext.h:275)
==1984333==    by 0x3114B5A: mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (Pass.cpp:520)
==1984333==  Address 0xbff5950 is 144 bytes inside a block of size 240 free'd
==1984333==    at 0x4C39A03: free (vg_replace_malloc.c:755)
==1984333==    by 0x32BE971: operator() (std_function.h:590)
==1984333==    by 0x32BE971: mlir::RewriterBase::eraseOp(mlir::Operation*) (PatternMatch.cpp:230)
==1984333==    by 0x2EE4868: mlir::gc::(anonymous namespace)::setStaticSizeForInsertSliceOp(mlir::RewriterBase&, mlir::Operation*, mlir::Value, llvm::SmallVector<long, 6u>) (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x2EDF3FA: mlir::gc::(anonymous namespace)::DeepTileMatmul::innerBodyGeneration(mlir::RewriterBase&, mlir::linalg::LinalgOp, llvm::ArrayRef<llvm::ArrayRef<long> >, llvm::ArrayRef<llvm::SmallVector<mlir::gc::DimType, 12u> >, mlir::gc::(anonymous namespace)::DeepTileMatmul::innerBodyGenerationOption&) const (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x2ED7D67: mlir::gc::(anonymous namespace)::DeepTileMatmul::matchAndRewrite(mlir::linalg::LinalgOp, mlir::PatternRewriter&) const (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x30E67B7: mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>)::{lambda()#1}::operator()() const (PatternApplicator.cpp:212)
==1984333==    by 0x30E6D1A: callback_fn<mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, mlir::function_ref<bool(const mlir::Pattern&)>, mlir::function_ref<void(const mlir::Pattern&)>, mlir::function_ref<llvm::LogicalResult(const mlir::Pattern&)>)::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x30E6D1A: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x30E6D1A: executeAction<mlir::ApplyPatternAction, const mlir::Pattern&> (MLIRContext.h:275)
==1984333==    by 0x30E6D1A: mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>) (PatternApplicator.cpp:195)
==1984333==    by 0x30ABC48: (anonymous namespace)::GreedyPatternRewriteDriver::processWorklist() (GreedyPatternRewriteDriver.cpp:615)
==1984333==    by 0x30AF735: operator() (GreedyPatternRewriteDriver.cpp:874)
==1984333==    by 0x30AF735: callback_fn<(anonymous namespace)::RegionPatternRewriteDriver::simplify(bool*) &&::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x30AF735: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x30AF735: executeAction<(anonymous namespace)::GreedyPatternRewriteIteration, long int&> (MLIRContext.h:275)
==1984333==    by 0x30AF735: simplify (GreedyPatternRewriteDriver.cpp:872)
==1984333==    by 0x30AF735: mlir::applyPatternsAndFoldGreedily(mlir::Region&, mlir::FrozenRewritePatternSet const&, mlir::GreedyRewriteConfig, bool*) (GreedyPatternRewriteDriver.cpp:919)
==1984333==    by 0x2ED739A: mlir::gc::(anonymous namespace)::DeepTileContractionOp::runOnOperation() (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x3114F25: operator() (Pass.cpp:526)
==1984333==    by 0x3114F25: callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::<lambda()> > (STLFunctionalExtras.h:45)
==1984333==    by 0x3114F25: operator() (STLFunctionalExtras.h:68)
==1984333==    by 0x3114F25: executeAction<mlir::PassExecutionAction, mlir::Pass&> (MLIRContext.h:275)
==1984333==    by 0x3114F25: mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (Pass.cpp:520)
==1984333==    by 0x31154B0: mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (Pass.cpp:592)
==1984333==  Block was alloc'd at
==1984333==    at 0x4C370A5: malloc (vg_replace_malloc.c:380)
==1984333==    by 0x32B4A09: mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::DictionaryAttr, mlir::OpaqueProperties, mlir::BlockRange, unsigned int) (Operation.cpp:114)
==1984333==    by 0x32B5021: mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::NamedAttrList&&, mlir::OpaqueProperties, mlir::BlockRange, unsigned int) (Operation.cpp:75)
==1984333==    by 0x32B506B: mlir::Operation::create(mlir::Location, mlir::OperationName, mlir::TypeRange, mlir::ValueRange, mlir::NamedAttrList&&, mlir::OpaqueProperties, mlir::BlockRange, mlir::RegionRange) (Operation.cpp:58)
==1984333==    by 0x32B53B6: mlir::Operation::create(mlir::OperationState const&) (Operation.cpp:36)
==1984333==    by 0x321980B: mlir::OpBuilder::create(mlir::OperationState const&) (Builders.cpp:485)
==1984333==    by 0x2EC35E3: mlir::tensor::InsertSliceOp mlir::OpBuilder::create<mlir::tensor::InsertSliceOp, mlir::Value&, mlir::Value&, llvm::SmallVector<mlir::OpFoldResult, 6u>&, llvm::SmallVector<mlir::OpFoldResult, 6u>&, llvm::SmallVector<mlir::OpFoldResult, 6u>&>(mlir::Location, mlir::Value&, mlir::Value&, llvm::SmallVector<mlir::OpFoldResult, 6u>&, llvm::SmallVector<mlir::OpFoldResult, 6u>&, llvm::SmallVector<mlir::OpFoldResult, 6u>&) (Builders.h:517)
==1984333==    by 0x2EC5556: generateLoopNestUsingForOp (TileUsingInterface.cpp:443)
==1984333==    by 0x2EC5556: generateLoopNest(mlir::RewriterBase&, mlir::Location, mlir::scf::SCFTilingOptions const&, llvm::ArrayRef<mlir::Range>, llvm::ArrayRef<mlir::OpFoldResult>, llvm::ArrayRef<mlir::OpFoldResult>, mlir::ValueRange, std::function<llvm::LogicalResult (mlir::RewriterBase&, mlir::Location, mlir::ValueRange, mlir::ValueRange, llvm::SmallVector<mlir::Value, 6u>&, llvm::SmallVector<llvm::SmallVector<mlir::OpFoldResult, 6u>, 1u>&, llvm::SmallVector<llvm::SmallVector<mlir::OpFoldResult, 6u>, 1u>&)>, llvm::SmallVector<mlir::LoopLikeOpInterface, 3u>&) (TileUsingInterface.cpp:558)
==1984333==    by 0x2ECA92F: mlir::scf::tileUsingSCF(mlir::RewriterBase&, mlir::TilingInterface, mlir::scf::SCFTilingOptions const&) (TileUsingInterface.cpp:903)
==1984333==    by 0x2EDC01F: mlir::gc::(anonymous namespace)::DeepTileMatmul::outerLoopGeneration(mlir::RewriterBase&, mlir::linalg::LinalgOp, mlir::gc::MatmulConfig, bool) (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x2ED7C54: mlir::gc::(anonymous namespace)::DeepTileMatmul::matchAndRewrite(mlir::linalg::LinalgOp, mlir::PatternRewriter&) const (in /nfs/ciyong/graph-compiler/build_clang/bin/gc-opt)
==1984333==    by 0x30E67B7: mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>)::{lambda()#1}::operator()() const (PatternApplicator.cpp:212)
==1984333== 
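
The valgrind record above reads like a use-after-free: the invalid read in DeepTileMatmul::innerBodyGeneration touches a block that was freed by RewriterBase::eraseOp (called from setStaticSizeForInsertSliceOp) and that was originally allocated for a tensor::InsertSliceOp. Below is a minimal sketch of that stale-handle pattern and one way to avoid it. Everything in it (Op, Rewriter, OpHandle, setStaticSizes, staleHandleExpiresAfterReplace) is a hypothetical stand-in, not an MLIR or graph-compiler API, and it does not establish the root cause here.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR operation (not an MLIR class).
struct Op {
  std::string name;
  std::vector<long> sizes;
};

// Non-owning handle; it expires once the owner erases the operation.
using OpHandle = std::weak_ptr<Op>;

// Hypothetical stand-in for a rewriter that owns every operation.
class Rewriter {
public:
  OpHandle create(std::string name, std::vector<long> sizes) {
    ops.push_back(std::make_shared<Op>(Op{std::move(name), std::move(sizes)}));
    return ops.back();
  }

  // Destroys the operation; all handles to it expire after this returns.
  void erase(const OpHandle &handle) {
    std::shared_ptr<Op> op = handle.lock();
    if (!op)
      return;
    ops.erase(std::remove(ops.begin(), ops.end(), op), ops.end());
  }

private:
  std::vector<std::shared_ptr<Op>> ops;
};

// Replaces `op` with a copy carrying static sizes and erases the original.
// Returning the new handle lets the caller drop the stale one.
OpHandle setStaticSizes(Rewriter &rewriter, const OpHandle &op,
                        std::vector<long> sizes) {
  std::shared_ptr<Op> old = op.lock();
  assert(old && "handle must be live on entry");
  OpHandle replacement = rewriter.create(old->name, std::move(sizes));
  rewriter.erase(op);
  return replacement;
}

// Returns true when the old handle is expired and the new one is usable.
bool staleHandleExpiresAfterReplace() {
  Rewriter rewriter;
  OpHandle slice = rewriter.create("insert_slice", {-1, -1});
  OpHandle stale = slice; // copy still held by the caller
  slice = setStaticSizes(rewriter, slice, {32, 32});
  std::shared_ptr<Op> live = slice.lock();
  return stale.expired() && live && live->sizes == std::vector<long>{32, 32};
}
```

With raw pointers instead of weak handles, reading through `stale` after the replacement would be undefined behavior, which may appear to work or fail depending on the compiler and allocator. Calling `staleHandleExpiresAfterReplace()` from a `main` exercises the safe variant.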
gc-opt: /home/ciyong/workspace/llvm-project/llvm_install/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From &) [To = mlir::ShapedType, From = mlir::Type]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.  Program arguments: bin/gc-opt --split-input-file --deep-tile-contraction-op /home/ciyong/workspace/graph-compiler/test/mlir/test/gc/Transforms/deepTileContractionNamedOp.mlir
 #0 0x000000000597ad28 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ciyong/workspace/llvm-project/llvm/lib/Support/Unix/Signals.inc:727:3
 #1 0x0000000005978a2c llvm::sys::RunSignalHandlers() /home/ciyong/workspace/llvm-project/llvm/lib/Support/Signals.cpp:105:20
 #2 0x0000000005978d6e SignalHandler(int) /home/ciyong/workspace/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x0000000004e59ce0 __restore_rt (/lib64/libpthread.so.0+0x12ce0)
 #4 0x000000000afd3a4f raise (/lib64/libc.so.6+0x4ea4f)
 #5 0x000000000afa6db5 abort (/lib64/libc.so.6+0x21db5)
 #6 0x000000000afa6c89 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21c89)
 #7 0x000000000afcc3a6 (/lib64/libc.so.6+0x473a6)
 #8 0x0000000002ed34dd mlir::linalgx::isGenericPackedMatmulOpImpl(mlir::linalg::GenericOp, mlir::linalgx::PackingType) (bin/gc-opt+0x2dcb4dd)
 #9 0x0000000002ed795d mlir::gc::(anonymous namespace)::DeepTileMatmul::matchAndRewrite(mlir::linalg::LinalgOp, mlir::PatternRewriter&) const DeepTileContractionOp.cpp:0:0