cornell-zhang / heterocl

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing
https://cornell-zhang.github.io/heterocl/
Apache License 2.0
322 stars 92 forks

`hcl.assert` causing LLVM assertion failure #446

Closed hecmay closed 2 years ago

hecmay commented 2 years ago

This bug was reported by another HCL user: `hcl.assert` only works when the LLVM assertion flag is turned off. Once the flag is turned on, `hcl.assert` triggers unexpected errors.

I tried to reproduce this issue by reinstalling HCL with the LLVM assertion flag turned on. Here is what I got from one of our regression tests:

(tests) [sx233@brg-zhang-xcel test_api_assert_cases]$ python basic_assert_tests.py 
python: /heterocl/build/pkgs/llvm/src/lib/IR/Globals.cpp:351: void llvm::GlobalVariable::setInitializer(llvm::Constant*): Assertion `InitVal->getType() == getValueType() && "Initializer type must match GlobalVariable type"' failed.
Aborted

# traces from gdb
Thread 1 "python" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7def859 in __GI_abort () at abort.c:79
#2  0x00007ffff7def729 in __assert_fail_base (fmt=0x7ffff7f85588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7fffd66e3b18 "InitVal->getType() == getValueType() && \"Initializer type must match GlobalVariable type\"", file=0x7fffd66e2fc0 "/heterocl/build/pkgs/llvm/src/lib/IR/Globals.cpp",
    line=350, function=<optimized out>) at assert.c:92
#3  0x00007ffff7e01006 in __GI___assert_fail (assertion=0x7fffd66e3b18 "InitVal->getType() == getValueType() && \"Initializer type must match GlobalVariable type\"", file=0x7fffd66e2fc0 "/heterocl/build/pkgs/llvm/src/lib/IR/Globals.cpp", line=350,
    function=0x7fffd66e3ad8 "void llvm::GlobalVariable::setInitializer(llvm::Constant*)") at assert.c:101
#4  0x00007fffd55b11b2 in llvm::GlobalVariable::setInitializer(llvm::Constant*) () from /heterocl/tvm/lib/libhcl.so
#5  0x00007fffd35d95da in TVM::codegen::CodeGenLLVM::AddFunctionInternal(TVM::LoweredFunc const&, bool) () from /heterocl/tvm/lib/libhcl.so
#6  0x00007fffd35e36f9 in TVM::codegen::CodeGenCPU::AddFunction(TVM::LoweredFunc const&) () from /heterocl/tvm/lib/libhcl.so
#7  0x00007fffd35ba021 in TVM::codegen::LLVMModuleNode::Init(TVM::Array<TVM::LoweredFunc, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) () from /heterocl/tvm/lib/libhcl.so

This specific error seems to be caused by a type mismatch between an LLVM global variable and its initializer here: https://github.com/cornell-zhang/heterocl/blob/master/tvm/src/codegen/llvm/codegen_llvm.cc#L162-L169

I tried fixing this type mismatch (with LLVM assertions enabled), but then other assertion failures pop up, such as:

python: /.../heterocl/build/pkgs/llvm/src/lib/IR/Instructions.cpp:1130: void llvm::BranchInst::AssertOK(): Assertion `getCondition()->getType()->isIntegerTy(1) && "May only branch on boolean predicates!"' failed.

hecmay commented 2 years ago

@seanlatias

seanlatias commented 2 years ago

I don't understand. What do you mean by "function" when turning off the flag? Is the result still correct? Also, according to the error message, it is saying that branch only works on boolean variables. You may be using int or other data types.
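
For reference, the rule behind that message is that an LLVM conditional branch takes a 1-bit integer (`i1`) condition, so a wider integer has to be compared against zero first. The toy Python model below only illustrates that rule. The `Value`, `to_i1`, and `cond_branch` names are hypothetical and are not part of LLVM or HeteroCL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """Toy stand-in for an IR value: a name plus an integer bit width."""
    name: str
    bits: int


def to_i1(cond: Value) -> Value:
    """Return a 1-bit value, comparing wider integers against zero."""
    if cond.bits == 1:
        return cond
    # Models an `icmp ne <cond>, 0`, whose result is always 1 bit wide.
    return Value(name=f"({cond.name} != 0)", bits=1)


def cond_branch(cond: Value) -> str:
    """Mimic the check that fires when LLVM assertions are enabled."""
    assert cond.bits == 1, "May only branch on boolean predicates!"
    return f"br i1 {cond.name}"


flag = Value("assert_cond", bits=32)  # hypothetical 32-bit condition
print(cond_branch(to_i1(flag)))       # passes: condition narrowed to 1 bit
```

Calling `cond_branch(flag)` directly raises an `AssertionError`, which mirrors in miniature what the assertion-enabled build does when the condition is not narrowed first.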

hecmay commented 2 years ago

> I don't understand. What do you mean by "function" when turning off the flag? Is the result still correct? Also, according to the error message, it is saying that branch only works on boolean variables. You may be using int or other data types.

I meant that the LLVM simulation cannot run to completion unless the assertion flag is turned off during installation.

Otherwise the LLVM simulation exits with that error and I cannot get any result.
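
One way to tell this kind of abort apart from an ordinary test failure is to check the child process's exit status. This is a minimal sketch, assuming a POSIX system. `run_and_classify` is a hypothetical helper, and the demo command is only a stand-in for the real regression test.

```python
import signal
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd and report whether it passed, failed, or was killed by SIGABRT."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return "passed"
    # On POSIX, a negative return code means the child died from that signal.
    if proc.returncode == -signal.SIGABRT:
        return "aborted"
    return "failed"


# Stand-in for `python basic_assert_tests.py`: a child that aborts
# the same way a failed C-level assert does.
demo_cmd = [sys.executable, "-c", "import os; os.abort()"]
print(run_and_classify(demo_cmd))
```

Replacing `demo_cmd` with the actual test invocation would let a test runner report "aborted" separately from a normal non-zero exit.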