mewmew / uc

A compiler for the µC language.
58 stars 5 forks source link

irgen: Missing terminator, false positive #70

Closed mewmew closed 8 years ago

mewmew commented 8 years ago

Missing terminator, false positive

testdata/noisy/medium/fac.c

uclang testdata/noisy/medium/fac.c 
panic: unable to finalize current basic block of function body; expected void return since terminator was missing, got i32

github.com/mewmew/uc/irgen.(*Function).endBody(0xc42000ac80, 0xc42000ac80, 0x61cf00)
    /home/u/Desktop/go/src/github.com/mewmew/uc/irgen/build.go:99 +0x32d

testdata/noisy/medium/fac-b.c

uclang testdata/noisy/medium/fac-b.c 
panic: unable to finalize current basic block of function body; expected void return since terminator was missing, got i32

github.com/mewmew/uc/irgen.(*Function).endBody(0xc42000b500, 0xc42000b500, 0x61cf00)
    /home/u/Desktop/go/src/github.com/mewmew/uc/irgen/build.go:99 +0x32d
mewmew commented 8 years ago

The semantic checker verifies that functions always return from each branch, but in the LLVM IR generator, the same validation has not yet been implemented. Re-opening this issue for now. The implementation needs more thought.

The identified issue of returning from if and while bodies has already been fixed, however.

mewmew commented 8 years ago

The semantic checker verifies that functions always return from each branch, but in the LLVM IR generator, the same validation has not yet been implemented. Re-opening this issue for now. The implementation needs more thought.

Clang solves this by allocating a dedicated return value and always returning from the last basic block; e.g.

define i32 @f() {
0:
    %1 = alloca i32                      ; allocate dedicated return value
    %a = alloca i32
    br label %2
2:
    %3 = load i32, i32* %a
    %4 = icmp ne i32 %3, 0
    br i1 %4, label %5, label %6
5:
    store i32 1, i32* %1
    br label %7
6:
    store i32 2, i32* %1
    br label %7
7:
    %8 = load i32, i32* %1
    ret i32 %8                           ; load and return the dedicated return value
}
mewmew commented 8 years ago

Proposed solution.

Keep our current implementation and generate an unreachable instruction for the last basic block, as it is indeed unreachable as ensured by the semantic analysis checker.

What do you think @sangisos? The benefit of this is that the implementation is trivial, the generated LLVM IR is easier to read and maps closer to how the AST looks like. The main disadvantage, which is not to be taken lightly, is that the output would differ from Clang.

mewmew commented 8 years ago

Keep our current implementation and generate an unreachable instruction for the last basic block, as it is indeed unreachable as ensured by the semantic analysis checker.

What do you think @sangisos? The benefit of this is that the implementation is trivial, the generated LLVM IR is easier to read and maps closer to how the AST looks like. The main disadvantage, which is not to be taken lightly, is that the output would differ from Clang.

To evaluate the two approaches, I've started with implementing the easiest one in d38a705, which emits unreachable instructions for unreachable basic blocks. We may decide to implement the more involved approach which mimics how Clang does it and use dedicated return values. The two implementations may then be evaluated against one another in terms of simplicity, consistency, ease of use, and other merits.

mewmew commented 8 years ago

Closing this issue for now, but marking it as a future ambition, so that we may implement the second approach and evaluate them both when time permits.