Handling implicit returns

chsasank / llama.lisp

Lisp dialect designed for HPC and AI

GNU Lesser General Public License v2.1

Handling implicit returns #31

Closed GlowingScrewdriver closed 4 months ago

GlowingScrewdriver commented 5 months ago

One of LLVM's conditions for a well-formed program is that every basic block must be terminated. That means that the last instruction must be a terminating instruction, like a return, branch, or anything that transfers control away from the basic block.
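The termination rule above can be checked mechanically before the code ever reaches LLVM. The following is a minimal sketch, assuming the function body is available as a flat, Bril-style list of dicts, where a label is `{"label": name}` and an instruction is `{"op": ...}`. The helper names `split_blocks` and `unterminated_blocks`, the terminator set, and the sample lowering of an `if` are illustrative assumptions, not llama.lisp's actual API or output.

```python
# Assumed set of terminating ops (Bril's ret/jmp/br); hypothetical for this sketch.
TERMINATORS = {"ret", "jmp", "br"}


def split_blocks(instrs):
    """Split a flat instruction list into basic blocks.

    A label starts a new block; a terminator ends the current one.
    """
    blocks, current = [], []
    for instr in instrs:
        if "label" in instr:
            if current:
                blocks.append(current)
            current = [instr]
        else:
            current.append(instr)
            if instr["op"] in TERMINATORS:
                blocks.append(current)
                current = []
    if current:
        blocks.append(current)
    return blocks


def unterminated_blocks(instrs):
    """Return the blocks whose last item is a label or a non-terminator."""
    return [
        block
        for block in split_blocks(instrs)
        if "label" in block[-1] or block[-1]["op"] not in TERMINATORS
    ]


# Hypothetical lowering of an `if` whose branches both return:
# the trailing "end" label opens a block that contains no instruction.
example = [
    {"op": "br", "args": ["cond"], "labels": ["then", "else"]},
    {"label": "then"},
    {"op": "ret", "args": ["zero"]},
    {"label": "else"},
    {"op": "ret", "args": ["one"]},
    {"label": "end"},
]

print(unterminated_blocks(example))
```

On this input the only block reported is the one opened by the trailing `end` label, which is the kind of empty block LLVM rejects.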

There is one edge case that we need to look into: the end of a function. Currently, neither Brilisp nor C-Lisp handles this.

If a program is written in such a way that there is no terminating instruction at the end of a function, neither Brilisp nor C-Lisp corrects it, and it is eventually Clang/LLVM that reports the error. For example, trying to compile the following program:

(c-lisp
    (define ((fn int))
        (if #t
            (ret 0)
            (ret 1))))

will cause this (rather cryptic) error to be thrown by LLVM:

/tmp/tmp.5DCpGClmyt.ll:18:1: error: expected instruction opcode
}
^
1 error generated.

GlowingScrewdriver commented 5 months ago

I can think of a policy to handle this. Perhaps we can implement this in C-Lisp, and leave Brilisp unchanged:

1. Once a function body is ready, scan it and remove labels that would produce empty basic blocks.
2. Insert a return statement at the end of the function if it returns void.
3. Detect and report the problem if the function doesn't return void.
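The policy above could be sketched roughly as follows. This is only an illustration under stated assumptions: the body is a Bril-style list of dicts (`{"label": name}` for labels, `{"op": ...}` for instructions, with jump targets listed under `"labels"`), and `finalize_body`, `ImplicitReturnError`, and the terminator set are hypothetical names, not part of C-Lisp. The sketch only looks at the end of the function, and it keeps a trailing label if something still jumps to it.

```python
# Assumed set of terminating ops; hypothetical for this sketch.
TERMINATORS = {"ret", "jmp", "br"}


class ImplicitReturnError(Exception):
    """Raised when a non-void function can fall off its end."""


def finalize_body(instrs, returns_void):
    """Apply the three-step policy to the end of a function body."""
    body = list(instrs)
    # Labels that some jump or branch still refers to must be kept.
    targets = {name for instr in body for name in instr.get("labels", [])}

    # Step 1: drop trailing labels that would open an empty basic block.
    while body and "label" in body[-1] and body[-1]["label"] not in targets:
        body.pop()

    # Nothing more to do if the body already ends in a terminator.
    if body and body[-1].get("op") in TERMINATORS:
        return body

    # Step 2: a void function gets an explicit return appended.
    if returns_void:
        body.append({"op": "ret", "args": []})
        return body

    # Step 3: a non-void function with no final return is an error.
    raise ImplicitReturnError("non-void function does not end with a return")


# Hypothetical lowering of an `if` with a return in each branch.
int_fn = [
    {"op": "br", "args": ["cond"], "labels": ["then", "else"]},
    {"label": "then"},
    {"op": "ret", "args": ["zero"]},
    {"label": "else"},
    {"op": "ret", "args": ["one"]},
    {"label": "end"},
]
print(finalize_body(int_fn, returns_void=False))
```

With this approach the error surfaces in the front end, with a message that names the actual problem, instead of as an LLVM parse error on the closing brace.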

chsasank commented 5 months ago

Let's just not allow implicit returns in c-lisp.

GlowingScrewdriver commented 5 months ago

Alright. But we still have to deal with things like:

(define ((main int))
    (if #t
       (ret 1)
       (ret 0)))

If we don't deal with it, we are encouraging code like this:

(define ((main int))
    (if #t
       (ret 1))
    (ret 0))

Either we scan the function body and remove empty labels, or we implement an unreachable instruction like LLVM's.

GlowingScrewdriver commented 5 months ago

LLVM's unreachable instruction marks a location in the code as unreachable, and it is also a terminating instruction. So, if we implemented unreachable, we could expect the programmer to be explicit and write code like this:

(define ((main int))
    (if #t
       (ret 1)
       (ret 0))
    (unreachable))

chsasank commented 5 months ago

Write the equivalent C code, try compiling it, and post the LLVM IR output if it compiles.

chsasank commented 4 months ago

Closed by #49