terralang / terra

Terra is a low-level system programming language that is embedded in and meta-programmed by the Lua programming language.
terralang.org

LLVM optimizations due to undefined behavior are hard to debug (was: @llvm.trap() in generated IR) #205

Open Lupus opened 8 years ago

Lupus commented 8 years ago

Minimal reproducible test case:

local foo = macro(function()
        local struct some_struct {
                ptr: &int8
        }
        some_struct.methods.test_arr = macro(function(self)
                return `@self.ptr == 1
                -- the following line works
                -- return `self.ptr == nil
        end)

        local impl = symbol(&some_struct)
        return quote
                var [impl]
        in
                impl
        end
end)

terra traps()
        var f = foo()
        f:test_arr()
end

traps:compile()
traps:printpretty()
traps:disas()

Result:

../sample.t:19:         terra traps() : {}
../sample.t:20:             var f : &some_struct =
../sample.t:13:                                    let
                                                       var $0 : &some_struct
                                                   in

../sample.t:15:                                        $0
../sample.t:13:                                    end
../sample.t:6:
../sample.t:21:             [int32](@(@f).ptr) ==
../sample.t:6:                                    1
../sample.t:19:         end
definition      {} -> {}

define void @"$traps"() #0 {
entry:
  call void @llvm.trap()
  unreachable
}

assembly for function at address 0x7ff50d32d000
0x7ff50d32d000(+0):             push    rbp
0x7ff50d32d001(+1):             mov     rbp, rsp
0x7ff50d32d004(+4):             ud2

Is this expected behavior? The Terra code in the example above is incorrect, but seeing a trap in the resulting IR was quite surprising, and it took a while to trace it back to the uninitialized symbol. Maybe Terra could catch such cases and report an error? Running terra -v reveals that the trap actually comes from an LLVM optimization that exploits undefined behavior (UB): the unoptimized IR loads through the uninitialized pointer, and the optimized IR collapses to a trap.

define void @"$traps"() #0 {
entry:
  %f = alloca %some_struct*
  %"$0" = alloca %some_struct*
  %0 = load %some_struct*, %some_struct** %"$0"
  store %some_struct* %0, %some_struct** %f
  %1 = load %some_struct*, %some_struct** %f
  %2 = getelementptr %some_struct, %some_struct* %1, i32 0, i32 0
  %3 = load i8*, i8** %2
  %4 = load i8, i8* %3
  %5 = sext i8 %4 to i32
  %6 = icmp eq i32 %5, 1
  %7 = zext i1 %6 to i8
  ret void
}

optimizing scc containing: $traps
optimizing $traps

define void @"$traps"() #0 {
entry:
  call void @llvm.trap()
  unreachable
}

zdevito commented 8 years ago

If I understand the example correctly, you want a warning because the variable of type &some_struct has not been initialized before you dereference it? Terra doesn't do this kind of semantic check right now. It could be added, but it requires another complete pass in the compiler, so it hasn't been implemented yet.
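For comparison, here is a minimal sketch of the same test case with the pointer initialized before it is dereferenced. It assumes the struct and its method macro are hoisted to top level; the names no_trap, value, and s are illustrative and do not come from the original report.

```terra
-- Sketch only: same struct and method macro as in the report, hoisted to top level.
local struct some_struct {
    ptr : &int8
}

some_struct.methods.test_arr = macro(function(self)
    -- self is a quote of type &some_struct; selecting .ptr dereferences it
    return `@self.ptr == 1
end)

terra no_trap() : bool
    var value : int8 = 1       -- backing storage for the int8 pointer (illustrative)
    var s : some_struct        -- the struct itself lives on the stack
    s.ptr = &value             -- initialize the field before anything reads it
    var f : &some_struct = &s  -- f points at valid, initialized memory
    return f:test_arr()        -- every load now reads initialized memory
end

print(no_trap())
```

Since no load reads an uninitialized value here, LLVM should have no undefined behavior to exploit, so the function body should not be replaced by a call to llvm.trap.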

Lupus commented 8 years ago

When I wrote this issue I was not entirely sure what I wanted, so it was more of a topic for discussion. Static analysis at the Terra level would be great, though.

I have just hit another case of UB. While debugging one of my test cases I added a free() call at the end; later I forgot about it and kept writing code below it that used the same pointer, which had already been passed to free(). LLVM saw an optimization opportunity and dropped half of my test function because of the UB. Right now there is no way to catch this, and it can lead to obscure bugs.
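One way to make this particular mistake harder to write is to tie the free() to the end of the scope instead of placing it in the middle of the function. The sketch below assumes a Terra version that provides the defer statement; the function name compute and the values in it are hypothetical.

```terra
local C = terralib.includec("stdlib.h")

terra compute() : int32
    var p = [&int32](C.malloc(sizeof(int32)))
    if p == nil then return -1 end  -- allocation failed, nothing to free
    defer C.free(p)                 -- runs when the enclosing scope exits
    @p = 41
    var result = @p + 1             -- last use of p, copied into a local
    -- code added here later can still use p safely
    return result
end

print(compute())
```

Because the deferred call runs at scope exit, code appended to the function later still sees a live pointer. The result is copied into a local first so that the returned value does not depend on when the deferred free() runs relative to the return expression.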

What could help here (and is probably easy to do) is a Terra flag to disable optimizations (some or all; think -O0, -O1, etc.). That way LLVM would not exploit the UB, the program would simply crash, and Valgrind or ASan could be used to diagnose the problem effectively.

What do you think @zdevito?

Lupus commented 8 years ago

This is getting funnier. I forgot the in clause of the return quote in my macro and used that macro as an argument to S.printf(). No code in that function from that line onward makes its way into the LLVM IR now...

Lupus commented 8 years ago

Sorry, the last comment was caused by #208; it has nothing to do with UB.