prosyslab-classroom / is593-language-based-security

[Homework5] segmentation fault causes example1.ll to become empty #34

Closed KunJeong closed 4 years ago

KunJeong commented 4 years ago

Hi. After my first attempt at the implementation, I tested my code with ../debloater ./example1.sh example1.ll, and it resulted in a segmentation fault, probably because of some mistake in my code. On the next call of ../debloater ./example1.sh example1.ll, however, it printed "The test script initially failed", and on checking I found that example1.ll had only its first two lines left. Since example1.ll still exists (just with the wrong contents) and example1.c is unchanged, make does not regenerate example1.ll, so I have to delete it and rerun make.

My guess is that the segmentation fault is the cause, but it's very tedious to run cd .. && make && cd test && rm -rf example1.ll && make && ../debloater ./example1.sh example1.ll every single time in order to debug the code. Is there any way to fix this behavior? Thank you.

hyunsooda commented 4 years ago

I don't know whether there is a smarter workaround, but I used the following commands.

Before running them, you should first change the *.sh files:
[from] : clang -o ${NAME} ${NAME}.ll &>/dev/null || exit 1
[to] : clang -o ${NAME} test/${NAME}.ll &>/dev/null || exit 1

After all the unit tests pass, revert the *.sh files and then run make test.

h2oche commented 4 years ago

I experienced the same issue while doing the homework, and after spending some hours on it I think I found the cause. When there is a call to an undefined function, Llvm.print_module crashes with a segmentation fault. For example, if the original *.ll file looks like this:

define i32 @f1() #0 {
entry:
  ret i32 0
}

define i32 @main() #0 {
entry:
  %call = call i32 @f1()
  ret i32 %call
}

and the definition of f1 is deleted,

define i32 @main() #0 {
entry:
  %call = call i32 @f1()
  ret i32 %call
}

it results in a segmentation fault in Llvm.print_module.

I don't know whether handling this is also part of the homework. However, it would be much more helpful to us if the professor or TA upgraded the check function in debloat.ml to handle this issue. Or is there any way to handle a segmentation fault in OCaml? I tried a try ... with approach, but it doesn't seem to work...

KihongHeo commented 4 years ago

@h2oche Hi. How did you delete function f1 in your example? With the provided API, I cannot reproduce your error. It works fine with my implementation. It would be nice if you could push your code to your repo so that I can check it too.

The segmentation fault happens in LLVM's C++ code, so it is not an OCaml exception and OCaml cannot catch it.
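
Since the crash cannot be caught inside OCaml, one option is to detect it from the calling shell instead. The following is only a minimal sketch, assuming a shell such as bash, dash, or zsh, where a process killed by signal N is reported as exit status 128 + N. The helper name run_and_classify is hypothetical and not part of the homework code.

```shell
# run_and_classify CMD...: run CMD and print how it ended.
# Assumption: the shell reports "killed by signal N" as exit status 128 + N.
run_and_classify() {
    rc=0
    "$@" || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "ok"
    elif [ "$rc" -gt 128 ]; then
        # SIGSEGV is signal 11 on Linux and macOS, so a seg fault shows up as 139.
        echo "killed by signal $((rc - 128))"
    else
        echo "exit $rc"
    fi
}
```

For example, run_and_classify ../debloater ./example1.sh example1.ll would distinguish an ordinary failing exit code from a crash inside LLVM.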

KihongHeo commented 4 years ago

@KunJeong Is there an example1.origin.ll? If so, you can copy that file to example1.ll. Also, can you push your code? Let me check why this happens.
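
Restoring the input from that copy could look roughly like the sketch below. It assumes the pristine copy sits next to the input as NAME.origin.ll, as suggested above; restore_input is a hypothetical helper name.

```shell
# restore_input NAME: overwrite NAME.ll with the pristine NAME.origin.ll.
# Assumption: the backup lives in the current directory under that name.
restore_input() {
    name=$1
    if [ -f "$name.origin.ll" ]; then
        cp "$name.origin.ll" "$name.ll"
    else
        echo "no backup $name.origin.ll found" >&2
        return 1
    fi
}
```

Running restore_input example1 in the test directory before each debloater call avoids the rm and make round trip after a crash.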

KunJeong commented 4 years ago

I just pushed the code. For me, the segmentation fault occurs around 20% of the time, even though the executable is the same (I didn't run make), which is very weird.

KihongHeo commented 4 years ago

@KunJeong Good. Let me check. How can I reproduce it? Is there a specific command? When I tried your code:

$ ../debloater ./example1.sh example1.ll
Iteration 1, # Instrs = 10
-- running dd_global
-- running dd_block
-- running dd_instr
-- tree_reduction
Iteration 2, # Instrs = 0
-- running dd_global
-- running dd_block
-- running dd_instr
-- tree_reduction

It terminates successfully without a seg fault (even though the result is incorrect for now).

KunJeong commented 4 years ago

Yes, it does that 80% of the time and crashes 20% of the time. Repeatedly running rm example1.ll && make && ../debloater ./example1.sh example1.ll in the test directory should reproduce the error. I get a fault within about 5 runs.
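
To put a number on an intermittent crash like this, the command can be run in a loop and the signal deaths counted. This is only a sketch: count_crashes is a hypothetical helper, and it assumes the shell reports a signal death as an exit status above 128.

```shell
# count_crashes N CMD...: run CMD N times and print how many runs
# ended with an exit status above 128 (i.e. were killed by a signal).
count_crashes() {
    n=$1
    shift
    crashes=0
    i=0
    while [ "$i" -lt "$n" ]; do
        rc=0
        "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -gt 128 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes"
}
```

For instance, count_crashes 20 sh -c 'rm example1.ll && make && ../debloater ./example1.sh example1.ll' would print the number of crashing runs out of 20.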

KihongHeo commented 4 years ago

@KunJeong I cannot reproduce the seg fault. It works fine on my machine.

I tried your command more than 20 times with no crashes... so far.

KunJeong commented 4 years ago

@KihongHeo Hmm, that is unusual. Actually, I didn't experience a seg fault in that version either; it started happening after I added some debug code. Now I've reverted to that version with git stash, but the crash is still there. Is there anything else that I can reset? Should I try re-cloning?

KihongHeo commented 4 years ago

Yes. That makes sense, because some LLVM functions do not accept invalid LLVM code as input. If you can push the version of your code with the debug code (or email it to me), I can help. I can then also announce some guidelines to the other students on how to avoid such problems.

KunJeong commented 4 years ago

@KihongHeo Update: I re-cloned into another folder, ran make in both directories and ran ../debloater ./example1.sh example1.ll and I got the seg fault on the third try. I am very confused... About the debug code: I deleted the version that first caused the error and changed my code to find the seg fault, so I may not be able to reproduce it.

KihongHeo commented 4 years ago

By re-clone you mean the master head of your repo? commit: 030561140ae39bfbcee90242368db1afc7284f69 ?

KunJeong commented 4 years ago

Yes. commit 030561140ae39bfbcee90242368db1afc7284f69. Could there be any machine settings causing the issue?

KihongHeo commented 4 years ago

It is very weird. LLVM should not be non-deterministic. What is the output of the following command?

llvm-config --version

Can you add print statements here and there to localize exactly where the seg fault happens in your code? Also, it would be nice if you could send me the output .ll file at the crash point.

KihongHeo commented 4 years ago

Are you running multiple debloater processes in parallel? As you might have noticed, the check function repeatedly writes each candidate program to the same file (e.g., example1.ll), so if you launch multiple debloater processes on the same input, there may be a race condition.
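
One way to rule out overlapping runs on the same file is to guard the command with a lock. The sketch below uses mkdir as the lock, since creating a directory fails if it already exists and flock is not available by default on macOS. The helper name with_lock and the lock path are hypothetical.

```shell
# with_lock LOCKDIR CMD...: run CMD only if LOCKDIR can be created,
# and remove LOCKDIR afterwards. mkdir fails if the directory exists,
# so a second concurrent run is refused instead of racing on the file.
with_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another run holds $lockdir" >&2
        return 1
    fi
    rc=0
    "$@" || rc=$?
    rmdir "$lockdir"
    return "$rc"
}
```

A call such as with_lock /tmp/example1.lock ../debloater ./example1.sh example1.ll would then refuse to start while another guarded run is still active. Note that if the shell itself is killed mid-run, the lock directory stays behind and has to be removed by hand.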

KunJeong commented 4 years ago

1. llvm-config --version : 9.0.1
2. If by output you mean debloat.ll, here is the result which has everything deleted since that's what my code does currently.

    ; ModuleID = 'example1.ll'
    source_filename = "example1.c"
    target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-apple-macosx10.15.0"

    !llvm.module.flags = !{!0, !1}
    !llvm.ident = !{!2}

    !0 = !{i32 1, !"wchar_size", i32 4}
    !1 = !{i32 7, !"PIC Level", i32 2}
    !2 = !{!"clang version 9.0.1 "}

3. I'm working on localizing. I guess I could just find the source and fix my code if the issue @h2oche mentioned:

    > When there is undefined function call, Llvm.print_module emits segmentation fault.

    is not the case. But if it is the case I'm not sure I can completely fix it.
4. As far as I know, I'm not. I wait until the `debloater` terminates (the terminal shows cursor) and run the command again. However, I'm not familiar with the details of zsh, so I might be wrong.

KihongHeo commented 4 years ago

*.debloat.ll is generated only when the debloater terminates successfully. See lines 73 and 74 in main.ml.

I meant example1.ll right after the segmentation fault.

KunJeong commented 4 years ago

I've pushed a version of my code with a lot more print statements and some fixes, which still exhibits the same behavior (a seg fault after 3~5 runs). Here is the example1.ll file after the crash. It is a little different, as the seg fault occurs after the first reduction is applied.

; ModuleID = 'example1.ll'
source_filename = "example1.c"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.15.0"

; Function Attrs: noinline nounwind ssp uwtable
define i32 @f1() #0 {
entry:
  ret i32 0
}

; Function Attrs: noinline nounwind ssp uwtable
define i32 @f2() #0 {
entry:
  ret i32 1
}

; Function Attrs: noinline nounwind ssp uwtable
define i32 @f3() #0 {
entry:
  ret i32 1
}

; Function Attrs: noinline nounwind ssp uwtable
define i32 @f4() #0 {
entry:
  ret i32 1
}

attributes #0 = { noinline nounwind ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{!"clang version 9.0.1 "}

KihongHeo commented 4 years ago

The result looks ok. Still, I cannot reproduce your error after many trials with the new version of your code. If you find more clues, let me know. I can help more.

h2oche commented 4 years ago

@KihongHeo

I pushed a commit with the code I described in the discussion above. I intentionally omitted some code in main.ml to show my situation more clearly. Could you check it once more? The situation I described can be reproduced on my laptop by following the usual instructions. Thanks!

KihongHeo commented 4 years ago

@h2oche I still cannot reproduce your error. Your code works well on my machine.

@h2oche @KunJeong Which OS do you folks use?

KunJeong commented 4 years ago

@KihongHeo I've pushed the current version, which (thankfully) faults deterministically. If you run example2 and see the output, deleting the %tobool1 instruction results in a <badref> in the next branch.

Summary of what happens:

%tobool1 = icmp ne i32 0, 0
br i1 %tobool1, label %if.then2, label %if.else3
br label %if.end4
br label %if.end4
%a.0 = phi i32 [ 10, %if.then2 ], [ 0, %if.else3 ]
ret i32 %a.0

After deleting the 1st element:

br i1 <badref>, label %if.then2, label %if.else3
br label %if.end4
br label %if.end4
%a.0 = phi i32 [ 10, %if.then2 ], [ 0, %if.else3 ]
ret i32 %a.0

This crashes the check function. Sorry for repeatedly updating the comment. I'm still actively debugging. I use macOS because I couldn't figure out how to use Docker, sorry.

KihongHeo commented 4 years ago

@KunJeong Your code runs successfully with example2.ll on my machine. The output is the same as example2.expected.ll.

One thing I want to mention is that if you remove all functions (leaving an empty .ll file), it may crash. So make sure that you don't delete all functions.

I will try on macOS soon.

KihongHeo commented 4 years ago

In the meantime, can you try your code on the provided virtual machine?

KihongHeo commented 4 years ago

@KunJeong @h2oche I could reproduce your seg faults on my Mac. It is weird that the two platforms behave differently.

Can you work on your homework in the Docker image or the Ubuntu VM (KAIST Cloud) that the TAs provided earlier? On Linux, we have tested all evaluation test sets on 3 different machines (mine, the TA's, and Travis), and everything works well.

KunJeong commented 4 years ago

Thanks for checking. It works well on the provided Ubuntu VM. I guess the seg fault crashes the check function on macOS but not on Ubuntu.

h2oche commented 4 years ago

@KihongHeo Thanks for checking. I'll try docker. Thanks!