mchalupa / dg

[LLVM Static Slicer] Various program analyses, construction of dependence graphs and program slicing of LLVM bitcode.
MIT License
474 stars 131 forks source link

It there any way to preserve instructions depends on or depended by sliced criterions? #456

Open XChy opened 6 months ago

XChy commented 6 months ago

@mchalupa, hi!. I'm working on reducing two semantically equivalent IRs, which share similar syntatical structure. Now I want to slice based those different instructions, and cut off anything irrelevant to them. For example:

int foo(int a, int* b) {
  int ret = other_func(a);
  *b = 666;
  ret += 1;
  return ret;
}

I slice it based on other_func(ret1), and want store of b to be sliced:

int foo(int a, int* b) {
  int ret = other_func(a);
  ret += 1;
  return ret;
}

But actually we only get a single other_func(a):

int foo(int a, int* b) {
  other_func(a);
  return undef;
}

So it's there any way to realize it ?

Apart from this, it's there any easy way to hack dg so that any type of instructions (not only call-sites/ret) can be the slicing criterions?

mchalupa commented 6 months ago

Hi,

I slice it based on other_func(ret1), and want store of b to be sliced:

I'm not sure I understand. other_func(ret1) is not part of the code, what do you mean exactly by this slicing criterion? If you slice the code w.r.t calls to other_func then the slice you posted looks correct.

Apart from this, it's there any easy way to hack dg so that any type of instructions (not only call-sites/ret) can be the slicing criterions?

Any instruction can be a slicing criterion: https://github.com/mchalupa/dg/blob/master/doc/llvm-slicer.md#slicing-criteria-the--sc-option

mchalupa commented 6 months ago

I slice it based on other_func(ret1), and want store of b to be sliced

Oh, you want forward slicing?

XChy commented 6 months ago

I may not clarify it well, what I really want is to control the "strength" of forward/backward slicing. Take a llvm-ir as example (because I work only on llvm-ir, instead of C now)

declare i32 @dummy(i32)
define i32 @foo(i32 %a, ptr %b) {
entry:
  %c = sub i32 0, %a 
  %call = call i32 @dummy(i32 %c)
  store i32 666, ptr %b, align 4
  %add = add i32 %call, 1
  ret i32 %add
}

Firstly, we take dummy as the slicing criterion, and then the store of b doesn't influence dummy and isn't influenced by dummy. So store b should be eliminated unconditionally.

%add depends on dummy, so it should not be eliminated unconditionally for me. But without -forward option, dg produces:

define i32 @foo(i32 %a, ptr %b) {
entry:
  %c = sub i32 0, %a
  %call = call i32 @dummy(i32 %c)
  br label %safe_return

safe_return:                                      ; preds = %entry
  ret i32 undef
}

And thanks for your reminder, with -forward as you say, it produces what I want now:

define i32 @foo(i32 %a, ptr %b) {
entry:
  %c = sub i32 0, %a
  %call = call i32 @dummy(i32 %c)
  %add = add i32 %call, 1
  ret i32 %add
}

Further, I wonder if it's possible to control the "strength" or "degree" of both forward and backward slicing. Sometimes I want %c to be replaced with an arbitrary argument, and sometimes I want %add to be eliminated or even ret i32 %add to be replaced with ret i32 %call. In this way we can control the complexity and specificity of a certain sliced IR. If not, could you provide an appropriate place to hack it?

XChy commented 6 months ago

Any instruction can be a slicing criterion: https://github.com/mchalupa/dg/blob/master/doc/llvm-slicer.md#slicing-criteria-the--sc-option

It seems to require emitting debug instructions, but now I only have IR. I prefer to taking the name of Value/Instruction as a type of slicing criterion. If there is not such option, where can I hack?

mchalupa commented 6 months ago

Further, I wonder if it's possible to control the "strength" or "degree" of both forward and backward slicing. Sometimes I want %c to be replaced with an arbitrary argument, and sometimes I want %add to be eliminated or even ret i32 %add to be replaced with ret i32 %call. In this way we can control the complexity and specificity of a certain sliced IR. If not, could you provide an appropriate place to hack it?

What you describe is not slicing, so no, the slicer does not support this. I would suggest you just create a pass/new tool that uses DG to do this. I would start with llvm-slicer, after this point in the code: https://github.com/mchalupa/dg/blob/master/tools/llvm-slicer.cpp#L267C22-L267C22, you can go through the dependence graph and unmark nodes that should be in the slice but you want to preserve them. In this step you can also remember instructions that you want to replace and after slicing itself (https://github.com/mchalupa/dg/blob/master/tools/llvm-slicer.cpp#L285), you can replace them with different instructions.

It seems to require emitting debug instructions, but now I only have IR. I prefer to taking the name of Value/Instruction as a type of slicing criterion. If there is not such option, where can I hack?

Instructions in LLVM do not have names by default, so this is not reliable and not implemented, but you can hack it here: https://github.com/mchalupa/dg/blob/master/tools/llvm-slicer-crit.cpp#L1113

XChy commented 6 months ago

Thanks for your explanation and assistance. That's really helpful to me, your guide on where to hack it is highly appreciated!