Open hrshy0629 opened 2 years ago
Slicing seems to be interprocedure. Can you set a parameter to control whether the output is the slice bitcode of the target function or the slice set bitcode of the interprocedure function?
There is the option -entry
to set the entry point of slicing (but it is still interprocedural). Most of the analyses should be able to work also intraproceduraly, but there is no switch to turn this on at this moment.
Can you select more accurate variable location information? LLVM debug information contains row information and column information. Can you select variables by determining row and column information?
You can select based on row, but not column at this moment. But it should be an easy hack at this place: https://github.com/mchalupa/dg/blob/master/tools/llvm-slicer-crit.cpp#L213 If you do not want to hack the code, you can modify the bitcode and insert a call of a dummy function at the instruction that you want to use as the slicing criterion.
@mchalupa Thanks for your comment. In response to my first question, I would like to modify the code to get specify the coded slice of a single function. Interprocedural information is useful for debugging, but I just want to cut out my own function information. Could you tell me the general file information in DG? For example, which files involve function slicing?
If you need to only see the result, you can try -annotate slice
with llvm-slicer
. It will produce a commented .ll file. Also, llvm-dg-dump
has the switch -func
that makes it to dump only a given function.
Slicing of LLVM DG is in this file: https://github.com/mchalupa/dg/blob/master/include/dg/llvm/LLVMSlicer.h but if you want to only dump the information, you probably want to just take the built dependence graph and dump some information about it. For that, you can find inspiration here: https://github.com/mchalupa/dg/blob/master/tools/llvm-dg-dump.cpp
In particular, you will want to get your function from this map: https://github.com/mchalupa/dg/blob/master/include/dg/llvm/LLVMDependenceGraph.h#L205 and then just iterate over the dependence graph and dump information about it.
@mchalupa I am not sure it is a bug. When I use llvm-slicer -sc LINE#err -entry FUNCNAME .bc, I find that it is good when the LINE is [132, 148], but it reports "Segmentation fault (core dumped)" when the LINE is [143, 145], which means that we do not choose the VAR that is in if/call_statement? Are there any other restrictions when using DG? `static int jread(struct buffer_head bhp, journal_t journal, unsigned int offset) { 132# int err; ... 143# err = jbd2_journal_bmap(journal, offset, &blocknr);
145# if (err) { printk(KERN_ERR "JBD2: bad block at offset %u\n", offset); 148# return err; }`
@mchalupa Hi, I find a bug. When the function instMatchesCrit(tools/llvm-slicer-crit.cpp) is matching instruction, it ignore the PHI instruction. The line in debug info of PHI is !6084 = !DILocation(line: 0, scope: !6073)
, you will get the line which is 0, which is error. The more reliable method is as follows:
`
cleanup:
%retval.0 = phi i32 [ -117, %if.then ], [ %call1, %if.then2 ], [ 0, %if.end22 ], [ -5, %if.then20 ], [ -12, %if.end4.cleanup_crit_edge ], !dbg !6084
!6084 = !DILocation(**line: 0**, scope: !6073)
%if.then:
br label %cleanup, !dbg !6094
!6094 = !DILocation(line: 140, column: 3, scope: !6093)
%if.then2:
br label %cleanup, !dbg !6102
!6102 = !DILocation(line: 148, column: 3, scope: !6101)
%if.end22
br label %cleanup, !dbg !6144
!6144 = !DILocation(line: 171, column: 2, scope: !6073)
%if.then20
br label %cleanup, !dbg !6142
!6142 = !DILocation(line: 167, column: 3, scope: !6140)
if.end4.cleanup_crit_edge
br label %cleanup, !dbg !6109
!6109 = !DILocation(line: 152, column: 6, scope: !6073)
`
And the true line is as follows: `
` I think we can fix this bug.
Thanks for looking into that. If you know how to fix this, can you prepare a pull request?
I need help! I debug the function instMatchesCrit. But I do not know why the function returns true when obj.empty() is true? If the obj is not NONE and we pass the line check, should we return true?
If obj.empty()
is true, then obj is NONE. The rest of the code assumes, that there is some object we want to match and if there is no object that we want to specifically match, the instruction satisfies the criterion (it matches line and function). Or I do not understand the problem?
The overview of DG is: 1)match the input objs to Inst, but dg required the Insts of the objs can read or write the memory; 2) the class of LLVMSlicer initializes the different graphs based on the input bc file; 3) DG queries the Insts from the different graphs and output the results of slices. But the Insts which can not read or write the memory will be discarded, which will result in inaccurate results. Such as the example. The Inst br label %cleanup, !dbg !6102 has little to do with anything else, but it was part of our requirements.
%if.then2:
br label %cleanup, !dbg !6102
!6102 = !DILocation(line: 148, column: 3, scope: !6101)
Do I understand this correctly?
If an instruction is sliced away, it means none of slicing criterions depend on it, but it doesn't matter if it accesses memory or not. It is true, though, that matching slicing criteria is restricted to instructions that access memory or to function calls. But you can hack this part rather easily, I would say. Also, there is a switch 'slicing-criteria-are-next-inst' that makes the slicing criteria be the successors of the matched instructions. So what you can do is to insert calls of a dummy function before any instruction you want and use this switch to mark it as a slicing criterion.
Hi, I would like to ask whether the fault tolerance rate of DG is relatively small. I have tried many cases and encountered the problem of segment error.
` ../../llvm-slicer --sc mlxreg_fan_set_cur_state#379#err --cutoff-diverging=1 --cda ntscd --forward=1 --entry mlxreg_fan_set_cur_state 000cc5bc49aafc83a131bab2becf9922f5eec658_drivers@hwmon@mlxreg-fan.o.bc SC: Matched 'mlxreg_fan_set_cur_state#379#err' to: call void @asan_load8_noabort(i64 %34), !dbg !4009 %35 = load %struct.regmap*, %struct.regmap** %33, align 8, !dbg !4009 call void @asan_load4_noabort(i64 %37), !dbg !4010 %38 = load i32, i32 %36, align 4, !dbg !4010 %call95 = call i32 @regmap_write(%struct.regmap %35, i32 %38, i32 %div93.zext) #12, !dbg !4013 [llvm-slicer] cutoff 5 diverging blocks and 193 completely removed WARNING: matched due to a lack of information: %33 = load %struct.regmap*, %struct.regmap* %31, align 8, !dbg !4005 WARNING: matched due to a lack of information: %36 = load i32, i32 %34, align 4, !dbg !4006 WARNING: matched due to a lack of information: %call95 = call i32 @regmap_write(%struct.regmap %33, i32 %36, i32 %div93.zext) #13, !dbg !4009 SC: Matched 'mlxreg_fan_set_cur_state#379#err' to: %33 = load %struct.regmap, %struct.regmap* %31, align 8, !dbg !4005 %36 = load i32, i32 %34, align 4, !dbg !4006 %call95 = call i32 @regmap_write(%struct.regmap* %33, i32 %36, i32 %div93.zext) #13, !dbg !4009
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Segmentation fault (core dumped)
`
Is there any way to avoid or solve this problem?
Hi, I would like to ask whether the fault tolerance rate of DG is relatively small.
What do you mean by fault tolerance?
Is there any way to avoid or solve this problem?
What happens if you do not use cutoff-diverging
? This option is meant for backward slicing (it is not explicitly written in the help message, but its nature kind of implies it) and I see you use forward=1
.
I read and implemented DG, but I found that there were still several issues that failed to reach the target. 1) Slicing seems to be interprocedure. Can you set a parameter to control whether the output is the slice bitcode of the target function or the slice set bitcode of the interprocedure function? 2) Can you select more accurate variable location information? LLVM debug information contains row information and column information. Can you select variables by determining row and column information? Thank you for any suggestions.