Support of TBF for software system benchmarks

sosy-lab / tbf

A framework for using test case generators to locate errors in C programs

https://sosy-lab.org/research/test-study

Other

9 stars 7 forks source link

Support of TBF for software system benchmarks #6

Closed animeshbchowdhury closed 5 years ago

animeshbchowdhury commented 6 years ago

Hi,

I have been using greybox fuzzing approach (afl) for program falsification. However, I have been unable to apply it over software system sv-benchmarks. Even I have seen in your experimentation results, all test based approach tools are unable to handle such benchmarks.

My observation on software system benchmarks is, presence of __VERIFIER_nondet_pointer(), even in functions like kzmalloc (which expects memory allocation) and pointer arithmetic operations on high valued address. Using such software system codes as user process for dynamic analysis, is leading to segmentation faults, even on reachsafety codes.

My question is : Any way by which dynamic analysis can be used for software system benchmarks ?

lembergerth commented 6 years ago

Hi!

I don't expect that tbf will be able to handle the Systems-tasks of sv-benchmarks in the immediate future.

There are plans to handle non-deterministic pointers in programs, this is definitely possible, but the bigger problem is undefined behavior in many of these tasks, e.g., undefined structs or variables. I'm confident that we could synthesize code to tackle this issue, but this is currently not my main focus. (There was some talk about this in https://github.com/sosy-lab/tbf/issues/1#issuecomment-405832929)

If you want me to elaborate or if this didn't answer your question, feel free to ask some more!

animeshbchowdhury commented 6 years ago

Hi,

Thanks for the prompt response. I have gone through the comment you mentioned in Issue #1. I have created a synthesized code, for handling such undefined structs and variables (Assigning nondet values to such variables). For undefined functions, I am also returning nondets of return type. Currently, I have changed the nondet_pointer() to a parameterized function, and returns a chunk of dynamically allocated memory of size passed as parameter.

With this modified change, I tried to run tbf. I faced issue of segmentation fault with testcases, which were discovered as crashes by afl. Using gdb, I can see clearly that crash is happening from pointer derefs.

Are the submitted benchmarks in software system complete enough, for proving correctness or falsification ?

lembergerth commented 6 years ago

With this modified change, I tried to run tbf. I faced issue of segmentation fault with testcases, which were discovered as crashes by afl. Using gdb, I can see clearly that crash is happening from pointer derefs.

Was the segmentation fault during execution of test-cases, or a fault of TBF?

I'm not an expert regarding the sv-benchmarks, but the Systems tasks are from real-world systems that were tailored for formal verification. Errors easily sneak in there. We already know that in some of these undefined behavior exists and people continuously try to get rid of it, but I don't think this is finished yet.

If you were able to produce an executable test case for which a segmentation fault occurs, you actually found a proof that the corresponding benchmark task is incomplete. In that case, it would be awesome if you could create an issue in the sv-benchmarks repo with your harness and the input values.

lembergerth commented 6 years ago

Oh, and if you wrote code for handling such undefined variables and structs, it would of course also be awesome if we could integrate this into tbf, so that all users can benefit from it :-)

animeshbchowdhury commented 6 years ago

Hi,

With this modified change, I tried to run tbf. I faced issue of segmentation fault with testcases, which were discovered as crashes by afl. Using gdb, I can see clearly that crash is happening from pointer derefs.

Was the segmentation fault during execution of test-cases, or a fault of TBF?

No, it was not at all a fault of TBF. The executable was created successfully. However, I did some changes as I mentioned, for nondet pointers , undefined variables, structs and functions. I took the harness file and the original code, changed it, and then run the afl. The crash generated by afl is not reaching VERIFIER error and caused a segmentation fault.

If you were able to produce an executable test case for which a segmentation fault occurs, you actually found a proof that the corresponding benchmark task is incomplete. In that case, it would be awesome if you could create an issue in the sv-benchmarks repo with your harness and the input values.

I will surely do this.

animeshbchowdhury commented 6 years ago

Oh, and if you wrote code for handling such undefined variables and structs, it would of course also be awesome if we could integrate this into tbf, so that all users can benefit from it :-)

I did it manually for the time being, to check. I am in process of making this automated.