sosy-lab / sv-benchmarks

Collection of Verification Tasks (MOVED, please follow the link)
https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks
184 stars 169 forks source link

Undefined behavior in tasks due to division by zero #504

Open lembergerth opened 6 years ago

lembergerth commented 6 years ago

The ESBMC team (@mikhailramalho) was able to create the attached list of tasks that may contain a division by zero.

To my knowledge, the following four tasks are already confirmed to contain a division by zero: loop-industry-pattern/ofuf_1_true-unreach-call.c loop-industry-pattern/ofuf_2_true-unreach-call.c loop-industry-pattern/ofuf_3_true-unreach-call.c loop-industry-pattern/ofuf_4_true-unreach-call.c

The other tasks still have to be confirmed.

@dbeyer proposed to solve these issues by inserting before a line x/y a call avoid_zero(y) that is implemented by body {if (!y) {abort();}} .

Since there is only little time left to fix these tasks for SVCOMP'18, this Issue aims to coordinate the work on these tasks.

If you are willing to fix any of the tasks listed in the attached list (the proposed solution can also be used if division by zero is not confirmed), please comment on this issue with the list of tasks that you will try to fix. Taking a block of tasks instead of cherry picking would be the easiest to coordinate.

List of tasks that must be fixed: log.txt

danieldietsch commented 6 years ago

I will try to confirm the issues for all items with Ultimate Automizer and then see about fixing them.

mutilin commented 6 years ago

Could somebody point the exact places where division by zero occurs and the corresponding zero assignment?

mikhailramalho commented 6 years ago

Hi all,

Thank you for opening the issue.

I should've send the counterexamples for all tasks in my previous emails, sorry about that.

The lines are:

Violated property: file ofuf_1_true-unreach-call.c line 283 function Id_MCDC_110 division by zero Id_MCDC_136 != 0

Violated property: file ofuf_2_true-unreach-call.c line 258 function Id_MCDC_92 division by zero Id_MCDC_118 != 0

Violated property: file ofuf_3_true-unreach-call.c line 258 function Id_MCDC_92 division by zero Id_MCDC_118 != 0

Violated property: file ofuf_4_true-unreach-call.c line 258 function Id_MCDC_92 division by zero Id_MCDC_118 != 0

2017-11-15 9:57 GMT+00:00 mutilin notifications@github.com:

Could somebody point the exact places where division by zero occurs and the corresponding zero assignment?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sosy-lab/sv-benchmarks/issues/504#issuecomment-344543011, or mute the thread https://github.com/notifications/unsubscribe-auth/ABREnzRMKtgKJBSuQG2pfjV1Bc2jgTZWks5s2rWVgaJpZM4QejOW .

--

Mikhail Ramalho.

mutilin commented 6 years ago

@mikhailramalho What about ldv-* benchmarks which appeared in the log.txt?

mikhailramalho commented 6 years ago

Hi,

I just created a folder in Dropbox with all the counterexamples generated by our tool.

https://www.dropbox.com/sh/q202xpwogyxafeb/AABqeLwXzE1xUrEtbxtq1H8sa?dl=0

It contains only the first and last 32kb of everything esbmc prints, but it's enough to show all the counterexamples. The last bit of each file will show the line number, the expression that triggered the violation and the bound.

Let me know if you need more information. I'm currently trying to validate our findings in the reck* benchmarks with CBMC.

2017-11-15 12:18 GMT+00:00 mutilin notifications@github.com:

@mikhailramalho https://github.com/mikhailramalho What about ldv-* benchmarks which appeared in the log.txt?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sosy-lab/sv-benchmarks/issues/504#issuecomment-344575698, or mute the thread https://github.com/notifications/unsubscribe-auth/ABREn-9lSwwe3iEqZc3kemRuzNengZeqks5s2taEgaJpZM4QejOW .

--

Mikhail Ramalho.

mutilin commented 6 years ago

@dbeyer proposed to solve these issues by inserting before a line x/y a call avoid_zero(y) that is implemented by body {if (!y) {abort();}} .

I like the solution. It is consistent with intention of the 'ldv-*' benchmarks which are intended for reach safety, all other violations should be ignored

lucasccordeiro commented 6 years ago

I also agree with this solution. Do we have already a PR that implements this solution?

danieldietsch commented 6 years ago

Our tool does not help with a lot of the examples from the list (either resource problems or unsupported C features).

We can confirm a division by zero for

Our tool says that

I can have a look at the files containing a div-by-zero above and try to provide a fix based on Dirk's avoid_zero(y) function.

mutilin commented 6 years ago

Do we have already a PR that implements this solution?

I have not seen it, please provide, I can have a look at fixes

lucasccordeiro commented 6 years ago

Ok, let's break this task into categories.

@danieldietsch: May I ask you to take care about the ldv-* benchmarks? @leostrakosch: May I ask you to take care about the loop-* benchmarks?

I'll make a PR related to the seq-mthreaded benchmarks.

danieldietsch commented 6 years ago

I can check the benchmarks our tool supports (see above), but I think there are way more ldv-* benchmarks (85 of the 120).

mutilin commented 6 years ago

I just noticed that abort() function is not described in SV-COMP rules, so I think it is better to use __VERIFIER_assume(y!=0)

danieldietsch commented 6 years ago

I second that; we already had a lot of discussions about the difference between if(y==0) abort() and __VERIFIER_assume(y!=0)

tautschnig commented 6 years ago

I'd say the key difference is whether you want the benchmark to be meaningful for termination analysis?!

PhilippWendler commented 6 years ago

Hm. If __VERIFIER_assume is a problem for termination, we should probably redefine the semantics of __VERIFIER_assume or adjust the definition of the termination property. Otherwise we would end up removing __VERIFIER_assume completely from all benchmarks, which doesn't make sense.

danieldietsch commented 6 years ago

I think abort() would be a problem; __VERIFIER_assume() is fine.

tautschnig commented 6 years ago

@danieldietsch Would you mind elaborating? To what extent is abort() a problem, and how is __VERIFIER_assume() fine with regard to termination?

@PhilippWendler I'm not sure I agree with the removal of __VERIFIER_assume being as bad: the more we enshrine in SV-COMP rules, the harder participation in the competition becomes (and we end up building competition candidates rather than practically useful tools). abort is a C library function, its documentation is taken care of by others -- and verification tools anyway need to support it.

danieldietsch commented 6 years ago

@tautschnig Mainly because we currently do not support abort() but VERIFIER_assume(), we do not have to think about it just yet. We model __VERIFIER_assume() as assume, i.e., if the assume is violated, the program execution does not exist. An abort() has a terminating program execution. I am not yet clear as to whether there is actually a difference.

tautschnig commented 6 years ago

In my opinion, the sole difference lies in termination. Quoting the rules:

__VERIFIER_assume(expression): A verification tool can assume that a function call __VERIFIER_assume(expression) has the following meaning: If 'expression' is evaluated to '0', then the function loops forever, otherwise the function returns (no side effects). The verification tool can assume the following implementation: void __VERIFIER_assume(int expression) { if (!expression) { LOOP: goto LOOP; }; return; }

The C standard defines abort (7.22.4.1 in C11):

The abort function causes abnormal program termination to occur, unless the signal SIGABRT is being caught and the signal handler does not return. Whether open streams with unwritten buffered data are flushed, open streams are closed, or temporary files are removed is implementation-defined. An implementation-defined form of the status unsuccessful termination is returned to the host environment by means of the function call raise(SIGABRT).

mdangl commented 6 years ago

I would guess that the definition of __VERIFIER_assume from the competition is a relict of times where termination was of no concern to the participants, and in my opinion, it should actually be replaced with a definition using abort(). Do submitters of verification tasks who use "__VERIFIER_assume" to restrict variable values actually really say "yes, I want to add nontermination to my task!"? I cannot imagine it.

danieldietsch commented 6 years ago

@mdangl @tautschnig I totally agree with Matthias and I think we labeled all the tasks according to the understanding that an assume does not cause non-termination. I am not sure if there are benchmarks that rely on either one of the assume interpretations.

lucasccordeiro commented 6 years ago

I started adding assume statements to avoid undefined behavior in the seq-mthreaded benchmarks: PR https://github.com/sosy-lab/sv-benchmarks/pull/517

@danieldietsch and @leostrakosch: May I ask you to update this issue as soon as you create the respective PRs to fix the ldv-* and loop-* categories?

PhilippWendler commented 6 years ago

Btw, there is another possible difference between __VERIFIER_assume and abort: abort could cause violations of valid-memcleanup, __VERIFIER_assume (if interpreted as an infinite loop) not.

@dbeyer So there should really be an official answer of the following two questions:

Maybe this even needs a jury vote and a broader announcement?

I would argue that the answer to both questions should be "no". This would probably match the intention of most people and existing tasks (as indicated by @danieldietsch for termination), and it would also be practical: If we have a situation where we actually want to force non-termination or invalid memcleanup, there are still other ways to do so (infinite loop and abort, respectively). But there is no other way to specify at an arbitrary point in the program "I want to restrict the state space but not cause any other property-related effects.". It seems that @tautschnig should agree with this (following from his comment (https://github.com/sosy-lab/sv-benchmarks/issues/504#issuecomment-344910796)): If the answer to any of the above questions is not "no", then __VERIFIER_assume could be easily replaced with some standard C code and should be avoided completely. This would also mean that we do not need to replace __VERIFIER_assume with return 0 or abort anywhere.

PhilippWendler commented 6 years ago

I want to note that at least some people in the past had a understanding that is different from https://github.com/sosy-lab/sv-benchmarks/issues/504#issuecomment-344938369 and assumed that __VERIFIER_assume causes non-termination: #348, #352. While the tasks changed in #352 are not a problem (__VERIFIER_assume was removed there), it could be that people with the same understanding of __VERIFIER_assume have contributed tasks with labels that don't match the assumption from https://github.com/sosy-lab/sv-benchmarks/issues/504#issuecomment-344938369.

tautschnig commented 6 years ago

@PhilippWendler If the answer to both your questions is "no", what is then the semantics? If __VERIFIER_assume(false) would cause termination, then the answer to the second question can't be "no", because it would contradict memcleanup: "All allocated memory is deallocated before the program terminates." So it's neither non-terminating, but also not terminating?

mutilin commented 6 years ago

@tautschnig I would say that the nonterminating path defined by negating condition in __VERIFIER_assume is ignored for violation of any property, including nontermination property. So nontermination property is not violated, by definition of __VERIFIER_assume.

PhilippWendler commented 6 years ago

The semantics would be something special that could only be expressed with __VERIFIER_assume, thus giving a reason for the existence of __VERIFIER_assume. You can think of defining it as "clean up all allocated memory and terminate the program", or adjusting the definition of memcleanup to exclude __VERIFIER_assume cases. @mutilin's wording also sounds nice. (I think the rules should be extended with a clarification anyway even if the above questions are answered differently, because I would previously have assumed that the answers to my questions were "no".)

danieldietsch commented 6 years ago

@tautschnig : Exactly! Its neither terminating nor non-terminating because there is no program execution.

jhensel commented 6 years ago

We're fine with either definition. Termination seems more intuitive to me -- I did not expect non-termination when I checked its definition in the rules. I don't think we ever contributed tasks that contain an assume.

But we should make sure to have the rules consistent with the labelling.

tautschnig commented 6 years ago

Let me try to work out an example:

1: int main() {
2:  __VERIFIER_assume(0);
3:  return 0;
4: }

Let me try to define a program execution as a sequence of states, where a state is a program location and a mapping of memory locations to values. A program is terminating, if there exists a finite program execution. A program is non-terminating, if all program executions are infinite.

This program, under the current rules (which I see in no way ambiguous in this regard), has one unbounded program execution: (1,{}), (2,{}), (2, {}), ...

I will disregard the "there is no program execution" comment; the other suggestions appear to amount to a finite program execution (1,{}), (2,{}), which is essentially the same as replacing __VERIFIER_assume(cond); by if(!cond) abort();. The only difference for sequential programs are rules for memory cleanup (the practical value of which is a different story).

Let's look at a concurrent version:

1: int x;
2:
3: void t1() {
4:   __VERIFIER_assume(x);
5: }
6: 
7: void t2() {
8:   x=1;
9: }

With the assumption that t1 and t2 are concurrently executing threads, this program always terminates under the assumption of a fair scheduler. Under current rules, ... (4,{x->0}) is a possible prefix, but never a complete program execution. Both the use of abort as well as my reading of the above suggestions would actually add such a sequence (ending in this particular state) to the set of possible program executions.

peterschrammel commented 6 years ago

I don't think we ever contributed tasks that contain an assume.

When fixing undefined behaviour in termination benchmarks we avoided using __VERIFIER_assume exactly because its definition implies nontermination.

Its neither terminating nor non-terminating because there is no program execution.

Yes!

this program always terminates under the assumption of a fair scheduler.

There is no such assumption in SV-COMP, afaik.

danieldietsch commented 6 years ago

@tautschnig : Why do you want to disregard the "no program execution" comment?

danieldietsch commented 6 years ago

Perhaps the real question is: does the verdict "non-terminating" guarantee that there is one non-terminating execution or that there is no terminating execution?

The rules say

Every path finally reaches the end of the program. The proposition "end" is true at the end of every finite program execution (exit, abort, return from the initial call of main, etc.). A counterexample for this property is an infinite program execution.

So, in my world your main() function is terminating because proposition "end" is true at the end of every finite program execution, simply because there is no program execution (finite or infinite).

jhensel commented 6 years ago

My understanding is that the verdict means there is at least one non-terminating execution. I think there was some (inofficial?) agreement to exclude, e.g., programs that are not memory safe from the termination category because of exactly this issue.

tautschnig commented 6 years ago

I'm a bit puzzled by some of the comments on this thread. Could we please not forget our background in formal methods? In this spirit, we'll first need to agree on what a "program execution" is, before we can talk about the set of program executions, and then establish whether it is possible for that set to be empty ("there is no program execution").

My intuition is that any program that is syntactically valid will also have a non-empty set of program executions. Furthermore I am assuming it is possible to devise a syntactically valid program that includes __VERIFIER_assume(cond); where cond is always false (otherwise any use of __VERIFIER_assume must be syntactically invalid, because deciding that cond is always false is undecidable and thus in particular not a syntactic property).

@jhensel: memcleanup is not a memory safety problem.

jhensel commented 6 years ago

@tautschnig Sure, that's not what I said. But there we had the same problem: We can neither prove termination nor non-termination.

I was just referring to Daniels comment and did/do not want to make any suggestion regarding assume.

danieldietsch commented 6 years ago

@tautschnig But I am not forgetting this background. My intuition is just completely different. Basically, __VERIFIER_* statements are just for modelling things we do not want to express in code, e.g., environment assumptions. For me, they are not part of the program and should be only interpreted by the analysis tool. That we have some kind of implementation for them in C is just a relict from a time where we tried to express everything in C (also an issue with the witness format).

Mixing syntax and semantics seems like a bad idea to me.

As you, I see a program as a set of program executions, a program execution as a sequence of program states, and a program state as a mapping from variables to values. I would evaluate expressions with some interpretation function I and define the sequences of program states with some successor relation that relates program states and that is parameterized in the statements of the language. The successor relation for an assume(cond) is then something like {(s,s') | I(cond)(s)=true and s = s'}. If a state and a statement has no successor in the relation, then there is no program execution.

danieldietsch commented 6 years ago

I forgot: this is just my intuition. I do not say that this is or must be the rule.

tautschnig commented 6 years ago

If a state and a statement has no successor in the relation, then there is no program execution.

In this framework, would you have a designated final state to conclude a terminating program execution? That is, what is the successor of return 0; in int main() { return 0; } as presumably this code does have a program execution?

danieldietsch commented 6 years ago

Sure. Perhaps ignoring some intricacies of call/return lets assume we have some result variable res. Then for return expr we would have {(s,s')| s'=s[res -> I(expr)(s)]}, i.e., the successor state is the predecessor state where the value of res is replaced by the interpretation of expr in the predecessor state.

Edit: If there are no statements left we are just left with a finite program execution. So, we do not have some designated final state, just finite and infinite program executions induced by the successor relation.

tautschnig commented 6 years ago

Edit: If there are no statements left we are just left with a finite program execution. So, we do not have some designated final state, just finite and infinite program executions induced by the successor relation.

Doesn't that imply that assume(false) would 1) yield a (finite) program execution, and 2) yield the same program execution as having abort() at that point?

danieldietsch commented 6 years ago

No, because assume(false) does not have a successor state.

For abort() we also need to know where the next statement comes from (for me, from the control flow graph). Then we would probably require a designated sink location in the control flow graph for abort() s.t. there is no next statement. abort itself is then just an assume(true).

tautschnig commented 6 years ago

You said:

No, because assume(false) does not have a successor state.

but you also said:

[...] just finite and infinite program executions induced by the successor relation.

To me, this seems contradictory, but maybe I'm just cherry-picking the wrong subset of your comments?

To avoid rabbit holing: what I have not yet seen is a good reason why we need __VERIFIER_assume at all. @PhilippWendler said it should have semantics that cannot be easily simulated using C code, but it's not clear to me why that is a desirable property at all.

danieldietsch commented 6 years ago

assume(false) cannot have a successor state, because I would say that under any interpretation, false should evaluate to false. Because there is no successor state there cannot be a program execution that contains this statement.

I would say it is good and needed because we want to abstract from the environment sometimes. We want to be able to say "we consider this function in isolation but we know its parameter x will never be smaller or equal zero", so we write __VERIFIER_assume(x>0) and are sure that we did not change anything else about the program's behavior.

tautschnig commented 6 years ago

So why isn't "write a harness" the most appropriate way of abstracting from the environment? In such a harness, it's easy to place constraints on otherwise non-deterministic values without having to introduce extra magic.

danieldietsch commented 6 years ago

Because, e.g., you change the behaviour w.r.t. to termination by adding either a terminating or a non-terminating program execution.

tautschnig commented 6 years ago

You have full control over which of the cases you want by using one of while(x>100) {} or if(x>100) abort(); in the harness. All without coming up with new "special" semantics that create non-existing executions.

PhilippWendler commented 6 years ago

You have full control over which of the cases you want by using one of while(x>100) {} or if(x>100) abort(); in the harness. All without coming up with new "special" semantics that create non-existing executions.

This means that when you add a harness, you have to choose between either violating the termination property or (possibly) violating the memcleanup property. There is no known way with C code to avoid both, but this is precisely what I would say should be the goal of such a harness (restrict the state space with as little influence as possible on the rest of the program and properties). This is also the answer to https://github.com/sosy-lab/sv-benchmarks/issues/504#issuecomment-345040821.

tautschnig commented 6 years ago

This means that when you add a harness, you have to choose between either violating the termination property or (possibly) violating the memcleanup property. There is no known way with C code to avoid both, [...]

We are talking about a Turing-complete language, right? Speaking more concretely, you can obviously add a cleanup function and invoke that one instead of calling abort() directly.

danieldietsch commented 6 years ago

Depends on what your goal is. In many cases, it is just much harder to write an actual harness than using special modelling constructs.

I would argue we need even more magic. I would like to have, e.g., quantifiers in my assume to express things about the heap. I would like things that allow me to express that a pointer is valid and points to a certain data structure.

Sure you can do most of the things with harnesses, but they get more and more complicated and are additional pieces of code you have to maintain.