hernanponcedeleon / Dat3M

A verification tool for many memory models
MIT License
77 stars 29 forks source link

Assume event causes problems #127

Closed hernanponcedeleon closed 1 year ago

hernanponcedeleon commented 3 years ago

When assume_abort_if_not is parsed as an Assume event, the family of benchmarks pthread_wmm/rfi005_X return PASS instead of FAIL.

xeren commented 3 years ago

assume_abort_if_not should be equivalent to assert instead of assume. See its definition in i.e. rfi005_tso.oepc.c:

void assume_abort_if_not(int cond) {
  if(!cond) {abort();}
}
hernanponcedeleon commented 3 years ago

I don't agree (or I'm missing something). The definition of assume_abort_if_not says that the program should abort execution if cond is false, however the program should NOT fail.

If we use assert and cond is false, the verification will fail.

xeren commented 3 years ago

The difference to __VERIFIER_assert is the missing call to reach_error(). I assume abort() is just meant to be non-returning, while otherwise unimportant to the verification.

void __VERIFIER_assert(int expression) { if (!expression) { ERROR: {reach_error();abort();} }; return; }
hernanponcedeleon commented 3 years ago

Here there is a description of problems that occurred in SVCOMP due to people understanding the semantics of __VERIFIER_Assume differently.

As I see it, the semantics of our Assume event is the one described in point 3 (@ThomasHaas can you confirm this?).

Since some old SVCOMP benchmarks might use __VERIFIER_assume but with the intended semantics being as the one of the implementation of assume_abort_if_not listed above, I don't think we should parse __VERIFIER_assume as an Assume event. The same could be problematic if someone comes of with some C code defining a function a __VERIFIER_assert with a different implementation, but I think has low probability.

I think the best would be to have DAT3M specific commands (e.g. __DAT3M_assume, __ DAT3M_assert and __ DAT3M_nondet_X) and list this in our README.

ThomasHaas commented 3 years ago

As I see it, the semantics of our Assume event is the one described in point 3 (@ThomasHaas can you confirm this?).

Yes, the Assume I implemented has these semantics. In particular, it is a very special event that has no equivalent way of expressing it in C code directly.

Since some old SVCOMP benchmarks might use __VERIFIER_assume but with the intended semantics being as the one of the implementation of assume_abort_if_not listed above, I don't think we should parse __VERIFIER_assume as an Assume event. The same could be problematic if someone comes of with some C code defining a function a __VERIFIER_assert with a different implementation, but I think has low probability.

I think people shouldn't provide own implementations for methods with a __VERIFIER prefix. Many SVCOMP benchmarks do include code for __VERIFIER_assert but the way I see it, this is only to make the file self-contained and actually executable without relying on external verifier functions. In particular, I think __VERIFIER_assert should have some intended semantics that every verifier adheres to, even if the c-file gives a different implementation (i.e. for verification, if an implementation of any __VERIFIER function is provided it should simply be ignored.).

I think the best would be to have DAT3M specific commands (e.g. __DAT3M_assume, __ DAT3M_assert and __ DAT3M_nondet_X) and list this in our README.

I agree with this, but simply because our interpretation of __VERIFIER_X should match the SVCOMP semantics. As far as I see it, __ DAT3M_assert and __ DAT3M_nondet_X should be identical to the respective __VERIFIER version. However, it still makes sense to provide them for consistency, even if some of these special events have an equivalent __VERIFIER counterpart.

hernanponcedeleon commented 3 years ago

In particular, I think __VERIFIER_assert should have some intended semantics that every verifier adheres to, even if the c-file gives a different implementation (i.e. for verification, if an implementation of any __VERIFIER function is provided it should simply be ignored.).

This is how it used to be, however, there is an effort to reduce the use of SVCOMP specific functions and replace them with either functions having precise semantics defined by the C standard, or giving the precise implementation in the code for self-containment.

I agree with this, but simply because our interpretation of __VERIFIER_X should match the SVCOMP semantics. As far as I see it, __DAT3M_assert and __ DAT3M_nondet_X should be identical to the respective __VERIFIER version. However, it still makes sense to provide them for consistency, even if some of these special events have an equivalent __VERIFIER counterpart.

I don't think there is anyway of getting rid of __VERIFIER_nondet_X. We could also avoid parsing __VERIFIER_assert, every SVCOMP file using it would have an implementation. I added to because the implementation introduces some new control flow that I wanted to avoid. But if we want to be as general as possible, the best might be getting rid of it. I already explained that we also need to get rid of __VERIFIER_assert.

@xeren can you update your PR for all mentioned above. The following is need

ThomasHaas commented 3 years ago

To address the overall issue: assume_abort_if_not acts like a "pseudo-assume". It is definitely not an assert cause it doesn't provide a safety specification (unless we look at termination instead of reachability). The intention (as the name suggests) is to cut off executions that do not meet some pre-conditions. However, unlike a true assume it doesn't forbid the complete execution, it simply terminates early. This make a difference in scenarios where some assertion is violated AND simultaneously the assume gets violated. Consider e.g.

assert(false)
assume(false)

If we interpret assume as the usual assume which forbids executions, than the above program has no executions at all and hence is safe. However, with the abort semantics, this code has a single execution that violates the assertion. Now, usually assumes are used to enforce pre-conditions, that is, they should be BEFORE the assertion. In that case, the difference of early termination vs. forbidding execution is almost non-existent. However, one might imagine a very special case where the assertion violation of one thread T1 somehow implies that another thread will fail its assume and abort (or vice versa, if one thread aborts another will reach an assertion violation). I'm not exactly sure, but I think our abort semantics actually only terminates the executing thread but no other thread.

In the case of rfi005_tso.oepc.c, I think we have the following case:

With abort semantics, this program would be a FAIL (as we know). With the other semantics, this program would not have any violating execution since the assertion failure of P1 implies the assumption failure of main.

ThomasHaas commented 3 years ago

This is how it used to be, however, there is an effort to reduce the use of SVCOMP specific functions and replace them with either functions having precise semantics defined by the C standard, or giving the precise implementation in the code for self-containment.

I understand the want to reduce special functions. But then I find it very unfortunate (or actually wrong) to give these implementations the __VERIFIER prefix. If those are normal functions then they shouldn't have this prefix to begin with unless they are intended to be treated differently by a verifier.

xeren commented 3 years ago

I see now that include/assert.h effectively equalizes assert to __VERIFIER_assert. I propose removing the latter entirely, while reimplementing assert accordingly, since we already define built-ins reach_error() and abort().

xeren commented 3 years ago

In our current implementation, the program from https://github.com/hernanponcedeleon/Dat3M/issues/127#issuecomment-932537343 contains dead code.

ThomasHaas commented 3 years ago

Of course, the simple program in my example wouldn't have a problem since the assert(false) terminates the thread and hence the subsequent assume(false) is never reached. However, as soon as you have 2 threads, you can run into this issue (see rfi005_tso.oepc.c). Another problem that plays into this is the fact that our abort only aborts the thread, not the whole program. Actually, I'm not sure where abort is even handled in the code. But reach_error should not be a built-in and I don't think it is (right now we ignore the body of VERIFIER_assert anyway).

I guess we can implement assert with the semantics of VERIFIER_assert, in particular we terminate the thread on assertion failure. The parsed code of VERIFIER_assert would then reduce to the (redundant) code assert(0);abort() which is equivalent to just assert(0) (only for the case of a failed assertion). Do we want this, or should a simple assert not terminate the thread on failure (and if it does not, what should it do?).

hernanponcedeleon commented 3 years ago

@ThomasHaas is right that once an assertion is violated, the execution at the C level would just stop. For verification however, we "continue" the execution. I don't think this is really a problem because anyway we will report that a violation was found; if more events than necessary where executed, this is irrelevant from the point of view of verification.

About the fact that abort only halts the current thread: this is not a problem because the only way (at least that comes to my mind) events of other threads could not be scheduled before the abort is if there is a shared mutex with the current thread which blocks the other thread, but since the execution is blocked, it will be equivalent to terminating that thread.

About what command to use for writing assertions: the optimal way of handling this would be to use the assert.h library. Unfortunately, when converting to LLVM and then to boogie, I have seen assert() converted to many different variations including __assert_rtn, assert_.i32, __assert_fail and assert_. I think this depends on the compiler and maybe even the underlying architecture. We still support many of these, but since we cannot guarantee that assert(exp) would be compiled to some assert_whatever(exp), I think the best is to use a custom command, e.g. DAT3M_assert and instruct the user to use this one in the documentation. However, we need to add some dat3m.h library (and add to our compilation cmd here and here) defining this (and any other custom command as DAT3M_assert and DAT3M_nondet_X) as extern functions.

There are quite a few commands that get parsed as assert (precisely, they are parsed into local events assert_index <- exp and then we have a reachability condition \/ assert_i). All the weird tags I mentioned above result in such local event. A call to reach_error() gets parsed as assert_index <- 0. In SVCOMP there is no notion of assertion. SVCOMP is more than just reachability problems, they have a whole language of properties. For the reachability category, the actual problem to be solved is "is there any execution which calls reach_error()?". This is defined by this file. I think we should also document that reach_error() has special semantics for us and the user should not use this name for other purposes.

ThomasHaas commented 3 years ago

@ThomasHaas is right that once an assertion is violated, the execution at the C level would just stop. For verification however, we "continue" the execution. I don't think this is really a problem because anyway we will report that a violation was found; if more events than necessary where executed, this is irrelevant from the point of view of verification.

Well, that is where the problem with assume comes in, since a later assume may invalidate the execution if the assertion does not cause a complete termination (which it doesn't). We could always go with the assume_abort_if_not semantics for our assume though. This semantics, however, do affect the control flow (analysis).

About the fact that abort only halts the current thread: this is not a problem because the only way (at least that comes to my mind) events of other threads could not be scheduled before the abort is if there is a shared mutex with the current thread which blocks the other thread, but since the execution is blocked, it will be equivalent to terminating that thread.

Well, we also have pthread_join which caused the problem in rfi005_tso.oepc.c when used together with assume. So even without a mutex, the main thread can easily halt and wait for other threads to finish (or to abort!).

About what command to use for writing assertions: the optimal way of handling this would be to use the assert.h library. Unfortunately, when converting to LLVM and then to boogie, I have seen assert() converted to many different variations including __assert_rtn, assert_.i32, __assert_fail and assert_. I think this depends on the compiler and maybe even the underlying architecture. We still support many of these, but since we cannot guarantee that assert(exp) would be compiled to some assert_whatever(exp), I think the best is to use a custom command, e.g. DAT3M_assert and instruct the user to use this one in the documentation. However, we need to add some dat3m.h library (and add to our compilation cmd here and here) defining this (and any other custom command as DAT3M_assert and DAT3M_nondet_X) as extern functions.

But we include a custom assert.h that (right now) simply translates an assert call to a VERIFIER_assert call. Shouldn't such a include force the compiler to treat it as a function instead of some intrinsic? Couldn't we provide assert itself as an extern without any implementation?

I think we should also document that reach_error() has special semantics for us and the user should not use this name for other purposes.

I think reach_error should (if at all) be treated as something specific to SVCOMP. I think we should rather use assert as the default way of specifying assertions (or alternatively translate it to DAT3M_assert) and simply rely on the fact that reach_error is implemented as assert(0). Or are there any benchmarks where no implementation is provided?

xeren commented 3 years ago

Currently, VisitorBoogie transforms reach_error() as it does VERIFIER_assert(0).

Unfortunately, when converting to LLVM and then to boogie, I have seen assert() converted to many different variations including __assert_rtn, assert_.i32, __assert_fail and assert_.

I get that Smack would use assert_ to avoid the Boogie keyword, but where did you encounter the other ones? I would expect assert_.i32 in a program where the source code did not declare the function, and __assert_rtn and __assert_fail where another assert.h is included with something like #define assert(x) __assert_rtn((x)).

hernanponcedeleon commented 3 years ago

Well, that is where the problem with assume comes in, since a later assume may invalidate the execution if the assertion does not cause a complete termination (which it doesn't). We could always go with the assume_abort_if_not semantics for our assume though. This semantics, however, do affect the control flow (analysis).

There are two cases if we have

assert(e1)
assume(e2)

But we include a custom assert.h that (right now) simply translates an assert call to a VERIFIER_assert call. Shouldn't such a include force the compiler to treat it as a function instead of some intrinsic? Couldn't we provide assert itself as an extern without any implementation?

I was now aware of this header. This might be the reason why all the latest benchmarks (locks) result in just assert_. I was not able to find any boogie file currently using any of the weird tags, but I have seen those in the past. It might be that by using our custom assert.h we got rid of those. If this is the case, I agree we can completely remove the implementation of assert and mark it as extern. We can also completely remove __VERIFIER_assert from there: this function by itself does not have ny semantics and every C file using it should provide an implementation (in SVCOMP, the implementation is always in terms of reach_error and abort).

I think we should also document that reach_error() has special semantics for us and the user should not use this name for other purposes.

I think reach_error should (if at all) be treated as something specific to SVCOMP. I think we should rather use assert as the default way of specifying assertions (or alternatively translate it to DAT3M_assert) and simply rely on the fact that reach_error is implemented as assert(0). Or are there any benchmarks where no implementation is provided?

I don't really see what is the difference between adding the implementation in a header file or just parsing it as an assert. EDIT: well if we eventually want to get rid of the special semantics of reach_error, it is easier to just remove it from the header than modifying the code. So I'm ok to adding the custom implementation in the header file.

ThomasHaas commented 3 years ago

Well, that is where the problem with assume comes in, since a later assume may invalidate the execution if the assertion does not cause a complete termination (which it doesn't). We could always go with the assume_abort_if_not semantics for our assume though. This semantics, however, do affect the control flow (analysis).

There are two cases if we have

assert(e1)
assume(e2)
* if `~e1 => ~e2` then our assume semantics would forbid the execution, but this is the intended behaviour according to my understanding.

* if `~(~e1 => ~e2)` then there exists an execution which won't be filtered by the assume and would violate the assertions. We should be able to detect this.

That is not true if an assertion violation terminates the thread. The following assume won't be able to forbid the execution since it doesn't get executed. However, by having the assume and the assert in different threads, we can actually get the assume to block the execution. I think that is rather counter-intuitive. Either way, for SVCOMP we only need to support assume_abort_if_not which never forbids complete executions so the semantics are clear. When we intend to implement DAT3M_assume we will have to tackle the question though.

Regarding the assert/VERIFIER_assert/reach_error, I think we should just do the following:

hernanponcedeleon commented 3 years ago

That is not true if an assertion violation terminates the thread.

Are you talking about "execution level" or "verification level"? If it is at the verification level, what do you mean by an assertion violation terminates the thread? For us, an assert(exp) would just be parsed as a assert_i <- exp and we don't add any CF or whatsoever to terminate the thread.

Either way, for SVCOMP we only need to support assume_abort_if_not which never forbids complete executions so the semantics are clear.

We don't really have to support anything for this, assume_abort_if_not always has the following implementation

void assume_abort_if_not(int cond) {
  if(!cond) {abort();}
}

Regarding the assert/VERIFIER_assert/reach_error, I think we should just do the following:

  • Provide a custom header file for assert to avoid the compiler modifying the method calls in any way.
  • Forget about reach_error and VERIFIER_assert and simply treat them as normal functions (their implementations are provided in all benchmarks anyway)

For SVCOMP benchmarks, we have the following implementation

void reach_error() { assert(0); }

which after preprocessing the file (these are the ones used for the competition) gets converted to

void reach_error() { ((void) sizeof ((0) ? 1 : 0), __extension__ ({ if (0) ; else __assert_fail ("0", "fib_bench_longer-1.c", 3, __extension__ __PRETTY_FUNCTION__); })); }

so we need to be sure we parse this correctly, i.e. we need to support __assert_fail

ThomasHaas commented 3 years ago

That is not true if an assertion violation terminates the thread.

Are you talking about "execution level" or "verification level"? If it is at the verification level, what do you mean by an assertion violation terminates the thread? For us, an assert(exp) would just be parsed as a assert_i <- exp and we don't add any CF or whatsoever to terminate the thread.

Oh, I thought we always added termination code. If we don't, then there shouldn't be a problem.

Either way, for SVCOMP we only need to support assume_abort_if_not which never forbids complete executions so the semantics are clear.

We don't really have to support anything for this, assume_abort_if_not always has the following implementation

void assume_abort_if_not(int cond) {
  if(!cond) {abort();}
}

Of course, we don't need to support it specifically. We just need to support abort correctly.

Regarding the assert/VERIFIER_assert/reach_error, I think we should just do the following:

  • Provide a custom header file for assert to avoid the compiler modifying the method calls in any way.
  • Forget about reach_error and VERIFIER_assert and simply treat them as normal functions (their implementations are provided in all benchmarks anyway)

For SVCOMP benchmarks, we have the following implementation

void reach_error() { assert(0); }

which after preprocessing the file (these are the ones used for the competition) gets converted to

void reach_error() { ((void) sizeof ((0) ? 1 : 0), __extension__ ({ if (0) ; else __assert_fail ("0", "fib_bench_longer-1.c", 3, __extension__ __PRETTY_FUNCTION__); })); }

so we need to be sure we parse this correctly, i.e. we need to support __assert_fail

Oh wow, the resulting function looks unnecessarily complicated. But if there is preprocessing going on before we even get to treat the function, we should define reach_error as a svcomp procedure. It doesn't make sense to treat it as a normal function if there is some preprocessing that translates it to something different (I would rather handle reach_error directly than introduce __assert_fail as another built-in).

hernanponcedeleon commented 3 years ago

Ok, then we agree that we use our own assert.h library and parse the corresponding LLVM version (based on the lock benchmarks like this one, the LLVM version is assert_) as a Local event and we do exactly the same to reach_error.

@xeren can you implement this in #133? Also, I saw your modified some of the bpl files. Based on the discussion above, this is not needed.

xeren commented 3 years ago

This leaves us with six ways of inserting an reachability condition at Boogie-level:

Although abort() is expected to be called from a failing assert, we let the program continue, as there is no gain to the dataflow.

hernanponcedeleon commented 2 years ago

I just noticed that despite VERIFIER_Assume not being catched by SVCOMPPROCEDURES, since smack provides implementations to all SVCOMP special functions, we end up giving semantic to such functions calls. The implementation by smack plus our way of handling boogie assumes, result in jumping to the end of the thread if the condition does not hold.

ThomasHaas commented 1 year ago

Is this issue still relevant? I think we have clarified the semantics of __VERIFIER_assume and that its semantics is not identical to assume_abort_if_not and hence should not be used in SVCOMP.

To be precise, the semantics of __VERIFIER_assume is that any full execution that violates an assumption is simply invalid and gets filtered out. __VERIFIER_assume has only two real use-cases: (i) restricting non-deterministic values returned by __VERIFIER_nondet_X and (ii) intentionally cutting of control-flow to reduce the problem size by placing __VERIFIER_assume(false) somewhere. It should NOT be used to model conditionals of any form, i.e., model actual program logic.

hernanponcedeleon commented 1 year ago

Yes, the semantics are clarified. SVCOMP should not be a problem since __VERIFIER_assume was removed from the functions with special semantics (and there is a check in the CI for this). When needed, they use assume_abort_if_not, but this should have an implementation with its intended semantics.

Closing.