Open MartinSpiessl opened 4 years ago
Hi, @MartinSpiessl, the double type can't represent exactly the number 12345678901234567
, so all calculations that followed the statement a = 12345678901234567
would not be accurate.
I think we can limit the value of the a
variable to resolve this problem. I did a quick binary search and found the following bounds for Freire1 and 2:
Freire1:
int r
by double r
: a <= 9007199254740993 = 2^53 + 1Freire2: a <= 538359860
Hi, @MartinSpiessl, the double type can't represent exactly the number 12345678901234567, so all calculations that followed the statement a = 12345678901234567 would not be accurate.
Yeah, I didn't care to much whether this particular value can be represented. It will just lead to a specific value in a
that can be represented. Even if we set a to a double that can be represented, there is no guarantee that intermediate calculation results can still be represented accurately if that double value is high enough.
Thanks for checking out the bounds! I tried the first solution for freire1.c
with CPAchecker but I still found a violation, even if I set a<=1
. I looked at the model and found that -inf
was assigned to a. So I guess we would have to post a lower bound for a as well to exclude this extreme value (which is of course a double value, so I do not think this is an error in CPAchecker, __VERIFIER_nondet_double
should be allowed to output (any) NaN
and inf
value).
With a lower bound of 0
or -2147441940
CPAchecker runs into a timeout instead, which is good news, since this means the solver cannot easily detect a problem anymore. There might still be one, but that would be the other verifiers job to prove or disprove.
Having this as a verification task is therefore very interesting. Can you make a PR for this? Maybe also add a variant that only uses integers, the comparison would be especially nice. I will look over the PR, test the fixes with CPAchecker again and would approve if the violations cannot be found anymore.
We have created the integer version of freire1.c
and freire2.c
recently. I will also check the bounds for them and make a PR.
The programs freire{1,2}.c in
c/nla-digbench
are clearly flawed since they neglect the limited floating point precision.As an example, let's look at this concretized version of
freire1.c
:After compilation this will show us that the assert inside the loop can be violated after only 4 rounds (I introduced
i
to track numbers of rounds and also added someprintf
-calls for debugging:The initial value for
a
that I chose here leads to the assertion violation in just 4 steps. I was not able to get any less than 4 steps, sometimes it goes into the thousands. I also showed that by reordering of the terms in the assumption (grouping by type) we get a slightly better numeric stability, but even then the assertion would eventually be violated. This is inevitable because of the limited precision of IEEE floating-point values.@nguyenthanhvuh was aware that CPAchecker complained here since both files contain the following comment in the beginning:
Indeed, these counterexamples are probably valid as shown via my concrete example above. The problem here is that this specific algorithm of Freire was never stated for use with floating-point variables or even signed integers:
http://www.pedrofreire.com/sqrt/sqrt1.en.html
In order to fix these tasks, we would need to change the types to integers and make sure that none of the operations on the way overflow. Since the algorithm in freire1.c will calculate the next bigger square, i.e.
(floor(sqrt(a))+1)^2
, we would have to limit the maximum value ofa
such that this value is still belowINT_MAX
. Or just some value that is sufficiently lower thanINT_MAX
if we are lazy :wink:For negative values of
a
the algorithm will immediately break and the assertions actually hold, so we would not need to change the integer types to beunsigned
instead ofsigned
in order for the program to be correct (w.r.t. the assertions).We could probably also find constraints on the floating-point values of
a
that would allow us to fix these tasks while retaining the floating-point variables. It is always nice to have tasks that actually use floating-point variables, though they are intrinsically extremely hard to verify.