Closed ClementPernet closed 5 years ago
modular-integral implementation of add and addin are
r = a + b;
return (r>p || r<a ? r -= p : r);
Now, when Element is int32_t and a and b are close to 2^31-1, a+b overflows, thus causing r to be r - 2^32, which is catched by the test.
Now if (r-2^32)-p underflows, then return gives (correctly !!!) 2^32+(r-2^32)-p=r-p as expected.
It seems that clang++ -O2 sometimes fails to catch the overflow and leaves r to e.g. a negative value.
This hard to see. Indeed when adding some traces, clang++ gives again the correct value and a mwe does not reproduce it ...
The problem shows in test-ringarith.C for F.addin and sometimes for F.add when used with ModularZULL = Modular<int32_t, uint64_t> and modulus ZULLmax, i.e., 2^(N-1) - 1, here 2^31-1=2147483647
I found a MWE (see below) that always fails with clang++-6.0 with -O2 -Wall -g -DNDEBUG -UGIVARO_DEBUG -UDEBUG
and works with clang++-6.0 with -O0 -Wall -g -UNDEBUG -DGIVARO_DEBUG -DDEBUG
and g++ (8.3.0) in both cases.
Edit: actually, g++ has the same behavior as clang++ and fails with -O2 and succeeds with -O0
The example shows that the problem is not only in addin but in the sequence axmy, addin.
#include <iostream>
#include "givaro/modular.h"
using namespace Givaro;
int
main (int argc, char *argv[])
{
using Ring = Modular<int32_t, uint64_t>;
Ring R(Ring::maxCardinality());
std::cout << "Modular<int32_t, uint64_t>" << " with m=" << R.cardinality()
<< std::endl;
typename Ring::Element x, y, z, r1, r2;
R.init (x, 7);
R.init (y, 2147483618);
R.init (z, 2147483625);
std::cout << "x = " << x << std::endl;
std::cout << "y = " << y << std::endl;
std::cout << "z = " << z << std::endl;
R.init (r1);
R.init (r2);
R.init (r1, 2147483466); /* r1 = x*y - z */
R.addin (r1, z);
R.axmy (r2, x, y, z); /* r2 = x*y - z */
std::cout << "axmy = " << r2 << " # = x*y-z" << std::endl;
R.addin (r2, z); // r2 += z
std::cout << "r1 = " << r1 << std::endl;
std::cout << "r2 = " << r2 << std::endl;
if (!R.areEqual (r1, r2))
{
std::cout << "failure" << std::endl;
return 1;
}
else
{
std::cout << "ok" << std::endl;
return 0;
}
}
/* -*- mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */
// vim:sts=4:sw=4:ts=4:et:sr:cino=>s,f0,{0,g0,(0,\:0,t0,+0,=s
Well I tried to make a MWE without givaro, by extracting the add code directly with int32_t etc., and there the difference disappears. The difference disappears also if I add some printing traces within Givaro ...
I tried to find a smaller example without calls to Givaro's functions. Here what I have for now:
#include <iostream>
#include "givaro/modular.h"
using namespace Givaro;
#define common(r) static_cast<int32_t>((static_cast<uint64_t>(a)*static_cast<uint64_t>(b) + m-Caster<uint64_t>(c)) % m); \
r += c; \
r = (r >= Caster<int32_t>(mp) || r < c) ? r - static_cast<int32_t>(mp) : r;
static inline void
foo (int32_t &r, int32_t a, int32_t b, int32_t c, uint64_t m, uint32_t mp)
{
r = common(r);
}
int
main (int argc, char *argv[])
{
int32_t a = 7;
int32_t b = 2147483618;
int32_t c = 2147483625;
uint64_t m = 2147483647;
uint32_t mp = 2147483647;
std::cout << "a = " << a << std::endl;
std::cout << "b = " << b << std::endl;
std::cout << "c = " << c << std::endl;
/* r1 */
int32_t r1 = common(r1);
std::cout << "r1 = " << r1 << std::endl;
/* r2 */
int32_t r2;
foo (r2, a, b, c, m, mp);
std::cout << "r2 = " << r2 << std::endl;
if (r1 != r2)
{
std::cout << "failure" << std::endl;
return 1;
}
else
{
std::cout << "ok" << std::endl;
return 0;
}
}
r1
and r2
are computed with the same code (from the macro common
), with the difference that r1
is computed directly and r2
via a function call (static inline function).
clang++-6.0 -O2 -Wall -g -o bug bug.C
: r1
is correct but not r2
(r2 = -205
)clang++-6.0 -O0 -Wall -g -o bug bug.C
: r1
and r2
are correctThis is weird, but it gets wierder (in the -O2
case):
Caster
with a static_cast
, r1
and r2
are both correctCaster
with a static_cast
, r1
and r2
are both wrong (equal to -205)[ with -O0
, this is always correct]
After hours trying to understand this bug, I suspected a bug in the optimisation step of clang++-6.0
If I compile the code of my previous message without any printing and that I use objdump
to analyse the produced code, I obtain:
for clang++-6.0, assembly code that do only return 1
.
Assembly code is obtained with
clang++-6.0 -O2 -Wall -c -o bug.clang.O2.o bug.C
objdump -drwC -S bug.clang.O2.o > bug.clang.O2.objdump
for g++, assembly code that do only return 0
.
g++ -O2 -Wall -c -o bug.g++.O2.o bug.C
objdump -drwC -S bug.g++.O2.o > bug.g++.O2.objdump
In both case, the compiler optimises everything out as every operand are known at compile time. But clang++-6.0 and g++ does not agree in the output.
Hi Cyril,
If I'm not wrong, over/underflows with signed int is undefined behaviour, so the compilers would have the "right" to optimise a test to a constant if it detects the overflow. I didn't look at this bug in detail, but I have the impression that this is what's happening here? If that's indeed the problem, a solution would be to cast to unsigned to do the overflows, where in this case the behaviour is well defined (as we did here #98)?
(For instance, redefining common
as following does seem to resolve the issue with clang:
#define common(r) static_cast<int32_t>((static_cast<uint64_t>(a)*static_cast<uint64_t>(b) + m-Caster<uint64_t>(c)) % m); \
uint32_t rr,cc; \
rr = static_cast<uint32_t>(r); \
cc = static_cast<uint32_t>(c); \
rr += cc; \
r = static_cast<int32_t>(rr); \
r = (r >= Caster<int32_t>(mp) || r < c) ? r - static_cast<int32_t>(mp) : r;
)
@P1K I think your assessment is correct: as over/underflows with signed int is undefined behaviour, it is a ill-formed program and compilers can optimise it as they want.
So I started reading section 4 of the C++11 norm (on standard conversions) and I am not sure that
r = static_cast<int32_t>(rr);
is correct:
static_cast
performs an "implicit conversion sequence" (according to cppreference)So I think all the computations should be done with unsigned type, with one conversion to signed at the end. The question now is how to do this in givaro. We have three available type:
Okay. I don't know if static_cast
is the way to go, and indeed making all computations on unsigned types would seem better. But I have no idea about how to do that properly in givaro.
(For instance, redefining
common
as following does seem to resolve the issue with clang:#define common(r) static_cast<int32_t>((static_cast<uint64_t>(a)*static_cast<uint64_t>(b) + m-Caster<uint64_t>(c)) % m); \ uint32_t rr,cc; \ rr = static_cast<uint32_t>(r); \ cc = static_cast<uint32_t>(c); \ rr += cc; \ r = static_cast<int32_t>(rr); \ r = (r >= Caster<int32_t>(mp) || r < c) ? r - static_cast<int32_t>(mp) : r;
)
(Note that a cast/unsigned computation should probably be done for r - static_cast<int32_t>(mp)
too.)
Branch clangbug has a first draft of a fix for this. Need to be simplified though.
solved in PR #111
Since #109 was merged, the testsuite is now compiled with -O2. This leads to 2 failures in the testsuite when compiling with clang:
test-ringarith
andtest-ffarith