Closed noloader closed 11 months ago
I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.
CFLAGS
andCXXFLAGS
include:-fsanitize=undefined -fno-sanitize-recover
Then I run
make
andmake check
.It looks like there's some undefined behavior in the Perl sources:
re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}]
Hmm, the test here is: if (flags & (1<<bit))
, but flags is declared as U32 which should surely be represented by an unsigned int. This could be a deeper problem in your build.
lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0
and:
dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12
The pp_i_foo
functions are for arithmetic in the scope of use integer
; this is documented to give whatever overflow behaviour your computer gives at the C level. See perldoc integer
.
Hugo
On 1/30/20 6:42 AM, Jeffrey Walton wrote:
Hi Everyone,
I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.
|CFLAGS| and |CXXFLAGS| include:
|-fsanitize=undefined -fno-sanitize-recover |
It looks like there's some undefined behavior in the Perl sources:
|re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}] |
and:
|lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0 |
and:
|dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12 |
The |re_comp.c:20010| failure can probably be cleared by casting to an unsigned int. |1U << 31| does not suffer the UB. That will clear a fair number of findings.
I'm not sure about the |pp.c:2707| and |pp.c:2621| findings.
There are other failures. They are easy enough to duplicate when using UBsan.
It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.
I believe that this is the first time that UBsan has been mentioned on this list. Is the material in the following links reliable?
https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
https://www.kernel.org/doc/html/latest/dev-tools/ubsan.html
Is UBsan relevant for compilers other than gcc and g++?
Thank you very much. Jim Keenan
On 1/30/20 6:42 AM, Jeffrey Walton wrote:
Hi Everyone,
I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.
|CFLAGS| and |CXXFLAGS| include:
|-fsanitize=undefined -fno-sanitize-recover |
Can you provide either (a) the complete invocation to ./Configure you are using to generate this build; or (b) attach the output of './perl -Ilib -V' as a text file?
It looks like there's some undefined behavior in the Perl sources:
|re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}] |
and:
|lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0 |
and:
|dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12 |
The |re_comp.c:20010| failure can probably be cleared by casting to an unsigned int. |1U << 31| does not suffer the UB. That will clear a fair number of findings.
I'm not sure about the |pp.c:2707| and |pp.c:2621| findings.
There are other failures. They are easy enough to duplicate when using UBsan.
It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.
Thank you very much. Jim Keenan
On 1/30/20 8:00 AM, James E Keenan wrote:
On 1/30/20 6:42 AM, Jeffrey Walton wrote:
Hi Everyone,
I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.
|CFLAGS| and |CXXFLAGS| include:
|-fsanitize=undefined -fno-sanitize-recover |
It looks like there's some undefined behavior in the Perl sources:
|re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}] |
and:
|lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0 |
and:
|dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12 |
The |re_comp.c:20010| failure can probably be cleared by casting to an unsigned int. |1U << 31| does not suffer the UB. That will clear a fair number of findings.
I'm not sure about the |pp.c:2707| and |pp.c:2621| findings.
There are other failures. They are easy enough to duplicate when using UBsan.
It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.
I believe that this is the first time that UBsan has been mentioned on this list. Is the material in the following links reliable?
https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
At the above URL I read:
"To enable UBSAN configure kernel with: CONFIG_UBSAN=y"
So, to use UBsan, is one required to recompile the entire Linux kernel? That doesn't sound feasible for most developers.
Also, if I were to install a Debian package for this functionality (or a pkg for *BSD), which should I use?
[snip]
Thank you very much. Jim Keenan
@jkeenan ,
I believe that this is the first time that UBsan has been mentioned on this list. Is the material in the following links reliable?
https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
Yes.
To use the sanitizers you just add the appropriate compiler options to CFLAGS
and CXXFLAGS
. The support is baked into some compilers. There is not much too it, other than a custom build using extra CFLAGS
and CXXFLAGS
.
No.
Is UBsan relevant for compilers other than gcc and g++?
Yes. Clang also provides them. I'm not sure about Intel's ICC or other compilers, like IBM XL C/C++. (IBM's compiler on Linux now uses LLVM's front-end).
It is usually enough to test with GCC or Clang. The other compilers receive the herd immunity.
@jkeenan,
Can you provide either (a) the complete invocation to ./Configure you are using to generate this build; or (b) attach the output of './perl -Ilib -V' as a text file?
My Configure
for UBsan looks like this. I used /home/jwalton/tmp/ubsan
as prefix.
PERL_PKGCONFIG: /home/jwalton/tmp/ubsan/lib/pkgconfig
PERL_CPPFLAGS: -I/home/jwalton/tmp/ubsan/include -DNDEBUG -DTEST_UBSAN=1
PERL_CFLAGS: -g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread
PERL_CXXFLAGS: -g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread
PERL_LDFLAGS: -L/home/jwalton/tmp/ubsan/lib -fsanitize=undefined -fno-sanitize-recover -Wl,-R,'$$ORIGIN/../lib' -Wl,-R,/home/jwalton/tmp/ubsan/lib -Wl,--enable-new-dtags
if ! ./Configure -des \
-Dprefix="$TEST_PREFIX" \
-Dlibdir="$TEST_LIBDIR" \
-Dpkgconfig="$PERL_PKGCONFIG" \
-Acppflags="$PERL_CPPFLAGS" \
-Accflags="$PERL_CFLAGS" \
-Aldflags="$PERL_LDFLAGS" \
-Dextras="Test::More Text::Template"
then
echo "Failed to configure Perl"
exit 1
fi
And:
$ "$HOME/tmp/ubsan/bin/perl" -Ilib -V
Summary of my perl5 (revision 5 version 30 subversion 1) configuration:
Platform:
osname=linux
osvers=5.0.0-37-generic
archname=x86_64-linux
uname='linux coffee 5.0.0-37-generic #40~18.04.1-ubuntu smp thu nov 14 12:06:39 utc 2019 x86_64 x86_64 x86_64 gnulinux '
config_args='-des -Dprefix=/home/jwalton/tmp/ubsan -Dlibdir=/home/jwalton/tmp/ubsan/lib -Dpkgconfig=/home/jwalton/tmp/ubsan/lib/pkgconfig -Acppflags=-I/home/jwalton/tmp/ubsan/include -DNDEBUG -DTEST_UBSAN=1 -Accflags=-g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread -Aldflags=-L/home/jwalton/tmp/ubsan/lib -fsanitize=undefined -fno-sanitize-recover -Wl,-R,'$$ORIGIN/../lib' -Wl,-R,/home/jwalton/tmp/ubsan/lib -Wl,--enable-new-dtags -Dextras=Test::More Text::Template'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-O2'
cppflags='-I/home/jwalton/tmp/ubsan/include -DNDEBUG -DTEST_UBSAN=1 -g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='7.4.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -L/home/jwalton/tmp/ubsan/lib -fsanitize=undefined -fno-sanitize-recover -Wl,-R,21651ORIGIN/../lib -Wl,-R,/home/jwalton/tmp/ubsan/lib -Wl,--enable-new-dtags -fstack-protector-strong -L/usr/local/lib'
libpth=/home/jwalton/tmp/ubsan/lib /usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.27.so
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.27'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -O2 -L/home/jwalton/tmp/ubsan/lib -Wl,-R,21651ORIGIN/../lib -Wl,-R,/home/jwalton/tmp/ubsan/lib -L/usr/local/lib -fstack-protector-strong'
Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
Built under linux
Compiled at Jan 30 2020 06:48:11
@INC:
lib
/home/jwalton/tmp/ubsan/lib/perl5/site_perl/5.30.1/x86_64-linux
/home/jwalton/tmp/ubsan/lib/perl5/site_perl/5.30.1
/home/jwalton/tmp/ubsan/lib/perl5/5.30.1/x86_64-linux
/home/jwalton/tmp/ubsan/lib/perl5/5.30.1
@jkeenan,
This GitHub may be helpful for you: Noloader | Build-Scripts. I use them for installing modern tools on old systems. I also use them for isolated Sanitizer testing.
If you clone the repo on a machine with modern build tools and modern wget, then you should only need to perform the following. The heavy lifting is performed by the scripts. It builds dependencies as needed, and sets runtime paths so you don't need those stupid LD_LIBRARY_PATHs
to run your program.
INSTX_UBSAN=1 INSTX_PREFIX=$HOME/tmp/ubsan ./build-perl.sh
The scripts always use the latest release tarball.
In Perl's case, the script can use Master by changing build-perl.sh
to perform a git clone
rather than a wget
fetch. You have to do that by hand.
On Thu, 30 Jan 2020, 20:58 Hugo van der Sanden, notifications@github.com wrote:
I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build. CFLAGS and CXXFLAGS include:
-fsanitize=undefined -fno-sanitize-recover
Then I run make and make check.
It looks like there's some undefined behavior in the Perl sources:
re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
Failed test 2258 - Verify compilation of
[\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}]
Hmm, the test here is: if (flags & (1<<bit)), but flags is declared as U32 which should surely be represented by an unsigned int. This could be a deeper problem in your build.
The problem is the type of 1 is ambiguous and some compilers will assume a signed int. It should be 1U as Jeffery stated.
Yves
lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow:
-9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0
and:
dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12
The pp_i_foo functions are for arithmetic in the scope of use integer; this is documented to give whatever overflow behaviour your computer gives at the C level. See perldoc integer.
Hugo
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17508?email_source=notifications&email_token=AAAZ5R2SH25KBERSNAYJWNDRALFJBA5CNFSM4KNUPNQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKK4LVQ#issuecomment-580240854, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5RYWXIYLT4ENWWJAGV3RALFJBANCNFSM4KNUPNQQ .
The problem is the type of 1 is ambiguous and some compilers will assume a signed int. It should be 1U as Jeffery stated.
Ah good point, my mistake.
Hugo
This appears to clear the undefined behavior in the shifts for the 5.30.1 release tarball.
I hit it with a sledge hammer using sed
. Confirming to a coding standard was not a priority for me.
sed -i 's/1 <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/2 <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/3 <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/1L <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/2L <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/3L <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/1<</1U<</g' op.c
sed -i 's/1 <</1U <</g' op.c
sed -i 's/1<</1U<</g' regcomp.c
sed -i 's/1 <</1U <</g' regcomp.c
sed -i 's/1<</1U<</g' vms/vms.c
sed -i 's/1 <</1U <</g' vms/vms.c
Here's the resulting patch I am using: perl.patch.
I'm going to work on the integer problems next.
@jkeenan,
[the behavior] is documented to give whatever overflow behaviour your computer gives at the C level.
Ouch... That creates some extra work for the Perl maintainers. To avoid the undefined behavior in C but preserve existing behavior you have to use inline ASM. ASM does not suffer C rules.
Here's what it looks like on x86_64 using GCC compatible compilers. The patch passes self-tests using UBsan.
--- pp.c
+++ pp.c
@@ -32,6 +32,13 @@
#include "reentr.h"
#include "regcharclass.h"
+/* Inline ASM due to testing with UBsan and undefined behavior. ASM does
+ * not suffer C rules. See https://github.com/Perl/perl5/issues/17508.
+ */
+#if defined(__GNUC__) || defined(__clang__) || defined(__INTEL_COMPILER)
+# define GCC_INLINE_ASM 1
+#endif
+
/* variations on pp_null */
PP(pp_stub)
@@ -2618,7 +2625,13 @@
tryAMAGICbin_MG(mult_amg, AMGf_assign);
{
dPOPTOPiirl_nomg;
+#if defined(GCC_INLINE_ASM) && (defined(__x86_64__) || defined(__amd64__))
+ long int a;
+ __asm__ ("imul %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
+ SETi( a );
+#else
SETi( left * right );
+#endif
RETURN;
}
}
@@ -2637,7 +2650,7 @@
/* avoid FPE_INTOVF on some platforms when num is IV_MIN */
if (value == -1)
- value = - num;
+ value = -(unsigned long int)num;
else
value = num / value;
SETi(value);
@@ -2693,7 +2706,13 @@
tryAMAGICbin_MG(add_amg, AMGf_assign);
{
dPOPTOPiirl_ul_nomg;
+#if defined(GCC_INLINE_ASM) && (defined(__x86_64__) || defined(__amd64__))
+ long int a;
+ __asm__ ("add %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
+ SETi( a );
+#else
SETi( left + right );
+#endif
RETURN;
}
}
@@ -2704,7 +2723,13 @@
tryAMAGICbin_MG(subtr_amg, AMGf_assign);
{
dPOPTOPiirl_ul_nomg;
+#if defined(GCC_INLINE_ASM) && (defined(__x86_64__) || defined(__amd64__))
+ long int a;
+ __asm__ ("sub %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
+ SETi( a );
+#else
SETi( left - right );
+#endif
RETURN;
}
}
@@ -2800,10 +2825,10 @@
tryAMAGICun_MG(neg_amg, 0);
if (S_negate_string(aTHX)) return NORMAL;
{
- SV * const sv = TOPs;
- IV const i = SvIV_nomg(sv);
- SETi(-i);
- return NORMAL;
+ SV * const sv = TOPs;
+ IV const i = SvIV_nomg(sv);
+ SETi( -(unsigned long int)i );
+ return NORMAL;
}
}
Other arch's will need similar code.
Be sure to include Intel ICC (__INTEL_COMPILER
) in the work-arounds. ICC is ruthless and removes undefined behavior every chance it gets. The compiler is happy to turn a functioning program into a non-functioning one. (Speaking from experience as someone who had 1<<31
silently removed during compilation. I eventually found it using UBsan).
On Fri, 31 Jan 2020 at 04:57, Jeffrey Walton notifications@github.com wrote:
This appears to clear the undefined behavior in the shifts for the 5.30.1 release tarball.
I hit it with a sledge hammer using sed. Confirming to a coding standard was not a priority for me.
sed -i 's/1 <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/2 <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/3 <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/1L <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/2L <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/3L <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
Shouldn't the L cases be converted to UL? Not U? If those are 64 bit ops it will break no?
sed -i 's/1<</1U<</g' op.c sed -i 's/1 <</1U <</g' op.c
sed -i 's/1<</1U<</g' regcomp.c sed -i 's/1 <</1U <</g' regcomp.c
sed -i 's/1<</1U<</g' vms/vms.c sed -i 's/1 <</1U <</g' vms/vms.c
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17508?email_source=notifications&email_token=AAAZ5R3VV62QTD57MSMOSVDRAOORPA5CNFSM4KNUPNQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKNM25I#issuecomment-580570485, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R2DGIDGW7YCPPE2IDDRAOORPANCNFSM4KNUPNQQ .
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Fri, 31 Jan 2020 at 08:37, Jeffrey Walton notifications@github.com wrote:
@jkeenan https://github.com/jkeenan,
[the behavior] is documented to give whatever overflow behaviour your computer gives at the C level.
Ouch... That creates some extra work for the Perl maintainers. To avoid the undefined behavior in C but preserve existing behavior you have to use inline ASM. ASM does not suffer C rules.
I feel like there is some missing context here. Can someone post it please?
Here's what it looks like on x86_64 using GCC compatible compilers. The patch passes self-tests using UBsan.
--- pp.c +++ pp.c @@ -32,6 +32,13 @@
include "reentr.h"
include "regcharclass.h"
+/* Inline ASM due to testing with UBsan and undefined behavior. ASM does
- not suffer C rules. See https://github.com/Perl/perl5/issues/17508.
- */ +#if defined(GNUC) || defined(clang) || defined(__INTEL_COMPILER) +# define GCC_INLINE_ASM 1 +#endif
/ variations on pp_null /
PP(pp_stub) @@ -2618,7 +2625,13 @@ tryAMAGICbin_MG(mult_amg, AMGf_assign); { dPOPTOPiirl_nomg; +#if defined(GCC_INLINE_ASM) && (defined(__x86_64) || defined(amd64__))
- long long int a;
- asm ("imul %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
SETi( a ); +#else SETi( left * right ); +#endif RETURN; } } @@ -2637,7 +2650,7 @@
/ avoid FPE_INTOVF on some platforms when num is IV_MIN / if (value == -1)
- value = - num;
- value = -(unsigned long long int)num; else value = num / value; SETi(value); @@ -2693,7 +2706,13 @@ tryAMAGICbin_MG(add_amg, AMGf_assign); { dPOPTOPiirl_ul_nomg; +#if defined(GCC_INLINE_ASM) && (defined(__x86_64) || defined(amd64__))
- long long int a;
- asm ("add %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
- SETi( a ); +#else SETi( left + right ); +#endif RETURN; } } @@ -2704,7 +2723,13 @@ tryAMAGICbin_MG(subtr_amg, AMGf_assign); { dPOPTOPiirl_ul_nomg; +#if defined(GCC_INLINE_ASM) && (defined(__x86_64) || defined(amd64__))
- long long int a;
- asm ("sub %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
- SETi( a ); +#else SETi( left - right ); +#endif RETURN; } } @@ -2800,10 +2825,10 @@ tryAMAGICun_MG(neg_amg, 0); if (S_negate_string(aTHX)) return NORMAL; {
- SV * const sv = TOPs;
- IV const i = SvIV_nomg(sv);
- SETi(-i);
- return NORMAL;
- SV * const sv = TOPs;
- IV const i = SvIV_nomg(sv);
- SETi( -(unsigned long long int)i );
- return NORMAL; } }
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17508?email_source=notifications&email_token=AAAZ5R55DKKDOTCYVK7HRVTRAPIM3A5CNFSM4KNUPNQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKNZBQQ#issuecomment-580620482, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R5RXC4V7SCFY3JAHODRAPIM3ANCNFSM4KNUPNQQ .
-- perl -Mre=debug -e "/just|another|perl|hacker/"
I feel like there is some missing context here. Can someone post it please?
The first line quoted is from my original comment, not sure how it got attributed to Jim - github's quoting seems not quite perfectly reliable. The full paragraph was:
The pp_i_foo functions are for arithmetic in the scope of use integer; this is documented to give whatever overflow behaviour your computer gives at the C level. See perldoc integer.
The specific text is actually "[i]nternally, native integer arithmetic (as provided by your C compiler) is used", and I don't think we should be patching them, certainly not in the expectation that we'd have to start supporting assembler. If there are directives we could use to tell UBsan to ignore those chunks, that would be a more acceptable solution. If that is not an option we should probably skip the relevant tests under UBsan.
The 1 << var
cases are of a different nature - it's entirely plausible, at least for some of them, that they may be invoked in some situation in which the compiler will want to optimize for specific value(s) of var including the UB case, and the fix is not onerous for us to maintain.
Hugo
On Fri, Jan 31, 2020 at 03:14:43AM -0800, Hugo van der Sanden wrote:
The specific text is actually "[i]nternally, native integer arithmetic (as provided by your C compiler) is used", and I don't think we should be patching them, certainly not in the expectation that we'd have to start supporting assembler. If there are directives we could use to tell UBsan to ignore those chunks, that would be a more acceptable solution. If that is not an option we should probably skip the relevant tests under UBsan.
See the asan_ignore file in the git respository. Invoked with
-fsanitize-blacklist=pwd
/asan_ignore.
-- The optimist believes that he lives in the best of all possible worlds. As does the pessimist.
@hvds,
If there are directives we could use to tell UBsan to ignore those chunks, that would be a more acceptable solution.
Just my two cents as an outsider...
That does not really fix the problem. The problem UBsan is alerting to is undefined behavior. What that means in practice is, non-standard or non-deterministic behavior from the compiler. The compiler is free to remove undefined behavior. That means sometimes 1<<31
will work as expected, and other times 1<<31
will not work as expected. And the kicker is, it may work with an earlier version of the compiler, but fail with a later version of the same compiler.
1<<31
was used as a simple example, but the same applies to left * right
or left % right
or any other undefined behavior. As soon as analysis progresses to the point the compiler or linker can determine undefined behavior, then the code is subject to removal.
I had to include the linker in "as soon as analysis progresses to the point the compiler or linker can determine undefined behavior" because of LTO or Link Time Optimizations. The linker now optimizes at runtime, and the linker can remove code, too.
C's undefined behavior modulo Perl caused Andy Polyakov and me a lot of grief because Andy was doing something like this in Perl (with some hand waiving):
i<<32>>32
It was OK on most platforms, but it died on Solaris i86pc. It took us several days to figure out the problem was the code generated from the Perl sources on the platform. (In fact, Andy and I were just talking about it, and he checked in this commit: ab67a8e6ee2d).
Note that perl is supposed to be compiled with -fwrapv
and because of that the operations inside pp_i_*
functions don't trigger UB. The bitshifts are obviously wrong, though.
I believe both of the issues described here are fixed:
I couldn't reproduce either when testing blead@4ddebfb8085bf6bee593c51c87c2ebf3c3ab40f1
I believe both of the issues described here are fixed:
* [fc23c91](https://github.com/Perl/perl5/commit/fc23c914f8de7cb85f58830142e0164864d4601c) - regcomp.c: Fix undefined behavior * [98fca45](https://github.com/Perl/perl5/commit/98fca457abf3b2b7317c99eb4b4ce72909e201f1) - avoid signed integer overflow in "use integer" ops
I couldn't reproduce either when testing blead@4ddebfb8085bf6bee593c51c87c2ebf3c3ab40f1
@noloader, can you please test whether the commits @tonycoz pushed in April (which presumably are found in perl-5.38.0) solve your original problem? (If we don't hear from you in a reasonable amount of time, we will assume the problem is fixed and close this ticket.) Thanks.
I believe both of the issues described here are fixed:
* [fc23c91](https://github.com/Perl/perl5/commit/fc23c914f8de7cb85f58830142e0164864d4601c) - regcomp.c: Fix undefined behavior * [98fca45](https://github.com/Perl/perl5/commit/98fca457abf3b2b7317c99eb4b4ce72909e201f1) - avoid signed integer overflow in "use integer" ops
I couldn't reproduce either when testing blead@4ddebfb8085bf6bee593c51c87c2ebf3c3ab40f1
@noloader, can you please test whether the commits @tonycoz pushed in April (which presumably are found in perl-5.38.0) solve your original problem? (If we don't hear from you in a reasonable amount of time, we will assume the problem is fixed and close this ticket.) Thanks.
No response; closing.
Hi Everyone,
I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.
CFLAGS
andCXXFLAGS
include:Then I run
make
andmake check
.It looks like there's some undefined behavior in the Perl sources:
and:
and:
The
re_comp.c:20010
failure can probably be cleared by casting to an unsigned int.1U << 31
does not suffer the UB. That will clear a fair number of findings.I'm not sure about the
pp.c:2707
andpp.c:2621
findings. Thepp.c:2707
function is below, but I don't know how to clear it:The
pp.c:2621
function ispp_i_multiply
, and I don't know how to clear it either.The findings are easy enough to duplicate when using UBsan.
It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.