Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.96k stars 555 forks source link

UBsan findings in Perl 5.30.1 #17508

Closed noloader closed 11 months ago

noloader commented 4 years ago

Hi Everyone,

I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build. CFLAGS and CXXFLAGS include:

-fsanitize=undefined -fno-sanitize-recover

Then I run make and make check.

It looks like there's some undefined behavior in the Perl sources:

re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
# Failed test 2258 - Verify compilation of
[\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}]

and:

lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow:
-9223372036854775808 - 1 cannot be represented in type 'long int'
FAILED--expected 15 tests, saw 0

and:

dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int'
FAILED--expected 18 tests, saw 12

The re_comp.c:20010 failure can probably be cleared by casting to an unsigned int. 1U << 31 does not suffer the UB. That will clear a fair number of findings.

I'm not sure about the pp.c:2707 and pp.c:2621 findings. The pp.c:2707 function is below, but I don't know how to clear it:

PP(pp_i_subtract)
{
    dSP; dATARGET;
    tryAMAGICbin_MG(subtr_amg, AMGf_assign);
    {
      dPOPTOPiirl_ul_nomg;
      SETi( left - right );
      RETURN;
    }
}

The pp.c:2621 function is pp_i_multiply, and I don't know how to clear it either.

The findings are easy enough to duplicate when using UBsan.

It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.

hvds commented 4 years ago

I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build. CFLAGS and CXXFLAGS include:

-fsanitize=undefined -fno-sanitize-recover

Then I run make and make check.

It looks like there's some undefined behavior in the Perl sources:

re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
# Failed test 2258 - Verify compilation of
[\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}]

Hmm, the test here is: if (flags & (1<<bit)), but flags is declared as U32 which should surely be represented by an unsigned int. This could be a deeper problem in your build.

lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow:
-9223372036854775808 - 1 cannot be represented in type 'long int'
FAILED--expected 15 tests, saw 0

and:

dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int'
FAILED--expected 18 tests, saw 12

The pp_i_foo functions are for arithmetic in the scope of use integer; this is documented to give whatever overflow behaviour your computer gives at the C level. See perldoc integer.

Hugo

jkeenan commented 4 years ago

On 1/30/20 6:42 AM, Jeffrey Walton wrote:

Hi Everyone,

I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.

|CFLAGS| and |CXXFLAGS| include:

|-fsanitize=undefined -fno-sanitize-recover |

It looks like there's some undefined behavior in the Perl sources:

|re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}] |

and:

|lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0 |

and:

|dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12 |

The |re_comp.c:20010| failure can probably be cleared by casting to an unsigned int. |1U << 31| does not suffer the UB. That will clear a fair number of findings.

I'm not sure about the |pp.c:2707| and |pp.c:2621| findings.

There are other failures. They are easy enough to duplicate when using UBsan.

It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.

I believe that this is the first time that UBsan has been mentioned on this list. Is the material in the following links reliable?

https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

https://www.kernel.org/doc/html/latest/dev-tools/ubsan.html

Is UBsan relevant for compilers other than gcc and g++?

Thank you very much. Jim Keenan

jkeenan commented 4 years ago

On 1/30/20 6:42 AM, Jeffrey Walton wrote:

Hi Everyone,

I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.

|CFLAGS| and |CXXFLAGS| include:

|-fsanitize=undefined -fno-sanitize-recover |

Can you provide either (a) the complete invocation to ./Configure you are using to generate this build; or (b) attach the output of './perl -Ilib -V' as a text file?

It looks like there's some undefined behavior in the Perl sources:

|re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}] |

and:

|lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0 |

and:

|dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12 |

The |re_comp.c:20010| failure can probably be cleared by casting to an unsigned int. |1U << 31| does not suffer the UB. That will clear a fair number of findings.

I'm not sure about the |pp.c:2707| and |pp.c:2621| findings.

There are other failures. They are easy enough to duplicate when using UBsan.

It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.

Thank you very much. Jim Keenan

jkeenan commented 4 years ago

On 1/30/20 8:00 AM, James E Keenan wrote:

On 1/30/20 6:42 AM, Jeffrey Walton wrote:

Hi Everyone,

I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build.

|CFLAGS| and |CXXFLAGS| include:

|-fsanitize=undefined -fno-sanitize-recover |

It looks like there's some undefined behavior in the Perl sources:

|re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' # Failed test 2258 - Verify compilation of [\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}] |

and:

|lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0 |

and:

|dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12 |

The |re_comp.c:20010| failure can probably be cleared by casting to an unsigned int. |1U << 31| does not suffer the UB. That will clear a fair number of findings.

I'm not sure about the |pp.c:2707| and |pp.c:2621| findings.

There are other failures. They are easy enough to duplicate when using UBsan.

It may be a good idea to add Asan, Msan and UBsan testing to the Perl CI pipeline.

I believe that this is the first time that UBsan has been mentioned on this list. Is the material in the following links reliable?

https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

https://www.kernel.org/doc/html/latest/dev-tools/ubsan.html

At the above URL I read:

"To enable UBSAN configure kernel with: CONFIG_UBSAN=y"

So, to use UBsan, is one required to recompile the entire Linux kernel? That doesn't sound feasible for most developers.

Also, if I were to install a Debian package for this functionality (or a pkg for *BSD), which should I use?

[snip]

Thank you very much. Jim Keenan

noloader commented 4 years ago

@jkeenan ,

I believe that this is the first time that UBsan has been mentioned on this list. Is the material in the following links reliable?

https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

Yes.

To use the sanitizers you just add the appropriate compiler options to CFLAGS and CXXFLAGS. The support is baked into some compilers. There is not much too it, other than a custom build using extra CFLAGS and CXXFLAGS.

https://www.kernel.org/doc/html/latest/dev-tools/ubsan.html

No.

Is UBsan relevant for compilers other than gcc and g++?

Yes. Clang also provides them. I'm not sure about Intel's ICC or other compilers, like IBM XL C/C++. (IBM's compiler on Linux now uses LLVM's front-end).

It is usually enough to test with GCC or Clang. The other compilers receive the herd immunity.

noloader commented 4 years ago

@jkeenan,

Can you provide either (a) the complete invocation to ./Configure you are using to generate this build; or (b) attach the output of './perl -Ilib -V' as a text file?

My Configure for UBsan looks like this. I used /home/jwalton/tmp/ubsan as prefix.

PERL_PKGCONFIG: /home/jwalton/tmp/ubsan/lib/pkgconfig
 PERL_CPPFLAGS: -I/home/jwalton/tmp/ubsan/include -DNDEBUG -DTEST_UBSAN=1
   PERL_CFLAGS: -g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread
 PERL_CXXFLAGS: -g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread
  PERL_LDFLAGS: -L/home/jwalton/tmp/ubsan/lib -fsanitize=undefined -fno-sanitize-recover -Wl,-R,'$$ORIGIN/../lib' -Wl,-R,/home/jwalton/tmp/ubsan/lib -Wl,--enable-new-dtags

if ! ./Configure -des \
     -Dprefix="$TEST_PREFIX" \
     -Dlibdir="$TEST_LIBDIR" \
     -Dpkgconfig="$PERL_PKGCONFIG" \
     -Acppflags="$PERL_CPPFLAGS" \
     -Accflags="$PERL_CFLAGS" \
     -Aldflags="$PERL_LDFLAGS" \
     -Dextras="Test::More Text::Template"
then
    echo "Failed to configure Perl"
    exit 1
fi

And:

$ "$HOME/tmp/ubsan/bin/perl" -Ilib -V
Summary of my perl5 (revision 5 version 30 subversion 1) configuration:

  Platform:
    osname=linux
    osvers=5.0.0-37-generic
    archname=x86_64-linux
    uname='linux coffee 5.0.0-37-generic #40~18.04.1-ubuntu smp thu nov 14 12:06:39 utc 2019 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Dprefix=/home/jwalton/tmp/ubsan -Dlibdir=/home/jwalton/tmp/ubsan/lib -Dpkgconfig=/home/jwalton/tmp/ubsan/lib/pkgconfig -Acppflags=-I/home/jwalton/tmp/ubsan/include -DNDEBUG -DTEST_UBSAN=1 -Accflags=-g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread -Aldflags=-L/home/jwalton/tmp/ubsan/lib -fsanitize=undefined -fno-sanitize-recover -Wl,-R,'$$ORIGIN/../lib' -Wl,-R,/home/jwalton/tmp/ubsan/lib -Wl,--enable-new-dtags -Dextras=Test::More Text::Template'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2'
    cppflags='-I/home/jwalton/tmp/ubsan/include -DNDEBUG -DTEST_UBSAN=1 -g2 -O2 -fsanitize=undefined -fno-sanitize-recover -march=native -fPIC -pthread -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='7.4.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -L/home/jwalton/tmp/ubsan/lib -fsanitize=undefined -fno-sanitize-recover -Wl,-R,21651ORIGIN/../lib -Wl,-R,/home/jwalton/tmp/ubsan/lib -Wl,--enable-new-dtags -fstack-protector-strong -L/usr/local/lib'
    libpth=/home/jwalton/tmp/ubsan/lib /usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.27.so
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.27'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/home/jwalton/tmp/ubsan/lib -Wl,-R,21651ORIGIN/../lib -Wl,-R,/home/jwalton/tmp/ubsan/lib -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under linux
  Compiled at Jan 30 2020 06:48:11
  @INC:
    lib
    /home/jwalton/tmp/ubsan/lib/perl5/site_perl/5.30.1/x86_64-linux
    /home/jwalton/tmp/ubsan/lib/perl5/site_perl/5.30.1
    /home/jwalton/tmp/ubsan/lib/perl5/5.30.1/x86_64-linux
    /home/jwalton/tmp/ubsan/lib/perl5/5.30.1
noloader commented 4 years ago

@jkeenan,

This GitHub may be helpful for you: Noloader | Build-Scripts. I use them for installing modern tools on old systems. I also use them for isolated Sanitizer testing.

If you clone the repo on a machine with modern build tools and modern wget, then you should only need to perform the following. The heavy lifting is performed by the scripts. It builds dependencies as needed, and sets runtime paths so you don't need those stupid LD_LIBRARY_PATHs to run your program.

INSTX_UBSAN=1 INSTX_PREFIX=$HOME/tmp/ubsan ./build-perl.sh

The scripts always use the latest release tarball.

In Perl's case, the script can use Master by changing build-perl.sh to perform a git clone rather than a wget fetch. You have to do that by hand.

demerphq commented 4 years ago

On Thu, 30 Jan 2020, 20:58 Hugo van der Sanden, notifications@github.com wrote:

I'm testing Perl 5.30.1 release tarball on Ubuntu 18.04 x86_64 fully patched. I'm performing a UBsan build. CFLAGS and CXXFLAGS include:

-fsanitize=undefined -fno-sanitize-recover

Then I run make and make check.

It looks like there's some undefined behavior in the Perl sources:

re_comp.c:20010:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'

Failed test 2258 - Verify compilation of

[\x{10C}-{INFTY}\x{102}-\x{104}\x{108}-\x{10A}\x{107}]

Hmm, the test here is: if (flags & (1<<bit)), but flags is declared as U32 which should surely be represented by an unsigned int. This could be a deeper problem in your build.

The problem is the type of 1 is ambiguous and some compilers will assume a signed int. It should be 1U as Jeffery stated.

Yves

lib/integer ....................................................pp.c:2707:7: runtime error: signed integer overflow:

-9223372036854775808 - 1 cannot be represented in type 'long int' FAILED--expected 15 tests, saw 0

and:

dist/if/t/if ...................................................pp.c:2621:7: runtime error: signed integer overflow: 9999999999 * 10000000000 cannot be represented in type 'long int' FAILED--expected 18 tests, saw 12

The pp_i_foo functions are for arithmetic in the scope of use integer; this is documented to give whatever overflow behaviour your computer gives at the C level. See perldoc integer.

Hugo

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17508?email_source=notifications&email_token=AAAZ5R2SH25KBERSNAYJWNDRALFJBA5CNFSM4KNUPNQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKK4LVQ#issuecomment-580240854, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5RYWXIYLT4ENWWJAGV3RALFJBANCNFSM4KNUPNQQ .

hvds commented 4 years ago

The problem is the type of 1 is ambiguous and some compilers will assume a signed int. It should be 1U as Jeffery stated.

Ah good point, my mistake.

Hugo

noloader commented 4 years ago

This appears to clear the undefined behavior in the shifts for the 5.30.1 release tarball.

I hit it with a sledge hammer using sed. Confirming to a coding standard was not a priority for me.

sed -i 's/1 <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/2 <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/3 <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/1L <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/2L <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c
sed -i 's/3L <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c

sed -i 's/1<</1U<</g' op.c
sed -i 's/1 <</1U <</g' op.c

sed -i 's/1<</1U<</g' regcomp.c
sed -i 's/1 <</1U <</g' regcomp.c

sed -i 's/1<</1U<</g' vms/vms.c
sed -i 's/1 <</1U <</g' vms/vms.c

Here's the resulting patch I am using: perl.patch.

I'm going to work on the integer problems next.

noloader commented 4 years ago

@jkeenan,

[the behavior] is documented to give whatever overflow behaviour your computer gives at the C level.

Ouch... That creates some extra work for the Perl maintainers. To avoid the undefined behavior in C but preserve existing behavior you have to use inline ASM. ASM does not suffer C rules.

Here's what it looks like on x86_64 using GCC compatible compilers. The patch passes self-tests using UBsan.

--- pp.c
+++ pp.c
@@ -32,6 +32,13 @@
 #include "reentr.h"
 #include "regcharclass.h"

+/* Inline ASM due to testing with UBsan and undefined behavior. ASM does
+ * not suffer C rules. See https://github.com/Perl/perl5/issues/17508.
+ */
+#if defined(__GNUC__) || defined(__clang__) || defined(__INTEL_COMPILER)
+# define GCC_INLINE_ASM 1
+#endif
+
 /* variations on pp_null */

 PP(pp_stub)
@@ -2618,7 +2625,13 @@
     tryAMAGICbin_MG(mult_amg, AMGf_assign);
     {
       dPOPTOPiirl_nomg;
+#if defined(GCC_INLINE_ASM) && (defined(__x86_64__) || defined(__amd64__))
+      long int a;
+      __asm__ ("imul %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
+      SETi( a );
+#else
       SETi( left * right );
+#endif
       RETURN;
     }
 }
@@ -2637,7 +2650,7 @@

       /* avoid FPE_INTOVF on some platforms when num is IV_MIN */
       if (value == -1)
-          value = - num;
+          value = -(unsigned long int)num;
       else
           value = num / value;
       SETi(value);
@@ -2693,7 +2706,13 @@
     tryAMAGICbin_MG(add_amg, AMGf_assign);
     {
       dPOPTOPiirl_ul_nomg;
+#if defined(GCC_INLINE_ASM) && (defined(__x86_64__) || defined(__amd64__))
+      long int a;
+      __asm__ ("add %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
+      SETi( a );
+#else
       SETi( left + right );
+#endif
       RETURN;
     }
 }
@@ -2704,7 +2723,13 @@
     tryAMAGICbin_MG(subtr_amg, AMGf_assign);
     {
       dPOPTOPiirl_ul_nomg;
+#if defined(GCC_INLINE_ASM) && (defined(__x86_64__) || defined(__amd64__))
+      long int a;
+      __asm__ ("sub %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
+      SETi( a );
+#else
       SETi( left - right );
+#endif
       RETURN;
     }
 }
@@ -2800,10 +2825,10 @@
     tryAMAGICun_MG(neg_amg, 0);
     if (S_negate_string(aTHX)) return NORMAL;
     {
-   SV * const sv = TOPs;
-   IV const i = SvIV_nomg(sv);
-   SETi(-i);
-   return NORMAL;
+   SV * const sv = TOPs;
+   IV const i = SvIV_nomg(sv);
+   SETi( -(unsigned long int)i );
+   return NORMAL;
     }
 }

Other arch's will need similar code.

Be sure to include Intel ICC (__INTEL_COMPILER) in the work-arounds. ICC is ruthless and removes undefined behavior every chance it gets. The compiler is happy to turn a functioning program into a non-functioning one. (Speaking from experience as someone who had 1<<31 silently removed during compilation. I eventually found it using UBsan).

demerphq commented 4 years ago

On Fri, 31 Jan 2020 at 04:57, Jeffrey Walton notifications@github.com wrote:

This appears to clear the undefined behavior in the shifts for the 5.30.1 release tarball.

I hit it with a sledge hammer using sed. Confirming to a coding standard was not a priority for me.

sed -i 's/1 <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/2 <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/3 <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/1L <</1U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/2L <</2U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c sed -i 's/3L <</3U <</g' cpan/Compress-Raw-Zlib/zlib-src/zutil.c

Shouldn't the L cases be converted to UL? Not U? If those are 64 bit ops it will break no?

sed -i 's/1<</1U<</g' op.c sed -i 's/1 <</1U <</g' op.c

sed -i 's/1<</1U<</g' regcomp.c sed -i 's/1 <</1U <</g' regcomp.c

sed -i 's/1<</1U<</g' vms/vms.c sed -i 's/1 <</1U <</g' vms/vms.c

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17508?email_source=notifications&email_token=AAAZ5R3VV62QTD57MSMOSVDRAOORPA5CNFSM4KNUPNQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKNM25I#issuecomment-580570485, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R2DGIDGW7YCPPE2IDDRAOORPANCNFSM4KNUPNQQ .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq commented 4 years ago

On Fri, 31 Jan 2020 at 08:37, Jeffrey Walton notifications@github.com wrote:

@jkeenan https://github.com/jkeenan,

[the behavior] is documented to give whatever overflow behaviour your computer gives at the C level.

Ouch... That creates some extra work for the Perl maintainers. To avoid the undefined behavior in C but preserve existing behavior you have to use inline ASM. ASM does not suffer C rules.

I feel like there is some missing context here. Can someone post it please?

Here's what it looks like on x86_64 using GCC compatible compilers. The patch passes self-tests using UBsan.

--- pp.c +++ pp.c @@ -32,6 +32,13 @@

include "reentr.h"

include "regcharclass.h"

+/* Inline ASM due to testing with UBsan and undefined behavior. ASM does

  • */ +#if defined(GNUC) || defined(clang) || defined(__INTEL_COMPILER) +# define GCC_INLINE_ASM 1 +#endif
  • / variations on pp_null /

    PP(pp_stub) @@ -2618,7 +2625,13 @@ tryAMAGICbin_MG(mult_amg, AMGf_assign); { dPOPTOPiirl_nomg; +#if defined(GCC_INLINE_ASM) && (defined(__x86_64) || defined(amd64__))

  • long long int a;
  • asm ("imul %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
  • SETi( a ); +#else SETi( left * right ); +#endif RETURN; } } @@ -2637,7 +2650,7 @@

    / avoid FPE_INTOVF on some platforms when num is IV_MIN / if (value == -1)

  • value = - num;
  • value = -(unsigned long long int)num; else value = num / value; SETi(value); @@ -2693,7 +2706,13 @@ tryAMAGICbin_MG(add_amg, AMGf_assign); { dPOPTOPiirl_ul_nomg; +#if defined(GCC_INLINE_ASM) && (defined(__x86_64) || defined(amd64__))
  • long long int a;
  • asm ("add %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
  • SETi( a ); +#else SETi( left + right ); +#endif RETURN; } } @@ -2704,7 +2723,13 @@ tryAMAGICbin_MG(subtr_amg, AMGf_assign); { dPOPTOPiirl_ul_nomg; +#if defined(GCC_INLINE_ASM) && (defined(__x86_64) || defined(amd64__))
  • long long int a;
  • asm ("sub %%rcx, %%rax;" : "=a" (a) : "a" (left), "c" (right) : "cc" );
  • SETi( a ); +#else SETi( left - right ); +#endif RETURN; } } @@ -2800,10 +2825,10 @@ tryAMAGICun_MG(neg_amg, 0); if (S_negate_string(aTHX)) return NORMAL; {
  • SV * const sv = TOPs;
  • IV const i = SvIV_nomg(sv);
  • SETi(-i);
  • return NORMAL;
  • SV * const sv = TOPs;
  • IV const i = SvIV_nomg(sv);
  • SETi( -(unsigned long long int)i );
  • return NORMAL; } }

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17508?email_source=notifications&email_token=AAAZ5R55DKKDOTCYVK7HRVTRAPIM3A5CNFSM4KNUPNQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKNZBQQ#issuecomment-580620482, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R5RXC4V7SCFY3JAHODRAPIM3ANCNFSM4KNUPNQQ .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

hvds commented 4 years ago

I feel like there is some missing context here. Can someone post it please?

The first line quoted is from my original comment, not sure how it got attributed to Jim - github's quoting seems not quite perfectly reliable. The full paragraph was:

The pp_i_foo functions are for arithmetic in the scope of use integer; this is documented to give whatever overflow behaviour your computer gives at the C level. See perldoc integer.

The specific text is actually "[i]nternally, native integer arithmetic (as provided by your C compiler) is used", and I don't think we should be patching them, certainly not in the expectation that we'd have to start supporting assembler. If there are directives we could use to tell UBsan to ignore those chunks, that would be a more acceptable solution. If that is not an option we should probably skip the relevant tests under UBsan.

The 1 << var cases are of a different nature - it's entirely plausible, at least for some of them, that they may be invoked in some situation in which the compiler will want to optimize for specific value(s) of var including the UB case, and the fix is not onerous for us to maintain.

Hugo

iabyn commented 4 years ago

On Fri, Jan 31, 2020 at 03:14:43AM -0800, Hugo van der Sanden wrote:

The specific text is actually "[i]nternally, native integer arithmetic (as provided by your C compiler) is used", and I don't think we should be patching them, certainly not in the expectation that we'd have to start supporting assembler. If there are directives we could use to tell UBsan to ignore those chunks, that would be a more acceptable solution. If that is not an option we should probably skip the relevant tests under UBsan.

See the asan_ignore file in the git respository. Invoked with -fsanitize-blacklist=pwd/asan_ignore.

-- The optimist believes that he lives in the best of all possible worlds. As does the pessimist.

noloader commented 4 years ago

@hvds,

If there are directives we could use to tell UBsan to ignore those chunks, that would be a more acceptable solution.

Just my two cents as an outsider...

That does not really fix the problem. The problem UBsan is alerting to is undefined behavior. What that means in practice is, non-standard or non-deterministic behavior from the compiler. The compiler is free to remove undefined behavior. That means sometimes 1<<31 will work as expected, and other times 1<<31 will not work as expected. And the kicker is, it may work with an earlier version of the compiler, but fail with a later version of the same compiler.

1<<31 was used as a simple example, but the same applies to left * right or left % right or any other undefined behavior. As soon as analysis progresses to the point the compiler or linker can determine undefined behavior, then the code is subject to removal.

I had to include the linker in "as soon as analysis progresses to the point the compiler or linker can determine undefined behavior" because of LTO or Link Time Optimizations. The linker now optimizes at runtime, and the linker can remove code, too.

C's undefined behavior modulo Perl caused Andy Polyakov and me a lot of grief because Andy was doing something like this in Perl (with some hand waiving):

i<<32>>32

It was OK on most platforms, but it died on Solaris i86pc. It took us several days to figure out the problem was the code generated from the Perl sources on the platform. (In fact, Andy and I were just talking about it, and he checked in this commit: ab67a8e6ee2d).

xenu commented 4 years ago

Note that perl is supposed to be compiled with -fwrapv and because of that the operations inside pp_i_* functions don't trigger UB. The bitshifts are obviously wrong, though.

tonycoz commented 1 year ago

I believe both of the issues described here are fixed:

I couldn't reproduce either when testing blead@4ddebfb8085bf6bee593c51c87c2ebf3c3ab40f1

jkeenan commented 1 year ago

I believe both of the issues described here are fixed:

* [fc23c91](https://github.com/Perl/perl5/commit/fc23c914f8de7cb85f58830142e0164864d4601c) - regcomp.c: Fix undefined behavior

* [98fca45](https://github.com/Perl/perl5/commit/98fca457abf3b2b7317c99eb4b4ce72909e201f1) - avoid signed integer overflow in "use integer" ops

I couldn't reproduce either when testing blead@4ddebfb8085bf6bee593c51c87c2ebf3c3ab40f1

@noloader, can you please test whether the commits @tonycoz pushed in April (which presumably are found in perl-5.38.0) solve your original problem? (If we don't hear from you in a reasonable amount of time, we will assume the problem is fixed and close this ticket.) Thanks.

jkeenan commented 11 months ago

I believe both of the issues described here are fixed:

* [fc23c91](https://github.com/Perl/perl5/commit/fc23c914f8de7cb85f58830142e0164864d4601c) - regcomp.c: Fix undefined behavior

* [98fca45](https://github.com/Perl/perl5/commit/98fca457abf3b2b7317c99eb4b4ce72909e201f1) - avoid signed integer overflow in "use integer" ops

I couldn't reproduce either when testing blead@4ddebfb8085bf6bee593c51c87c2ebf3c3ab40f1

@noloader, can you please test whether the commits @tonycoz pushed in April (which presumably are found in perl-5.38.0) solve your original problem? (If we don't hear from you in a reasonable amount of time, we will assume the problem is fixed and close this ticket.) Thanks.

No response; closing.