Open p5pRT opened 6 years ago
This is a bug report for perl from mike@nrdvana.net, generated with the help of perlbug 1.40 running under perl 5.26.1.
I've run into a bug, seemingly in every perl version (at least 5.12 through 5.26), that is triggered by these steps: 1) create a file handle on top of a scalar, for reading; 2) set a wide-character Unicode layer (16-bit or 32-bit) on the file handle; 3) read from the file handle. The read causes the byte at the current position in the scalar to be set to zero. The scalar should never be modified through a read-only file handle.
Example: https://gist.github.com/nrdvana/fe01eeda2e325d825ca811267bd349ff
use strict;
use warnings;
use autodie;

sub hexdump { join ' ', map sprintf("%02X", ord $_), split //, shift }

my ($in_fh, $input, $buf1);

print "utf-16-le\n\n";

$input= "\xFF\xFE\x11\x22\x33\x44\x55\x66";
open($in_fh, "<", \$input);
print "after open input=".hexdump($input)."\n";
binmode($in_fh, ":encoding(utf-16-le)");
print "after binmode input=".hexdump($input)."\n";
read($in_fh, $buf1, 1);
print "after read input=".hexdump($input)
  ." buf1=".hexdump($buf1)."\n";

print "\nutf-16-be\n\n";

$input= "\xFE\xFF\x11\x22\x33\x44\x55\x66";
open($in_fh, "<", \$input);
print "after open input=".hexdump($input)."\n";
binmode($in_fh, ":encoding(utf-16-be)");
print "after binmode input=".hexdump($input)."\n";
read($in_fh, $buf1, 1);
print "after read input=".hexdump($input)
  ." buf1= ".hexdump($buf1)."\n";

print "\nutf-32-le\n\n";

$input= "\xFF\xFE\x00\x00\x33\x44\x00\x00";
open($in_fh, "<", \$input);
print "after open input=".hexdump($input)."\n";
binmode($in_fh, ":encoding(utf-32-le)");
print "after binmode input=".hexdump($input)."\n";
read($in_fh, $buf1, 1);
print "after read input=".hexdump($input)
  ." buf1= ".hexdump($buf1)."\n";
Output --------------------------
utf-16-le

after open input=FF FE 11 22 33 44 55 66
after binmode input=FF FE 11 22 33 44 55 66
after read input=00 FE 11 22 33 44 55 66 buf1= FEFF

utf-16-be

after open input=FE FF 11 22 33 44 55 66
after binmode input=FE FF 11 22 33 44 55 66
after read input=00 FF 11 22 33 44 55 66 buf1= FEFF

utf-32-le

after open input=FF FE 00 00 33 44 00 00
after binmode input=FF FE 00 00 33 44 00 00
after read input=00 FE 00 00 33 44 00 00 buf1= FEFF
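The three cases above can also be folded into one loop that reports, per encoding, whether the backing scalar changed after a single read. This is only a sketch under stated assumptions: `probe_scalar_read` is a hypothetical helper name that does not appear in the report, and the result depends on the perl being run (a build with the fix should report the scalar as unchanged).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original report): open a read-only
# in-memory handle on a copy of $bytes, apply the encoding layer, read one
# character, and report whether the backing scalar was altered.
sub probe_scalar_read {
    my ($encoding, $bytes) = @_;
    my $input = $bytes;    # working copy that backs the file handle
    open(my $fh, '<', \$input) or die "open failed: $!";
    binmode($fh, ":encoding($encoding)") or die "binmode failed: $!";
    my $count = read($fh, my $char, 1);
    die "read failed for $encoding" unless $count;
    # Inspect the scalar before closing the handle.
    my %result = (
        char     => ord($char),
        modified => ($input ne $bytes) ? 1 : 0,
        after    => join(' ', map { sprintf('%02X', ord($_)) } split //, $input),
    );
    close($fh);
    return \%result;
}

my @cases = (
    [ 'utf-16-le', "\xFF\xFE\x11\x22\x33\x44\x55\x66" ],
    [ 'utf-16-be', "\xFE\xFF\x11\x22\x33\x44\x55\x66" ],
    [ 'utf-32-le', "\xFF\xFE\x00\x00\x33\x44\x00\x00" ],
);

for my $case (@cases) {
    my ($encoding, $bytes) = @$case;
    my $r = probe_scalar_read($encoding, $bytes);
    printf "%-10s read U+%04X, scalar %s (%s)\n",
        $encoding, $r->{char},
        ($r->{modified} ? 'MODIFIED' : 'unchanged'),
        $r->{after};
}
```

Because the helper reads from a private copy, the caller's data is never touched, which makes the check safe to run against real input.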
Flags: category=core severity=medium
Site configuration information for perl 5.26.1:
Configured by builduser at Fri Jan 5 02:49:35 UTC 2018.
Summary of my perl5 (revision 5 version 26 subversion 1) configuration:

  Platform:
    osname=linux
    osvers=4.14.11-1-arch
    archname=x86_64-linux-thread-multi
    uname='linux felix 4.14.11-1-arch #1 smp preempt wed jan 3 07:02:42 utc 2018 x86_64 gnulinux '
    config_args='-des -Dusethreads -Duseshrplib -Doptimize=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Dprefix=/usr -Dvendorprefix=/usr -Dprivlib=/usr/share/perl5/core_perl -Darchlib=/usr/lib/perl5/5.26/core_perl -Dsitelib=/usr/share/perl5/site_perl -Dsitearch=/usr/lib/perl5/5.26/site_perl -Dvendorlib=/usr/share/perl5/vendor_perl -Dvendorarch=/usr/lib/perl5/5.26/vendor_perl -Dscriptdir=/usr/bin/core_perl -Dsitescript=/usr/bin/site_perl -Dvendorscript=/usr/bin/vendor_perl -Dinc_version_list=none -Dman1ext=1perl -Dman3ext=3perl -Dcccdlflags='-fPIC' -Dlddlflags=-shared -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Dldflags=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='7.2.1 20171224'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags ='-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-pc-linux-gnu/7.2.1/include-fixed /usr/lib /lib/../lib /usr/lib/../lib /lib /lib64 /usr/lib64
    libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.26.so
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.26'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.26/core_perl/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -L/usr/local/lib -fstack-protector-strong'
@INC for perl 5.26.1:
    /usr/lib/perl5/5.26/site_perl
    /usr/share/perl5/site_perl
    /usr/lib/perl5/5.26/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib/perl5/5.26/core_perl
    /usr/share/perl5/core_perl
(pardon for trimming the environment, but it had things I didn't want to publish)
On Wed, 07 Mar 2018 23:16:28 -0800, mike@nrdvana.net wrote:
I've run into a bug, seemingly in every perl version (at least 5.12 through 5.26), that is triggered by these steps: 1) create a file handle on top of a scalar, for reading; 2) set a wide-character Unicode layer (16-bit or 32-bit) on the file handle; 3) read from the file handle. The read causes the byte at the current position in the scalar to be set to zero. The scalar should never be modified through a read-only file handle.
This is a duplicate of #132833, which was fixed in fed9fe5b48ccdffef9065a03c12c237cc7418de6.
I don't see this commit in the 5.26 votes file.
Tony
The RT System itself - Status changed from 'new' to 'open'
Migrated from rt.perl.org#132949 (status was 'open')
Searchable as RT132949