Closed p5pRT closed 6 years ago
This is a bug report for perl from, generated with the help of perlbug 1.40 running under perl 5.26.1.
`perldoc -f sysread` says that using :utf8 handles is perfectly okay:
Note that if the filehandle has been marked as ":utf8", Unicode characters are read instead of bytes (the LENGTH, OFFSET, and the return value of "sysread" are in Unicode characters). The ":encoding(...)" layer implicitly introduces the ":utf8" layer. See "binmode", "open", and the open pragma.
However, doing so provokes this at run time:
sysread() is deprecated on :utf8 handles. This will be a fatal error in Perl 5.30
Suggest changing the documentation to say that this feature is deprecated, so people don't waste time writing code which will become wrong later.
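For readers who want to see the behaviour described in the report, here is a minimal sketch. It assumes a scratch file created with File::Temp, and it captures whatever perl says rather than asserting exact wording: the deprecation warning quoted above on perls such as 5.26.1, or a fatal error on 5.30 and later.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small UTF-8 encoded file ("cafe" with an acute e, then newline)
# as raw bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "caf\xC3\xA9\n";
close $tmp or die "close: $!";

# Reopen it with the :utf8 layer, as the quoted perlfunc text suggests.
open my $fh, '<:utf8', $path or die "open: $!";

# Collect either the deprecation warning or the fatal error so that
# the script itself keeps running on any perl version.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };
my $ok = eval { sysread($fh, my $buf, 4); 1 };
my $complaint = $ok ? join('', @warnings) : $@;

print "perl $]: ", ($complaint ne '' ? $complaint : "no complaint\n");
close $fh;
```

On a perl old enough to predate the deprecation, the script is expected to print "no complaint".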
Flags: category=core severity=low
Site configuration information for perl 5.26.1:
Configured by Ubuntu at Sat Mar 10 18:40:42 UTC 2018.
Summary of my perl5 (revision 5 version 26 subversion 1) configuration:
uname='linux localhost 4.9.0 #1 smp debian 4.9.0 x86_64 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dcc=x86_64-linux-gnu-gcc -Dcpp=x86_64-linux-gnu-cpp -Dld=x86_64-linux-gnu-gcc -Dccflags=-DDEBIAN -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=/build/perl-5CtO_8/perl-5.26.1=. -fstack-protector-strong -Wformat -Werror=format-security -Dldflags= -Wl,-Bsymbolic-functions -Wl,-z,relro -Dlddlflags=-shared -Wl,-Bsymbolic-functions -Wl,-z,relro -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.26 -Darchlib=/usr/lib/x86_64-linux-gnu/perl/5.26 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/x86_64-linux-gnu/perl5/5.26 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.26.1 -Dsitearch=/usr/local/lib/x86_64-linux-gnu/perl/5.26.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -Ui_xlocale -Uversiononly -DDEBUGGING=-g -Doptimize=-O2 -dEs -Duseshrplib'
ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-O2 -g'
cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fno-strict-aliasing -pipe -I/usr/local/include'
Linker and Libraries:
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
perllibs=-ldl -lm -lpthread -lc -lcrypt
Dynamic Linking:
lddlflags='-shared -L/usr/local/lib -fstack-protector-strong'
Locally applied patches:
DEBPKG:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN.
DEBPKG:debian/db_file_ver - Remove overly restrictive DB_File version check.
DEBPKG:debian/doc_info - Replace generic man(1) instructions with Debian-specific information.
DEBPKG:debian/enc2xs_inc - Tweak enc2xs to follow symlinks and ignore missing @INC directories.
DEBPKG:debian/errno_ver - Remove Errno version check due to upgrade problems with long-running processes.
DEBPKG:debian/libperl_embed_doc - Note that libperl-dev package is required for embedded linking
DEBPKG:fixes/respect_umask - Respect umask during installation
DEBPKG:debian/writable_site_dirs - Set umask approproately for site install directories
DEBPKG:debian/extutils_set_libperl_path - EU:MM: set location of libperl.a under /usr/lib
DEBPKG:debian/no_packlist_perllocal - Don't install .packlist or perllocal.pod for perl or vendor
DEBPKG:debian/fakeroot - Postpone LD_LIBRARY_PATH evaluation to the binary targets.
DEBPKG:debian/instmodsh_doc - Debian policy doesn't install .packlist files for core or vendor.
DEBPKG:debian/ld_run_path - Remove standard libs from LD_RUN_PATH as per Debian policy.
DEBPKG:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable.
DEBPKG:debian/perlivp - Make perlivp skip include directories in /usr/local
DEBPKG:debian/deprecate-with-apt - Point users to Debian packages of deprecated core modules
DEBPKG:debian/squelch-locale-warnings - Squelch locale warnings in Debian package maintainer scripts
DEBPKG:debian/patchlevel - List packaged patches for 5.26.1-6 in patchlevel.h
DEBPKG:fixes/document_makemaker_ccflags - [ #68613] Document that CCFLAGS should include $Config{ccflags}
DEBPKG:debian/find_html2text - Configure CPAN::Distribution with correct name of html2text
DEBPKG:debian/perl5db-x-terminal-emulator.patch - Invoke x-terminal-emulator rather than xterm in
DEBPKG:debian/cpan-missing-site-dirs - Fix CPAN::FirstTime defaults with nonexisting site dirs if a parent is writable
DEBPKG:fixes/memoize_storable_nstore - [ #77790] Memoize::Storable: respect 'nstore' option not respected
DEBPKG:debian/makemaker-pasthru - Pass LD settings through to subdirectories
DEBPKG:debian/makemaker-manext - Make EU::MakeMaker honour MANnEXT settings in generated manpage headers
DEBPKG:debian/kfreebsd-softupdates - Work around Debian Bug#796798
DEBPKG:fixes/autodie-scope - Fix a scoping issue with "no autodie" and the "system" sub
DEBPKG:fixes/memoize-pod - [ #89441] Fix POD errors in Memoize
DEBPKG:debian/hurd-softupdates - Fix t/op/stat.t failures on hurd
DEBPKG:fixes/math_complex_doc_great_circle - [ #114104] Math::Trig: clarify definition of great_circle_midpoint
DEBPKG:fixes/math_complex_doc_see_also - [ #114105] Math::Trig: add missing SEE ALSO
DEBPKG:fixes/math_complex_doc_angle_units - [ #114106] Math::Trig: document angle units
DEBPKG:fixes/cpan_web_link - CPAN: Add link to main CPAN web site
DEBPKG:fixes/time_piece_doc - Time::Piece: Improve documentation for add_months and add_years
DEBPKG:fixes/extutils_makemaker_reproducible - Make perllocal.pod files reproducible
DEBPKG:fixes/file_path_hurd_errno - File-Path: Fix test failure in Hurd due to hard-coded ENOENT
DEBPKG:debian/hppa_op_optimize_workaround - Temporarily lower the optimization of op.c on hppa due to gcc-6 problems
DEBPKG:debian/installman-utf8 - Generate man pages with UTF-8 characters
DEBPKG:fixes/file_path_chmod_race - [ #121951] Prevent directory chmod race attack.
DEBPKG:fixes/extutils_file_path_compat - Correct the order of tests of chmod(). (#294)
DEBPKG:fixes/getopt-long-2 - [ #120300] Withdraw part of commit 5d9947fb445327c7299d8beb009d609bc70066c0, which tries to implement more GNU getopt_long campatibility. GNU
DEBPKG:fixes/getopt-long-3 - provide a default value for optional arguments
DEBPKG:fixes/getopt-long-4 - [ #122068] Fix issue #122068.
DEBPKG:fixes/test-builder-reset - Reset inside subtest maintains parent
DEBPKG:debian/hppa_opmini_optimize_workaround - Lower the optimization level of opmini.c on hppa
DEBPKG:debian/sh4_op_optimize_workaround - Also lower the optimization level of op.c and opmini.c on sh4
DEBPKG:fixes/json-pp-example - [ #92793] fix RT-92793: bug in SYNOPSIS
DEBPKG:debian/perldoc-pager - [ #120229] Fix perldoc terminal escapes when sensible-pager is less
DEBPKG:debian/prune_libs - Prune the list of libraries wanted to what we actually need.
DEBPKG:debian/configure-regen - Regenerate Configure et al. after probe unit changes
DEBPKG:fixes/rename-filexp.U-phase1 - regen-configure: rename filexp.U to filexp_path.U, phase 1
DEBPKG:fixes/rename-filexp.U-phase2 - regen-configure: rename filexp.U to filexp_path.U, phase 2
DEBPKG:fixes/packaging_test_skips - Skip various tests if PERL_BUILD_PACKAGING is set
DEBPKG:debian/mod_paths - Tweak @INC ordering for Debian
DEBPKG:fixes/encode-alias-regexp - fix
DEBPKG:fixes/regex-memory-leak - [910a6a8] [perl #132892] perl #132892: avoid leak by mortalizing temporary copy of pattern
DEBPKG:fixes/CVE-2018-6797 - [perl #132227] (perl #132227) restart a node if we change to uni rules within the node and encounter a sharp S
DEBPKG:fixes/CVE-2018-6798/pt1 - [perl #132063] Heap buffer overflow
DEBPKG:fixes/CVE-2018-6798/pt2 - [perl #132063] 5.26.1: fix TRIE_READ_CHAR and DECL_TRIE_TYPE to account for non-utf8 target
DEBPKG:fixes/CVE-2018-6798/pt3 - [perl #132063] (perl #132063) we should no longer warn for this code
DEBPKG:fixes/CVE-2018-6798/pt4 - [perl #132063] utf8.c: Don't dump malformation past first NUL
DEBPKG:fixes/CVE-2018-6913 - [perl #131844] (perl #131844) fix various space calculation issues in pp_pack.c
@INC for perl 5.26.1: /home/jima/lib/perl /home/jima/perl5/lib/perl5/x86_64-linux-gnu-thread-multi /home/jima/perl5/lib/perl5/5.26.1/x86_64-linux-gnu-thread-multi /home/jima/perl5/lib/perl5/5.26.1 /home/jima/perl5/lib/perl5/x86_64-linux-gnu-thread-multi /home/jima/perl5/lib/perl5 /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /home/jima/perl5/lib/perl5/5.26.0 /home/jima/perl5/lib/perl5/5.26.0/x86_64-linux-gnu-thread-multi /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base
Environment for perl 5.26.1: HOME=/home/jima LANG=en_US.UTF-8 LANGUAGE (unset) LC_COLLATE=C LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/home/jima/.local/bin:/home/jima/perl5/bin:/bin:/home/jima/bin:/home/jima/jima_tools/x86_64/bin:/home/jima/jima_tools/bin:/usr/bin:/usr/sbin:/sbin:/usr/bin/X11:/usr/local/bin:/usr/local/sbin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin:. PERL5LIB=/home/jima/lib/perl:/home/jima/perl5/lib/perl5/x86_64-linux-gnu-thread-multi:/home/jima/perl5/lib/perl5 PERL_BADLANG (unset) PERL_LOCAL_LIB_ROOT=/home/jima/perl5 PERL_MB_OPT=--install_base /home/jima/perl5 PERL_MM_OPT=INSTALL_BASE=/home/jima/perl5 SHELL=/bin/bash
On Wed, May 2, 2018 at 9:39 PM, Jim Avera (via RT) perlbug-followup@perl.org wrote:
Suggest changing the documentation to say that this feature is deprecated, so people don't waste time writing code which will become wrong later.
Indeed this should be modified.
The RT System itself - Status changed from 'new' to 'open'
IMO this should be considered a blocker for 5.28\, as it is a documentation issue for a change in this release.
On Wed, May 2, 2018 at 12:46 PM, Leon Timmermans fawaka@gmail.com wrote:
Indeed this should be modified.
How about the attached?
On 5/2/18 9:21 PM, Tony Cook via RT wrote:
How about the attached?
Hi Tony,
Is it specifically :utf8 which will not be allowed, i.e., other layers might still be allowed on a sysread file handle in v5.30? I didn't understand the new text which discussed interactions between the :utf8 layer and other layers such as :utf16.
Does it all boil down to requiring that the file handle read raw binary octets (e.g. after binmode($fh) is called)? If so it might be better to just say the file handle must be in :raw mode rather than mention any _specific_ encoding such as utf8.
The problem isn't all layers.
The problem is specifically the way sysread etc. handle layers that have the PERLIO_K_UTF8 flag set on them.
This includes the :utf8 layer (which is currently not a real layer) and :encoding() (as the sysread documentation mentions). A hypothetical :utf16 layer would also set it, assuming it's intended to decode UTF-16 characters into perl's internal extended UTF-8 so perl can deal with them as characters.
The underlying problem is that sysread() etc. pay attention to only one part of the layer stack: whether that PERLIO_K_UTF8 flag is set. If it is, they ignore the rest, slurp in the bytes and mark them as SVf_UTF8.
With non-PERLIO_K_UTF8 layers, sysread etc. completely ignore the layers, reading (or writing) bytes from/to the underlying stream.
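The last point can be checked with a layer that does not set PERLIO_K_UTF8. The sketch below uses :crlf purely as an illustration (the choice of layer and the scratch file are assumptions, not something from the ticket): a buffered readline goes through the layer, while sysread on an identically opened handle returns the stored bytes untouched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Store two CRLF-terminated lines as raw bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "a\r\nb\r\n";
close $tmp or die "close: $!";

# Buffered read through the layer stack: :crlf turns "\r\n" into "\n".
open my $buffered, '<:crlf', $path or die "open: $!";
my $line = <$buffered>;
close $buffered;

# sysread on an identically opened handle bypasses the layer and
# returns the bytes exactly as stored.
open my $unbuffered, '<:crlf', $path or die "open: $!";
my $count = sysread($unbuffered, my $bytes, 64);
close $unbuffered;

printf "readline: %d chars, sysread: %d bytes\n", length $line, $count;
```

The first line comes back as 2 characters through readline, while sysread hands back all 6 stored bytes, carriage returns included.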
On 5/3/18 2:40 AM, Tony Cook via RT wrote:
The underlying problem is that sysread() etc pay attention to only one part of the layer stack - whether that PERLIO_K_UTF8 flag is set, at which point it ignores the rest, slurps in the bytes and marks them as SVf_UTF8.
With non-PERLIO_K_UTF8 layers sysread etc completely ignore the layers - reading (or writing) bytes from/to the underlying stream.
Hmm. That's an unfortunate complexity involving perl's internal character representation which users really shouldn't need to be aware of. I hope some solution can be found which doesn't _require_ documenting and user-understanding of this.
Is there any foreseeable path to making sysread() handle arbitrary layers correctly, using buffering when data-transforming layers are present but not otherwise? What if sysread just called fh->read() in those cases?
If buffering is used, then: If the underlying device is seekable, left-over octets in the hidden buffer should be discarded and a seek done so they will be re-read later; that would protect coherency if other cooperating processes might randomly update the file.
If the underlying source is not seekable, then left-over octets would have to stay in the hidden buffer, but that's okay because there is no way for those bytes to mutate before they are called for by the application. Note that for a tty in canonical mode, the OS will only return one line at a time, at least on *nix.
Just some uninformed ideas...
On 5/3/18 4:40 PM, Jim Avera wrote:
What if sysread just called fh->read() in those cases?
In essence, my proposal is to make sysread() a synonym for fh->read(), with two exceptions: if the underlying source is seekable, any left-over octets (not needed to satisfy LENGTH characters) would be discarded after each call and a seek done so they are re-read later; and buffering would be skipped entirely if there is no data-transforming layer on the file descriptor.
Happily, :encoding(utf8) is not data-transforming because that is perl's internal representation, so the octets can simply be put into the user's buffer and the utf8 flag set.
Even transforming decoders might often avoid left-over octets (and thus avoid the seek-back) by predicting the number of octets needed in common cases. For example, a UTF-16 decoder could read LENGTH*2 octets and that would suffice if the codepoints happened to be ASCII. More realistically, an ISO-8859-1 decoder could guess LENGTH*1 and often be right. In other words, seeking back might not be a big performance hit in practice. And any really perf-sensitive app shouldn't be using layers at all, but should sysread() a raw file handle and do its own decoding.
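The closing suggestion (sysread a raw handle and decode in the application) can be sketched as follows. The helper name slurp_utf8_via_sysread and the 3-byte chunk size are hypothetical, chosen only so that multi-byte characters get split across reads; the incremental decoding uses Encode's documented FB_QUIET mode, which leaves an undecodable tail in the source buffer.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Hypothetical helper: read $path with raw sysread calls of $chunk bytes
# and decode the bytes as UTF-8 in the application.
sub slurp_utf8_via_sysread {
    my ($path, $chunk) = @_;
    open my $fh, '<:raw', $path or die "open $path: $!";
    my ($pending, $text) = ('', '');
    while (1) {
        # Append new bytes after any undecoded tail from the last pass.
        my $n = sysread($fh, $pending, $chunk, length $pending);
        die "sysread: $!" unless defined $n;
        last if $n == 0;
        # FB_QUIET decodes as far as it can and leaves the rest
        # (for example a split multi-byte character) in $pending.
        $text .= decode('UTF-8', $pending, Encode::FB_QUIET);
    }
    close $fh;
    # Genuinely malformed input would also end up here.
    die "undecodable bytes left at end of file" if length $pending;
    return $text;
}

# Sample data: "hello world" with accented vowels and a euro sign, as UTF-8.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "h\xC3\xA9llo w\xC3\xB6rld \xE2\x82\xAC\n";
close $tmp or die "close: $!";

my $text = slurp_utf8_via_sysread($path, 3);
printf "%d characters read\n", length $text;
```

This keeps sysread on a plain byte handle, so it is unaffected by the deprecation, at the cost of the application owning the decode step.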
On Fri, May 4, 2018 at 1:40 AM, Jim Avera jim.avera@gmail.com wrote:
Is there any foreseeable path to making sysread() handle arbitrary layers correctly, using buffering when data-transforming layers are present but not otherwise?
If you want that, why wouldn't you just use read?
On 5/3/18 5:05 PM, Leon Timmermans wrote:
If you want that, why wouldn't you just use read?
Yes, but I gather there is all this complexity (desired by someone) to allow certain layers to work with sysread(). Personally I would be happy if sysread simply disallowed any layers, i.e. required a raw file handle.
On the other hand, if the app wants Unicode characters, it is convenient that perl's internal rep is utf8, so reading from a fh with :encoding(utf8) should be possible with no actual extra overhead (just setting the utf8 flag on the user's buffer). Disallowing that one case seems strange from a user perspective.
On Thu, May 3, 2018 at 8:14 PM, Jim Avera jim.avera@gmail.com wrote:
On the other hand, if the app wants Unicode characters, it is convenient that perl's internal rep is utf8, so reading from a fh with :encoding(utf8) should be possible with no actual extra overhead (just setting the utf8 flag on the user's buffer). Disallowing that one case seems strange from a user perspective.
From a user perspective, the utf8 flag should be irrelevant, and the non-strict :utf8 or :encoding(utf8) layers shouldn't be used.
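Taken together with the earlier "just use read" reply, one way to follow this advice is the buffered read() on a handle opened with the strict :encoding(UTF-8) layer. This is a minimal sketch with an assumed scratch file, not code from the ticket:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Store "hello" with an accented e, plus newline, as UTF-8 bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "h\xC3\xA9llo\n";
close $tmp or die "close: $!";

# :encoding(UTF-8) is the strict, validating layer; read() goes through
# the layer stack, so LENGTH and the return value count characters.
open my $fh, '<:encoding(UTF-8)', $path or die "open: $!";
my $got = read($fh, my $buf, 2);
close $fh;

printf "read returned %d; buffer holds %d characters\n", $got, length $buf;
```

Asking for 2 characters consumes 3 bytes from the file here, which is exactly the bookkeeping that sysread cannot do without buffering.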
On Thu, 03 May 2018 16:41:07 -0700, wrote:
Is there any foreseeable path to making sysread() handle arbitrary layers correctly, using buffering when data-transforming layers are present but not otherwise? What if sysread just called fh->read() in those cases?
Well, that would completely change the behaviour of sysread() in the case of non-UTF-8 flagged file handles that have other layers on them.
One reason for making this a deprecation warning is so we're not silently changing this behaviour.
This deprecation was originally discussed in:
On 5/6/18 4:35 PM, Tony Cook via RT wrote:
Well, that would completely change the behaviour of sysread() in the case of non-UTF-8 flagged file handles that have other layers on them. ... This deprecation was originally discussed in:
That seems to be some kind of secret or protected ticket!
RT Error No permission to display that ticket No details
I can see it as an anonymous guest (I opened a new browser).
Searching for the ticket number sent me to:
as did pasting the non-/Public/ address into the address bar.
If you still can't see it you might want to check with perlbug-admin (see the page footer) to see if something is messed up for your account.
Tony: Should the patch you proposed in this RT be applied now?
Thank you very much.
-- James E Keenan (
On Wed, 17 Oct 2018 06:13:06 -0700, jkeenan wrote:
Tony: Should the patch you proposed in this RT be applied now?
No, this ticket is obsoleted by those operators now being fatal on :utf8 handles, and by the documentation updates that change included.
On Mon, 22 Oct 2018 23:53:10 GMT, tonyc wrote:
No, this ticket is obsoleted by those operators now being fatal on :utf8 handles and the documentation updates that included.
Ok, closing.
-- James E Keenan (
@jkeenan - Status changed from 'open' to 'rejected'
Migrated from (status was 'rejected')
Searchable as RT133170$