Perl / perl5

Problem with unicode in Getopt::Long HelpMessage() in perl 5.38 #21841

Closed bessarabov closed 3 weeks ago

bessarabov commented 4 months ago

Description

We have observed that the behavior changed starting with version 5.38.0.

We are using official docker images of perl.

Here is the test script, which I saved as not_working.pl:

use utf8;
use strict;
use warnings;
use feature qw(say);
use open qw(:std :utf8);

use Getopt::Long qw(GetOptions HelpMessage);

say 'привет';

HelpMessage();

__END__

=encoding utf-8

=head1 USAGE

你好

On the docker image perl:5.36.3 (the latest 5.36 image) it works as expected:

$ docker run --rm -it -v `pwd`:/app/ perl:5.36.3 perl /app/not_working.pl
привет
Usage:
    你好

But starting from perl:5.38.0 it does not work. Instead of 你好 it outputs garbage:

$ docker run --rm -it -v `pwd`:/app/ perl:5.38.0 perl /app/not_working.pl
привет
Usage:
    你好

The problem still persists on perl:5.38.2 (the latest 5.38.*):

$ docker run --rm -it -v `pwd`:/app/ perl:5.38.2 perl /app/not_working.pl
привет
Usage:
    你好

Steps to Reproduce

Please see the previous section for the exact steps.

Expected behavior

I expect that the output of the script running in Perl version 5.38.2 should be identical to that in Perl version 5.36.3. However, the output differs.

Perl configuration

I'm using the official docker image, but here it is:

$ docker run --rm -it perl:5.38.2 perl -V
Summary of my perl5 (revision 5 version 38 subversion 2) configuration:

  Platform:
    osname=linux
    osvers=6.2.0-1018-azure
    archname=x86_64-linux-gnu
    uname='linux buildkitsandbox 6.2.0-1018-azure #18~22.04.1-ubuntu smp tue nov 21 19:25:02 utc 2023 x86_64 gnulinux '
    config_args='-Darchname=x86_64-linux-gnu -Duse64bitall -Duseshrplib -Dvendorprefix=/usr/local -des'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-O2'
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='12.2.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/lib/x86_64-linux-gnu/libc.so.6
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.36'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/usr/local/lib/perl5/5.38.2/x86_64-linux-gnu/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl):
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under linux
  Compiled at Jan 12 2024 00:32:56
  @INC:
    /usr/local/lib/perl5/site_perl/5.38.2/x86_64-linux-gnu
    /usr/local/lib/perl5/site_perl/5.38.2
    /usr/local/lib/perl5/vendor_perl/5.38.2/x86_64-linux-gnu
    /usr/local/lib/perl5/vendor_perl/5.38.2
    /usr/local/lib/perl5/5.38.2/x86_64-linux-gnu
    /usr/local/lib/perl5/5.38.2
$

zhmylove commented 4 months ago

This bug is annoying, right

xenu commented 4 months ago

Downgrading podlators to 4.14 restores the old behaviour.

I think this is the same issue: https://github.com/rra/podlators/issues/25

xenu commented 4 months ago

Found a simpler workaround than downgrading: add use PerlIO; to your code.

This is because Pod::Text code does this:

    eval {
        my @options = (output => 1, details => 1);
        my $flag = (PerlIO::get_layers ($$self{output_fh}, @options))[-1];
        if ($flag && ($flag & PerlIO::F_UTF8 ())) {
            $$self{ENCODE} = 0;
        }
    };

which throws a quietly discarded exception: Undefined subroutine &PerlIO::F_UTF8 called at /home/xenu/git/podlators/blib/lib/Pod/Text.pm line 368.
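
The failure mode can be reproduced outside Pod::Text with a minimal sketch. It assumes nothing else in the process has loaded PerlIO.pm. `utf8_layer_check` is a hypothetical helper written for this illustration, not part of podlators. It runs the same check but returns the `eval` error instead of discarding it. `PerlIO::get_layers` is built into the interpreter, while `PerlIO::F_UTF8` is defined in PerlIO.pm, so the check should only succeed once that module is loaded:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Pod::Text: it runs the same layer check
# as the snippet above but returns the eval error instead of discarding it.
sub utf8_layer_check {
    my ($fh) = @_;
    my $is_utf8 = 0;
    my $ok = eval {
        my @options = (output => 1, details => 1);
        my $flag = (PerlIO::get_layers($fh, @options))[-1];
        $is_utf8 = 1 if $flag && ($flag & PerlIO::F_UTF8());
        1;
    };
    my $error = $ok ? '' : $@;
    return ($is_utf8, $error);
}

# Same kind of layer that "use open qw(:std :utf8)" puts on STDOUT.
binmode STDOUT, ':utf8';

# First attempt: this script has not loaded PerlIO.pm yet.
my ($utf8_before, $error_before) = utf8_layer_check(\*STDOUT);

# Second attempt: load PerlIO.pm at run time, which defines F_UTF8.
require PerlIO;
my ($utf8_after, $error_after) = utf8_layer_check(\*STDOUT);

print "before: utf8=$utf8_before error=", ($error_before || "none\n");
print "after:  utf8=$utf8_after error=", ($error_after || "none\n");
```

If the first call reports the "Undefined subroutine" error and the second reports a UTF-8 layer, that matches the explanation above: without PerlIO.pm the layer check dies inside the `eval`, `$$self{ENCODE}` is never cleared, and the output is presumably encoded a second time.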

bessarabov commented 3 months ago

@xenu Huge thank you for investigating this issue!

You are absolutely right — in perl 5.38 there is a change:

podlators has been upgraded from version 4.14 to 5.01

(this is the quote from https://perldoc.perl.org/5.38.0/perldelta )

My issue is completely solved. I've updated the base images that I use for 5.38. My images are based on 2 official perl docker images; here are the relevant fragments of my Dockerfiles:

FROM perl:5.38.2-bullseye

RUN cpanm Pod::Man@4.14

# ... all other docker steps

FROM perl:5.38.2

RUN cpanm Pod::Man@4.14

# ... all other docker steps

For me the problem is solved, but I'm not sure what I should do with this GitHub issue.

Perhaps I can close it right now, or perhaps it should be closed only once there are official perl docker images that don't have this problem (this can be achieved by using version 4.14, or by waiting for a new Pod::Man version that fixes the issue; quite possibly this will be addressed in https://github.com/rra/podlators/issues/25 )

jkeenan commented 3 months ago

Bisection confirms that this problem entered the Perl 5 core distribution with this commit in December 2022:

commit 20616d517ad4726a85ee27750db6e24443343601 (HEAD, refs/bisect/bad)
Author:     Russ Allbery <rra@stanford.edu>
AuthorDate: Tue Dec 6 22:40:57 2022 +0000
Commit:     James E Keenan <jkeenan@cpan.org>
CommitDate: Fri Dec 9 10:43:28 2022 -0500

    Update podlators to CPAN version 5.00

However, that commit was quite large, so simply identifying the offending commit does not, in and of itself, identify the problem or the remedy. Further research needed (which will first have to be tested upstream).

Note: Perl 5 Porters does not, to my knowledge, maintain any "official Docker images." We only maintain the core distribution, for which we issue monthly developmental releases, an annual production release (around May 20), and maintenance releases as needed. So you'll have to look elsewhere for Docker images.

tonycoz commented 3 months ago

> Note: Perl 5 Porters does not, to my knowledge, maintain any "official Docker images." We only maintain the core distribution, for which we issue monthly developmental releases, an annual production release (around May 20), and maintenance releases as needed. So you'll have to look elsewhere for Docker images.

They aren't maintained directly by p5p, but docker images are maintained under this organization:

https://github.com/Perl/docker-perl https://hub.docker.com/_/perl

khwilliamson commented 4 weeks ago

Is this closable?

zhmylove commented 4 weeks ago

I do not think so: cpan/podlators still contains the issue on blead, ref: https://github.com/Perl/perl5/blob/67164c39687a916908b6322d361cdf3d86df59ac/cpan/podlators/lib/Pod/Text.pm#L365

It probably should be updated

zhmylove commented 3 weeks ago

@reneeb what do you think? You were the last person to update podlators.

This module should be either updated or patched. The CPAN version is 5.01, while the mainline in the upstream repo already contains the fix. Maybe we should ask @rra to release the next version and upload it to CPAN?

rra commented 3 weeks ago

The next release of podlators will be a major version bump with quite a lot of changes, and we're outside the merge window for dual-life modules.

I think the best strategy for fixing this bug in Perl 5.40, if this is possible (I'm not deeply familiar with the core rules about dual-life Perl modules), would be to cherry-pick https://github.com/rra/podlators/commit/16670945b10ce7f0ee6fa215598913db5da115f9 into core. That's a very small, low-impact change that only fixes this bug, and then the divergence can be dropped after the release freeze when the next major version of podlators is imported.

I could also make a patch release if that's necessary, but I will hopefully release podlators v6.0.0 within the next week or two and have so far avoided needing to juggle patch releases.

xenu commented 3 weeks ago

IMO this should be cherry-picked, it's a trivial one-line change and it fixes a real issue.

However, at this stage I think this needs a PSC decision.

I'm marking this as a blocker.

xenu commented 3 weeks ago

PR with a cherry-pick: https://github.com/Perl/perl5/pull/22165