mquinson / po4a

Maintain the translations of your documentation with ease (PO for anything)
http://po4a.org/
GNU General Public License v2.0

Error while building v0.71 on openSUSE #495

elchevive closed 4 months ago

elchevive commented 5 months ago

Hi,

I'm trying to update po4a from v0.69 to v0.71 on openSUSE, but on the openSUSE versions with older software (Leap 15.5 and the soon-to-be-released 15.6) I get this error message:

[   13s] Discard blib/man/ca/man3/Locale::Po4a::Xhtml.3pm.pod (13 of 21 strings; only 61.9% translated; need 80%).
[   13s] Malformed encoding while writing to file /home/abuild/rpmbuild/BUILD/po4a-0.71/blib/man/ca/man3/Locale::Po4a::Xml.3pm.pod with charset UTF-8: "\x{00c3}" does not map to utf8 at lib/Locale/Po4a/TransTractor.pm line 513.
[   13s] Close with partial character at lib/Locale/Po4a/TransTractor.pm line 526.
[   13s] Died at Po4aBuilder.pm line 191.
[   13s] error: Bad exit status from /var/tmp/rpm-tmp.QnsSUO (%build)

On Tumbleweed (the rolling release) it builds. The only difference I see is the Perl version: 5.26 on Leap and 5.38 on Tumbleweed.

You can see my attempts on this repository:

https://build.opensuse.org/package/show/home:elchevive:branches:devel:languages:perl/po4a

Regards

mquinson commented 5 months ago

This is really weird. We require Perl 5.12 since po4a v0.70 because we changed the way we handle UTF files. But I still fail to understand why it would be an issue on Perl 5.26.

Could you give me the output of file po/pod/ca.po and of grep Encoding po/pod/ca.po please?
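
Note that `grep Encoding` only matches the Content-Transfer-Encoding line; the declared charset normally sits in the Content-Type line of the PO header. A minimal sketch of how to pull it out follows. The `po_charset` helper and the sample header are hypothetical, made up for illustration and not taken from po4a or from ca.po.

```shell
#!/bin/sh
# Hypothetical helper: print the charset declared in a PO file's header.
po_charset() {
    # The header entry usually contains a line such as:
    #   "Content-Type: text/plain; charset=UTF-8\n"
    grep -i -m 1 '^"Content-Type:' "$1" |
        sed -e 's/.*charset=//' -e 's/\\n"$//'
}

# Throwaway sample header (invented for this sketch, not the real ca.po).
sample=$(mktemp)
cat > "$sample" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
EOF

po_charset "$sample"
rm -f "$sample"
```

Against the real tree this would be run as `po_charset po/pod/ca.po`.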

elchevive commented 5 months ago

Hi,

[    8s] + /usr/bin/mkdir /home/abuild/rpmbuild/BUILDROOT/po4a-0.71-150500.105.1.x86_64
[    8s] + cd po4a-0.71
[    8s] + file po/pod/ca.po
[    8s] po/pod/ca.po: GNU gettext message catalogue, UTF-8 Unicode text, with very long lines
[    8s] + grep Encoding po/pod/ca.po
[    8s] "Content-Transfer-Encoding: 8bit\n"
[    8s] + perl Build.PL installdirs=vendor
...
[   13s] Discard blib/man/ca/man3/Locale::Po4a::Xhtml.3pm.pod (13 of 21 strings; only 61.9% translated; need 80%).
[   13s] Malformed encoding while writing to file /home/abuild/rpmbuild/BUILD/po4a-0.71/blib/man/ca/man3/Locale::Po4a::Xml.3pm.pod with charset UTF-8: "\x{00c3}" does not map to utf8 at lib/Locale/Po4a/TransTractor.pm line 513.
[   13s] Close with partial character at lib/Locale/Po4a/TransTractor.pm line 526.
[   13s] Died at Po4aBuilder.pm line 191.
[   14s] error: Bad exit status from /var/tmp/rpm-tmp.GZEM2g (%build)

Gastonia02 commented 5 months ago

If that helps, I get the exact same issue on Ubuntu 20.04 with Perl 5.30.0, and these commands show the same results, except that instead of "\x{00c3}" I have "\x{fffd}".

I tried version 0.70 of po4a and got the same problem:

$ ./Build
Created META.yml and META.json
"\x{fffd}" does not map to UTF-8 at lib/Locale/Po4a/Po.pm line 613.
Close with partial character at lib/Locale/Po4a/Po.pm line 613.
Died at Po4aBuilder.pm line 169.

Using the commands you requested (file and grep), I get the same output, except that the ", with very long lines" suffix no longer appears.

elchevive commented 5 months ago

Hi,

As an exercise I updated the Leap 15.5 Perl packages to 5.38 and po4a compiles successfully, so it's something that changed in Perl between 5.30 (as mentioned by Gastonia02) and 5.38.

mquinson commented 5 months ago

Thanks @elchevive, that's valuable information. Is there any chance of finding the precise version of Perl for which po4a starts to fail?

I started reading the perldelta of each version between 5.30 and 5.38, but that's actually quite a lot of changes.

rwmjones commented 5 months ago

Do we think this is the same bug as https://github.com/mquinson/po4a/issues/494 ?

elchevive commented 5 months ago

Hi,

Further testing shows that some change between 5.33.6 (not working) and 5.33.7 (working) is likely the culprit.

mquinson commented 4 months ago

The diff between the two versions regarding PerlIO encoding seems to be the following: https://metacpan.org/release/ATOOMIC/perl-5.33.8/diff/HYDAHY%2Fperl-5.33.6/ext/PerlIO-encoding/encoding.pm The fallback setting does not contain Encode::STOP_AT_PARTIAL() anymore. Further digging underway.

The full list of changes between 5.33.6 and 5.33.7 (the perldelta) is here: https://metacpan.org/release/RENEEB/perl-5.33.7/view/pod/perldelta.pod
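
To compare machines, it may help to check which flags a given Perl actually uses by default. The sketch below wraps a Perl one-liner that prints the interpreter version, the numeric value of `$PerlIO::encoding::fallback`, and whether the `Encode::STOP_AT_PARTIAL()` bit is set. The `report_fallback` name is hypothetical, and the output is only a diagnostic hint, not a confirmed explanation of the failure.

```shell
#!/bin/sh
# Hypothetical helper: report the default PerlIO encoding fallback flags.
# Prints "unavailable" if perl, Encode or PerlIO::encoding cannot be loaded.
report_fallback() {
    perl -MEncode -MPerlIO::encoding -e '
        my $fb = $PerlIO::encoding::fallback;
        printf "perl=%vd fallback=0x%04x stop_at_partial=%s\n",
            $^V, $fb, ($fb & Encode::STOP_AT_PARTIAL()) ? "yes" : "no";
    ' 2>/dev/null || echo "unavailable"
}

report_fallback
```

Running it on both a failing and a working system would show whether the two defaults really differ.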

mquinson commented 4 months ago

I cannot reproduce the error :( Could someone test whether the following patch helps? Alternatively, the commented-out line could be used instead of the uncommented one.

--- a/lib/Locale/Po4a/TransTractor.pm
+++ b/lib/Locale/Po4a/TransTractor.pm
@@ -504,6 +504,8 @@ sub write {
             File::Path::mkpath( $dir, 0, 0755 )    # Croaks on error
               if ( length($dir) && !-e $dir );
         }
+        $PerlIO::encoding::fallback = FB_CROAK;
+        # $PerlIO::encoding::fallback = Encode::PERLQQ()|Encode::WARN_ON_ERR()|Encode::ONLY_PRAGMA_WARNINGS();
         open( $fh, ">:encoding($charset)", $filename )
           or croak wrap_msg( dgettext( "po4a", "Cannot write to %s: %s" ), $filename, $! );
     }

rwmjones commented 4 months ago

If https://github.com/mquinson/po4a/issues/494 is truly a duplicate of this bug, then no, neither of those lines fixed it.

The reproducer (of 494) is easy to test locally: just download ja.po and customize-synopsis.pod from the links given, and run:

PERLLIB=$PWD/lib ./po4a-translate -f pod -M utf-8 -L utf-8 -k 0 -m customize-synopsis.pod -p ja.po -l out -v -d

mquinson commented 4 months ago

Do we think this is the same bug as #494 ?

Nope, I don't think it's the same. I think that #495 is about partial characters being reported as an error in Perl 5.33 but not in more recent versions, while #494 was about an eval block returning false even in the absence of an error.

Another clue that it's not the same is that #495 shows the error message "Close with partial character at lib/Locale/Po4a/TransTractor.pm line 526", while #494 does not show anything before dying ("unknown error").

And a final clue: I was able to reproduce (and fix) #494, while I'm still trying to reproduce #495.
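
For readers unfamiliar with what a "partial character" is, here is a small illustration outside of po4a. It assumes an iconv that rejects incomplete input, as GNU iconv does; `is_valid_utf8` is a hypothetical helper name. The byte 0xC3 is the lead byte of many two-byte UTF-8 sequences, which is at least consistent with the "\x{00c3}" in the original report, but that link is an assumption and not something established in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed if stdin is well-formed UTF-8.
# Assumes iconv fails on invalid or incomplete sequences (GNU iconv does).
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "é" is the two bytes 0xC3 0xA9 in UTF-8 (octal 303 251).
printf '\303\251' | od -An -tx1

# A complete sequence passes; one cut after the lead byte does not.
printf 'caf\303\251' | is_valid_utf8 && echo "complete sequence: valid"
printf 'caf\303'     | is_valid_utf8 || echo "truncated sequence: rejected"
```

A stream that ends right after the 0xC3 lead byte is the kind of input a strict encoder can report as a partial character when the file is closed.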

xloem commented 1 month ago

I'm still getting these errors on an old system running RHEL 7.1, Perl 5.16.

v0.70: "\x{fffd}" does not map to UTF-8 at lib/Locale/Po4a/Po.pm line 613.

v0.73-17-g76a463e5:

Malformed encoding while writing to file /shared/src/po4a/blib/man/ca/man3/Locale::Po4a::Xml.3pm.pod with charset UTF-8: "\x{fffd}" does not map to UTF-8 at lib/Locale/Po4a/TransTractor.pm line 544.
If UTF-8 is not the expected charset, you need to configure the right one with with --localized-charset or other similar flags.
Close with partial character at lib/Locale/Po4a/TransTractor.pm line 568.

v0.69 works

I bisected the failure to the merges around 15abd24f3802071c0560f5b4e87ee75ce8fde0c7. 15abd24f3802071c0560f5b4e87ee75ce8fde0c7^ succeeds, whereas b2333d54845976f4804f024b5cea61db16fb4f36 fails. 15abd24f3802071c0560f5b4e87ee75ce8fde0c7 itself actually gives me a different error message:

po4a::xml: The file declares ISO-8859-1 as encoding, but you provided UTF-8 as master charset. Please change either setting.                                  
 at po4a line 1624.