rschupp / PAR-Packer

(perl) Generate stand-alone executables, perl scripts and PAR files https://metacpan.org/pod/PAR::Packer

rs6000_71 (AIX) "An offset in the .loader section header is too large." #62

Closed GlennWood closed 2 years ago

GlennWood commented 2 years ago

We recently rebuilt our Perl-5.32.1 for rs6000_71 (AIX) using an updated compiler there. PAR::Packer, built and installed with this bin/perl, creates executables that do not run; instead they report:

Could not load program pp_rs6000_test
An offset in the .loader section header is too large.
Examine the .loader section header with the 'dump -Hv' command.

We've dumped the header, and it looks pretty damaged. Comparing it with the headers of earlier outputs that run correctly, a few offsets are slightly off, and one is way off (it looks like a high bit got set). Our hypothesis is that the new compiler creates a binary with slightly different object properties that PAR::Packer mis-identifies.

We can share with you these headers, binary diffs, and other information, but what we really want is help in locating where in the PAR::Packer code the header of the output is calculated - then we could localize the issue better to figure this out. We imagine that rs6000_71 is a platform few have handy to track this down with; we have one handy and might figure this out with some help.

Thank you.

Attached is a "Hello, World" example that replicates this problem on rs6000_71: pp_rs6000_test.tgz

rschupp commented 2 years ago

Sorry, I don't understand what pp_rs6000_test.ksh tries to do... Anyway, the most basic hello-world test is:

pp -o hello -E 'say "hello, world"'
./hello

Does this produce the same error as in your report?

> we really want is help in locating where in the PAR::Packer code the header of the output is calculated

By "header", do you mean some part of the executable file pp_rs6000_test (in whatever executable file format AIX uses)? PAR::Packer does not manipulate that. (Full disclosure: it does on Windows, but on no other platform.)

hello is just boot.exe with "some stuff" tacked on, but that stuff is not registered with the executable format headers. But that has never been a problem...

GlennWood commented 2 years ago

Your suggestion has led me to a minimal-viable-test. It seems the issue stems from the --clean option.

#!/bin/ksh

# USAGE: 
#   ./pp_rs6000_test.ksh [ --clean ]
#
# Builds correctly without --clean, fails with --clean

# PERLROOT/lib is where PAR::Packer has been installed via cpan
PERLROOT="../perl5lib/perl-5.32.1/rs6000_71"
export PERL5LIB=$PERLROOT/lib/perl5

/bin/rm -f hello

echo "pp to create hello executable $1"
$PERLROOT/bin/pp -o hello -E 'say "hello, world"' $1

./hello

Doesn't make sense, does it? But here it is:

bash-4.4$ ./pp_rs6000_test.ksh 
pp to create hello executable 
hello, world
bash-4.4$ ./pp_rs6000_test.ksh --clean
pp to create hello executable --clean
Could not load program ./hello:
        An offset in the .loader section header is too large.
Examine the .loader section header with the 'dump -Hv' command.

rschupp commented 2 years ago

Bingo! I see what's going on. pp does patch myldr/boot before tacking on stuff - but only if you're using --clean. The modified bytes are all inside char arrays (but no bytes are added or deleted). Usually (e.g. on Linux or Windows) that doesn't make the executable unusable, but AIX seems to notice.

Out of curiosity, can you run

cmp -l your_hello_from_above your_PAR_Packer_build_dir/myldr/boot

For me (Linux) this shows 3 groups of 11 bytes each that are different (before the tacked on stuff starts).

pp does that because the generated executable somehow has to remember that it was produced with --clean.

I have to come up with another way to store that information - stay tuned!
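To make the idea of a same-length patch concrete, here is a minimal sketch using stand-in files. It is an illustration only, not PAR::Packer's code: the file names, the `PLACEHOLDER` marker, and the helper `patch_in_place` are all hypothetical. It overwrites 11 bytes in a copy of a file without changing its size, then uses `cmp -l` to list the bytes that differ.

```shell
#!/bin/sh
# Illustration only: stand-in files, not PAR::Packer's real boot loader.

# patch_in_place FILE OFFSET STRING (hypothetical helper)
# Overwrites bytes starting at OFFSET; conv=notrunc keeps the file length.
patch_in_place() {
    printf '%s' "$3" | dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

demo_dir=$(mktemp -d)

# A fake "boot" file with an 11-byte placeholder at offset 8.
printf '%s' 'HEADER__PLACEHOLDER__TRAILER' > "$demo_dir/boot"
cp "$demo_dir/boot" "$demo_dir/hello"

# Replace the placeholder with 11 other bytes; nothing is added or removed.
patch_in_place "$demo_dir/hello" 8 'XXXXXXXXXXX'

# cmp -l prints one line per differing byte: the 1-based offset in decimal,
# followed by the two byte values in octal. It exits non-zero on a difference.
cmp -l "$demo_dir/hello" "$demo_dir/boot" || true
```

Both files keep the same size, so a loader that only checks lengths would not notice; a loader that validates the patched region against its headers might.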

rschupp commented 2 years ago

Please rebuild and check PAR::Packer from branch fix-pp-clean.

GlennWood commented 2 years ago

Congratulations! It worked! I owe you. We want to officially integrate this into our build process; when will we see an official version of this? I've already got PAR-Packer-1.055.tar.gz, which is all we need for our build process. Thank you.

GlennWood commented 2 years ago

Well, this is disturbing, especially as it happens only infrequently (about 1-in-5 times):

Failed to execute temporary parl (class PAR::StrippedPARL::Static) in file '/tmp/parlJynz': Not a typewriter 
       at /net/dsnetapp11svm01/sandboxes/NBCHECK/2764833/nbcheck/perl5lib/perl-5.32.1/rs6000_71/lib/perl5/PAR/StrippedPARL/Base.pm line 77,  line 1.
/net/dsnetapp11svm01/sandboxes/NBCHECK/2764833/nbcheck/perl5lib/perl-5.32.1/rs6000_71/bin/pp: Failed to extract a parl from 'PAR::StrippedPARL::Static' to file '/tmp/parl4wdDRKn' 
       at /net/dsnetapp11svm01/sandboxes/NBCHECK/2764833/nbcheck/perl5lib/perl-5.32.1/rs6000_71/lib/perl5/PAR/Packer.pm line 1216,  line 1.

Any ideas? I'll try to narrow it down, but it does seem random at this point.

rschupp commented 2 years ago

> Congratulations! It worked!

Thanks.

> We want to officially integrate this into our build process; when will we see an official version of this?

If nothing else comes up, I'll make a release over the weekend.

> Well, this is disturbing, especially as it happens only infrequently (about 1-in-5 times),
>
> Failed to execute temporary parl (class PAR::StrippedPARL::Static) in file '/tmp/parlJynz': Not a typewriter 
>        at /net/dsnetapp11svm01/sandboxes/NBCHECK/2764833/nbcheck/perl5lib/perl-5.32.1/rs6000_71/lib/perl5/PAR/StrippedPARL/Base.pm line 77,  line 1.

No clue. What's happening here is: pp extracts a (uuencoded) blob from .../PAR/StrippedPARL/Static.pm, writes it to a temp file (the one named in the message), makes the file executable (0755), and tries to run it (the system() call on line 75 of .../PAR/StrippedPARL/Base.pm). Unfortunately, the temp file is deleted when pp exits.

You could add $? to the warn() message to see what system() actually returned ($! might be misleading). If you do that in the installed version of .../PAR/StrippedPARL/Base.pm, you don't have to rebuild and reinstall PAR::Packer. You could also remove the UNLINK => 1 from the call of File::Temp::tempfile() a couple of lines back, so that the temp file isn't deleted. Then try to reproduce the problem and inspect the temp file (permissions?). It should be harmless to execute it (it will only report missing arguments).
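Once the temp file survives, a small shell helper can do the inspection in one step. The sketch below is only an illustration: `inspect_parl` is a hypothetical name, and the path you pass is whatever temp file pp named in its message. Note that it interprets the shell's exit status, which is not the same encoding as Perl's `$?`: in Perl, `$? >> 8` is the exit code and `$? & 127` is the signal number, and `$!` is only meaningful when system() returns -1.

```shell
#!/bin/sh
# Illustration only: inspect_parl is a hypothetical helper, not part of PAR::Packer.

# inspect_parl FILE
# Shows the file's permissions, runs it without arguments, and interprets
# the exit status using the usual POSIX shell conventions.
inspect_parl() {
    f=$1
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    ls -l "$f"
    rc=0
    "$f" || rc=$?
    case $rc in
        126) echo "exit status 126: found but could not be executed" ;;
        127) echo "exit status 127: command not found" ;;
        *)
            if [ "$rc" -gt 128 ]; then
                echo "exit status $rc: likely killed by signal $((rc - 128))"
            else
                echo "exit status $rc"
            fi
            ;;
    esac
}
```

For example, `inspect_parl /tmp/parlJynz` (substituting the name from your own error message) would show whether the file is present, whether the mode really is 0755, and whether the shell can start it at all.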

But it's most likely that the problem happens in the bowels of perl's built-in function system(). If AIX has something like the Linux or BSD utility strace to trace the system calls of a running process, you could run pp under it to see what's going on.
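AIX does ship a system call tracer, named truss. The sketch below picks whichever of truss or strace is installed and runs a command under it. The helper names `pick_tracer` and `trace_cmd` are hypothetical, and the sketch assumes both tools accept `-f` (follow child processes) and `-o FILE` (write the trace to a file); check the man page on your system before relying on it.

```shell
#!/bin/sh
# Illustration only: pick_tracer and trace_cmd are hypothetical helpers.

# pick_tracer: print the first available tracer name, or "none".
pick_tracer() {
    for t in truss strace; do
        if command -v "$t" >/dev/null 2>&1; then
            echo "$t"
            return 0
        fi
    done
    echo none
    return 1
}

# trace_cmd OUTFILE COMMAND [ARGS...]
# Runs COMMAND under the tracer, following children, logging to OUTFILE.
# Assumes the tracer understands -f and -o (true for strace; verify for truss).
trace_cmd() {
    out=$1
    shift
    tracer=$(pick_tracer) || { echo "no truss or strace found" >&2; return 1; }
    "$tracer" -f -o "$out" "$@"
}
```

A run such as `trace_cmd pp.trace pp -o hello -E 'say "hello, world"'` would leave the trace in pp.trace; searching it for the exec call on the temporary parl file should show which system call fails and with what error.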

GlennWood commented 2 years ago

UPDATE: This is a significant improvement; it solves the "bad-header" problem, but I've found another issue, namely "Not a typewriter", on rs6000 only. I will investigate and report further if I can't solve it myself.

rschupp commented 2 years ago

@GlennWood I just released 1.055 to CPAN. If the "Not a typewriter" problem persists, please open a new issue.