ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
192 stars 32 forks source link

Fix undefined symbol build error (re: e9278fa) #768

Closed JohnoKing closed 4 months ago

JohnoKing commented 4 months ago

Commit e9278fa exposed a build error that can occur when mamake attempts to use ksh as the shell for invoking exec scripts:

# src/lib/libast/Mamfile: 4796-4801: make ${INSTALLROOT}/dyn/lib/libast.so
+ LDFLAGS='  -Wl,-no-as-needed'
+ export LDFLAGS
+ sed 1d ast.req
+ dylink -m ast -v 6.0 -p lib -s .so state.o opendir.o --redacted object file list--
[dylink] + cc -Wl,-no-as-needed -o /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/dyn/lib/libast.so.6.0 --redacted--
# src/lib/libast/Mamfile: 4809-4811: make ${INSTALLROOT}/lib/mam
sh: symbol lookup error: /usr/lib/libcmd.so.2: undefined symbol: _ast_getpgrp
mamake [lib/libast]: *** exit code 127 making /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/mam
mamake: *** exit code 1 making lib/libast
mamake: *** exit code 1 making all
package: make failed at Sun Jul 21 16:16:04 PDT 2024 in /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64

This error is caused by mamake using the system's installed ksh binary, which is dynamically linked to its own libast.so. bin/package exports LD_LIBRARY_PATH and mamake inherits it, which causes ksh to try to use the freshly built libast.so produced by dylink. https://github.com/ksh93/ksh/blob/6f4ba06b8c8a5b8a8408be6b9c37882c80972eec/bin/package#L2072-L2074 The new libast.so is binary incompatible with the system's libcmd.so and by extension /bin/ksh, so the build fails with an undefined symbol error.

src/cmd/INIT/mamake.c: - After forking a child process for the exec script, unset LD_LIBRARY_PATH before invoking the shell, as there is no guarantee the shell isn't an older release of ksh incompatible with a freshly built libast.so.

McDutchie commented 4 months ago

Thanks for catching that!

I'm going to have to think about ABI stability at some point, but now is not yet that time. Not looking forward to that. :/

It looks like you had an older dynamically linked ksh found in $PATH link against the just-built libraries in arch/*/dyn/lib instead of the actually installed libraries. This is a problem. It should not happen during the build stage.

Just unsetting LD_LIBRARY_PATH is not going to be enough in any case, because other systems use different variables (e.g., macOS uses DYLD_LIBRARY_PATH).

About four years ago, in e08ca80d15baa21e20ec0c093fb5f02843b7fc14, I fixed package to not use a ksh from arch/*/bin that I just built as the shell to run the build scripts, because (obviously) it may well be broken while I'm tinkering with it, and that was sometimes causing me build failures.

Now that we're building dynamic libraries again, it follows that LD_LIBRARY_PATH and friends should not be set during the build stage either; they are not necessary for compiling and linking, only for executing. So they should be set for package test and package use, but not for package make.

I think fixing package to that effect would be a better approach.

McDutchie commented 4 months ago

This patch should do that. Could you test it? (I've not yet been able to reproduce the build failure myself.)

diff --git a/bin/package b/bin/package
index 2b1b3cd3..1e60b26b 100755
--- a/bin/package
+++ b/bin/package
@@ -2077,6 +2077,16 @@ case $x in
                                \"
                        done
                "
+
+               # allow loading dynamic libraries from $INSTALLROOT/dyn/lib
+               # by setting library path variables for various systems;
+               # instead of bothering to detect the system, just set them all
+
+               case $action in
+               make)   # ... but not while building; otherwise, if $SHELL is a dynamically
+                       # linked ksh binary, it may link against our preinstalled libraries
+                       ;;
+               *)      # <<< outdent
                case $LD_LIBRARY_PATH in
                '')     ;;
                *)      for d in $lib
@@ -2152,6 +2162,9 @@ case $x in
                        export LIBRARY_PATH
                        ;;
                esac
+                       # >>> indent
+                       ;;
+               esac

                # now set up PATH
                #
JohnoKing commented 4 months ago

The initial patch to bin/package fails to compile as it misses a prior usage of export LD_LIBRARY at L2073. https://github.com/ksh93/ksh/blob/6f4ba06b8c8a5b8a8408be6b9c37882c80972eec/bin/package#L2063-L2076 I have pushed an altered version that also avoids that export in bin/package and compiles successfully.

McDutchie commented 4 months ago

Thanks. Looks like there's more cleanup to do here, actually. We do not need to support IRIX (sgi.mips) any more.

McDutchie commented 4 months ago

There, that should sort it. Thanks for the report and the testing. Please let me know if you still encounter problems.