habitat-sh / core-plans

Core Habitat Plan definitions

Stripped ld.so breaks valgrind #850

Open stevendanna opened 7 years ago

stevendanna commented 7 years ago

The current version of ld.so that we ship is stripped:

# file /hab/pkgs/core/glibc/2.22/20170513201042/lib64/ld-2.22.so
/hab/pkgs/core/glibc/2.22/20170513201042/lib64/ld-2.22.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped

This is of course expected; however, it does cause some problems. For instance, valgrind does not work on executables that use this stripped loader:

# /hab/pkgs/core/valgrind/3.13.0/20171011005408/bin/valgrind ls
==13113== Memcheck, a memory error detector
==13113== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==13113== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==13113== Command: ls
==13113==

valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      strlen
valgrind:  in an object with soname matching:   ld-linux-x86-64.so.2
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind:  Possible fixes: (1, short term): install glibc's debuginfo
valgrind:  package on this machine.  (2, longer term): ask the packagers
valgrind:  for your Linux distribution to please in future ship a non-
valgrind:  stripped ld.so (or whatever the dynamic linker .so is called)
valgrind:  that exports the above-named function using the standard
valgrind:  calling conventions for this platform.  The package you need
valgrind:  to install for fix (1) is called
valgrind:
valgrind:    On Debian, Ubuntu:                 libc6-dbg
valgrind:    On SuSE, openSuSE, Fedora, RHEL:   glibc-debuginfo
valgrind:
valgrind:  Note that if you are debugging a 32 bit process on a
valgrind:  64 bit system, you will need a corresponding 32 bit debuginfo
valgrind:  package (e.g. libc6-dbg:i386).
valgrind:
valgrind:  Cannot continue -- exiting now.  Sorry.

This may have worked in the past because of the bug fixed in habitat-sh/habitat@8e8ab126c65a7da8f7cb4b44875683f99fa91c89.

The valgrind error message points to two solutions: (1) avoid stripping ld.so, or (2) ship debugging information in a separate package, as some distributions do. Option (2) would also help tools like gdb, which know how to read debug symbols from auxiliary locations.

https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
http://valgrind.org/docs/manual/dist.readme-packagers.html
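For illustration, option (2) could follow the recipe described in the GDB documentation linked above: copy the debug information into a separate file, strip the original, then record a debug link in it. This is only a sketch under stated assumptions. The `split_debug` and `run` helpers, the `.debug` suffix, and the `DRY_RUN` switch are hypothetical names, not part of any existing Habitat plan.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: move debug info from an ELF file into "<file>.debug".
# $1 = path to the ELF binary or shared object.
split_debug() {
  local bin dbg
  bin="$1"
  dbg="${bin}.debug"   # assumed naming convention for the debug file

  # 1. Copy symbols and debug sections into the separate file.
  run objcopy --only-keep-debug "$bin" "$dbg" &&
  # 2. Strip debug sections from the shipped file (same as `strip -g`).
  run strip --strip-debug "$bin" &&
  # 3. Record the debug file's name in a .gnu_debuglink section.
  run objcopy --add-gnu-debuglink="$dbg" "$bin"
}
```

Calling `DRY_RUN=1 split_debug /path/to/ld-2.22.so` prints the three commands without touching the file, which is a cheap way to review what a plan would do before wiring it into a build. A real plan might strip more aggressively in step 2; `--strip-debug` is used here only because it mirrors the GDB manual.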

fnichol commented 6 years ago

I can buy this. Looking at what Arch Linux does, they appear to selectively strip some but not all of the ELF binaries: https://git.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/glibc#n163. They even call out not stripping ld-${pkgver}.so so as to not break gdb, valgrind, etc.

I'll see what an update might look like here…

fnichol commented 6 years ago

Here is my testing diff to the Glibc plan (currently against the base plans refresh PR):

diff --git a/glibc/plan.sh b/glibc/plan.sh
index 62c63f06..92d5b948 100644
--- a/glibc/plan.sh
+++ b/glibc/plan.sh
@@ -322,6 +322,29 @@ EOF
   popd > /dev/null
 }

+do_strip() {
+  build_line "Stripping unneeded symbols from binaries and libraries"
+  find "$pkg_prefix" -type f -perm -u+w -print0 2> /dev/null \
+    | while IFS= read -r -d '' f; do
+      case "$(basename "$f")" in
+        "ld-${pkg_version}.so"|\
+        "libc-${pkg_version}.so"|\
+        "libpthread-${pkg_version}.so"|\
+        libpthread_db-1.0.so)
+          build_line "Skipping strip for $f"
+          continue
+          ;;
+      esac
+
+      case "$(file -bi "$f")" in
+        *application/x-executable*) strip --strip-all "$f";;
+        *application/x-sharedlib*) strip --strip-unneeded "$f";;
+        *application/x-archive*) strip --strip-debug "$f";;
+        *) continue;;
+      esac
+    done
+}
+
 do_end() {
   # Clean up the `pwd` link, if we set it up.
   if [[ -n "$_clean_pwd" ]]; then
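The skip list in the `do_strip()` above can be tried out in isolation. The sketch below pulls the first `case` statement into a standalone function so the matching can be checked without running a build. The name `should_skip_strip` and its two-argument form are assumptions made for illustration; in the plan itself the version comes from `$pkg_version`.

```shell
# Hypothetical helper mirroring the skip list in the diff above.
# $1 = path to a file, $2 = glibc version (pkg_version in the plan).
# Returns 0 when the file should be left unstripped, 1 otherwise.
should_skip_strip() {
  case "$(basename "$1")" in
    "ld-$2.so"|"libc-$2.so"|"libpthread-$2.so"|libpthread_db-1.0.so)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example: only the first file is on the skip list.
for f in /tmp/lib64/ld-2.22.so /tmp/lib64/libm-2.22.so; do
  if should_skip_strip "$f" 2.22; then
    echo "skip  $f"
  else
    echo "strip $f"
  fi
done
```

Matching on the basename with the version embedded means the list follows the package version automatically, but a file such as `ld-2.23.so` would not match when the version passed in is `2.22`.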

bdangit commented 6 years ago

@fnichol, is the patch you posted the only change we would need in order to get core/glibc to support valgrind?

bdangit commented 5 years ago

Submitted #2202 so we can get some traction on this. It would be nice to have, although something in the Arch Linux plan appears to make stripping the shared objects optional. Perhaps we should have a glibc-debug package that includes non-stripped shared objects, alongside glibc, which ships stripped objects.