sgerrand / alpine-pkg-glibc

A glibc compatibility layer package for Alpine Linux

Fixed ld-linux link location to /lib #36

Closed dennybaa closed 7 years ago

dennybaa commented 7 years ago

64-bit systems use /lib, I suppose (correct me if I'm wrong); moreover, shared libraries linked against ld-linux-x86-64.so.2 fail to load without it.

frol commented 7 years ago

My experiments with 2.24-r0 from https://github.com/stackfeed/alpine-pkg-glibc/releases/ revealed that ld-linux-x86-64.so.2 MUST be in /lib64. Debian (and Ubuntu), CentOS (and Fedora), Arch Linux, and, I believe, others have /lib64/ld-linux-x86-64.so.2 and don't have /lib/ld-linux-x86-64.so.2, so 2.24-r0 simply doesn't work unless I run mkdir /lib64 && ln -s ../lib/ld-linux-x86-64.so.2 /lib64/.
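In Dockerfile terms, the workaround looks roughly like this (a sketch; the relative link target assumes the loader was installed at /lib/ld-linux-x86-64.so.2, as in that 2.24-r0 build):

# Recreate the hard-coded glibc loader path expected by glibc-linked binaries
RUN mkdir -p /lib64 \
    && ln -s ../lib/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2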

dennybaa commented 7 years ago

@frol Hey there, actually I don't really know what MUST be there, but the glibc-2.23 package is definitely MISCONFIGURED. How did you check? I've just checked my specific case: I use Hadoop, which ships with native libraries.

This is Ubuntu 16.04 - OKAY:

➜  ~ ls -l hadoop-2.7.3/lib/native 
total 4380
-rw-r--r-- 1 denz denz 1123614 Aug 18 04:49 libhadoop.a
-rw-r--r-- 1 denz denz 1487220 Aug 18 04:49 libhadooppipes.a
lrwxrwxrwx 1 denz denz      18 Aug 18 04:52 libhadoop.so -> libhadoop.so.1.0.0
-rwxr-xr-x 1 denz denz  673852 Aug 18 04:49 libhadoop.so.1.0.0
-rw-r--r-- 1 denz denz  581952 Aug 18 04:49 libhadooputils.a
-rw-r--r-- 1 denz denz  364884 Aug 18 04:49 libhdfs.a
lrwxrwxrwx 1 denz denz      16 Aug 18 04:52 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x 1 denz denz  229145 Aug 18 04:49 libhdfs.so.0.0.0
➜  ~ ldd hadoop-2.7.3/lib/native/libhdfs.so
    linux-vdso.so.1 =>  (0x00007fffb7de6000)
    libjvm.so => not found
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f45f93b4000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f45f9196000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f45f8dcd000)
    /lib64/ld-linux-x86-64.so.2 (0x000055e88f180000)
➜  ~ 

Now frolvlad/alpine-glibc - NOT OKAY:

/ # apk info glibc
glibc-2.23-r3 description:
GNU C Library compatibility layer

glibc-2.23-r3 webpage:
https://github.com/sgerrand/alpine-pkg-glibc

glibc-2.23-r3 installed size:
4198400

/ # ls -l hadoop-2.7.3/lib/native/
total 4380
-rw-r--r--    1 root     root       1123614 Aug 18 01:49 libhadoop.a
lrwxrwxrwx    1 root     root            18 Feb  9 15:17 libhadoop.so -> libhadoop.so.1.0.0
-rwxr-xr-x    1 root     root        673852 Aug 18 01:49 libhadoop.so.1.0.0
-rw-r--r--    1 root     root       1487220 Aug 18 01:49 libhadooppipes.a
-rw-r--r--    1 root     root        581952 Aug 18 01:49 libhadooputils.a
-rw-r--r--    1 root     root        364884 Aug 18 01:49 libhdfs.a
lrwxrwxrwx    1 root     root            16 Feb  9 15:17 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x    1 root     root        229145 Aug 18 01:49 libhdfs.so.0.0.0
/ # ldd hadoop-2.7.3/lib/native/libhdfs.so
    ldd (0x5615b6606000)
Error loading shared library libjvm.so: No such file or directory (needed by hadoop-2.7.3/lib/native/libhdfs.so)
    libdl.so.2 => ldd (0x5615b6606000)
    libpthread.so.0 => ldd (0x5615b6606000)
    libc.so.6 => ldd (0x5615b6606000)
Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by hadoop-2.7.3/lib/native/libhdfs.so)
Error relocating hadoop-2.7.3/lib/native/libhdfs.so: JNI_GetCreatedJavaVMs: symbol not found
Error relocating hadoop-2.7.3/lib/native/libhdfs.so: JNI_CreateJavaVM: symbol not found
Error relocating hadoop-2.7.3/lib/native/libhdfs.so: __strtok_r: symbol not found
/ # 

Note the line: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by hadoop-2.7.3/lib/native/libhdfs.so)

And now stackfeed/alpine:glibc - OKAY:

➜  ~ docker run --rm -it stackfeed/alpine:glibc bash
bash-4.3# apk info glibc
glibc-2.24-r0 description:
GNU C Library compatibility layer

glibc-2.24-r0 webpage:
https://github.com/dennybaa/alpine-pkg-glibc

glibc-2.24-r0 installed size:
4190208

bash-4.3# ldd hadoop-2.7.3/lib/native/libhdfs.so
    ldd (0x55955727b000)
Error loading shared library libjvm.so: No such file or directory (needed by hadoop-2.7.3/lib/native/libhdfs.so)
    libdl.so.2 => ldd (0x55955727b000)
    libpthread.so.0 => ldd (0x55955727b000)
    libc.so.6 => ldd (0x55955727b000)
    ld-linux-x86-64.so.2 => /lib/ld-linux-x86-64.so.2 (0x7f9324668000)
Error relocating hadoop-2.7.3/lib/native/libhdfs.so: JNI_GetCreatedJavaVMs: symbol not found
Error relocating hadoop-2.7.3/lib/native/libhdfs.so: JNI_CreateJavaVM: symbol not found
Error relocating hadoop-2.7.3/lib/native/libhdfs.so: __strtok_r: symbol not found
bash-4.3# 

Note that Java is not installed in either case, so some libraries are expectedly missing.

@frol So what checks have you performed?

That's all fine, and I do understand that on Ubuntu etc. the loader lives at /lib64/ld-linux-x86-64.so.2. But from what I see on Alpine with glibc-compat, given the options with which it's built and configured, the loader SHOULD be placed at /lib/ld-linux-x86-64.so.2.

frol commented 7 years ago

@dennybaa You seem to be using the Alpine Linux ldd (/usr/bin/ldd), which doesn't work correctly for glibc binaries.

You should use LD_TRACE_LOADED_OBJECTS=1 to see what is really going on.
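glibc's own ldd is essentially a wrapper around that variable: it makes the dynamic loader print the resolved dependencies instead of running the program. A quick sketch (the /usr/glibc-compat path is where this package keeps its libraries, per the traces below; adjust if yours differs):

# Trace a glibc binary without musl's /usr/bin/ldd getting in the way
LD_TRACE_LOADED_OBJECTS=1 /path/to/glibc-binary
# or ask the glibc loader directly; --list also works for shared objects
/usr/glibc-compat/lib/ld-linux-x86-64.so.2 --list /path/to/glibc-binary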

Here is my working image (yes, it works; I have over 200k Docker pulls on the OracleJDK8 and Mono images):

$ docker run -it --rm -v /:/mnt frolvlad/alpine-glibc sh
/ # LD_TRACE_LOADED_OBJECTS=1 /mnt/bin/bash
        linux-vdso.so.1 (0x00007fffd7976000)
        libreadline.so.7 => not found
        libdl.so.2 => /usr/glibc-compat/lib/libdl.so.2 (0x00007f2303b59000)
        libc.so.6 => /usr/glibc-compat/lib/libc.so.6 (0x00007f23037bd000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2303d5d000)
/ # /mnt/bin/bash
/mnt/bin/bash: error while loading shared libraries: libreadline.so.7: cannot open shared object file: No such file or directory
/ # LD_LIBRARY_PATH=/mnt/lib /mnt/bin/bash
bash-4.4#

This illustrates that I can run bash from my glibc-based host system (Arch Linux) just fine.

Let's give your image a spin:

$ docker run -it --rm -v /:/mnt stackfeed/alpine:glibc sh
/ # LD_TRACE_LOADED_OBJECTS=1 /mnt/bin/bash
sh: /mnt/bin/bash: not found
/ # /mnt/bin/bash
sh: /mnt/bin/bash: not found
/ # LD_LIBRARY_PATH=/mnt/lib /mnt/bin/bash
sh: /mnt/bin/bash: not found

This simply didn't work at all.

How about mkdir /lib64 && ln -s ../lib/ld-linux-x86-64.so.2 /lib64/?

$ docker run -it --rm -v /:/mnt stackfeed/alpine:glibc sh
/ # mkdir /lib64 && ln -s ../lib/ld-linux-x86-64.so.2 /lib64/
/ # LD_TRACE_LOADED_OBJECTS=1 /mnt/bin/bash
        linux-vdso.so.1 (0x00007ffefa5c3000)
        libreadline.so.7 => not found
        libdl.so.2 => /usr/glibc-compat/lib/libdl.so.2 (0x00007fe0dbc29000)
        libc.so.6 => /usr/glibc-compat/lib/libc.so.6 (0x00007fe0db890000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe0dbe2d000)
/ # /mnt/bin/bash
/mnt/bin/bash: error while loading shared libraries: libreadline.so.7: cannot open shared object file: No such file or directory
/ # LD_LIBRARY_PATH=/mnt/lib /mnt/bin/bash
bash-4.4#

WIN!

dennybaa commented 7 years ago

@frol Right, I hadn't tried that. I'm not challenging you :) Still, there are issues with your container where libraries don't load correctly; I've shown that with Hadoop. Clearly, I only mitigated my specific issue with this change.

You're right, my fix is incomplete: clearly /lib64/ has to stay, and there should be a link in /lib as well. Are we on the same page here?

frol commented 7 years ago

@dennybaa I have seen lots of weird behavior with /usr/bin/ldd. Could you please post the output of LD_TRACE_LOADED_OBJECTS=1 hadoop-2.7.3/lib/native/libhdfs.so? I have a few glibc-based applications (including Hadoop!) running on the frolvlad/alpine-glibc image, and none of them has failed on me yet.

dennybaa commented 7 years ago

@frol Sure. The point is that the command fails in both containers:

LD_TRACE_LOADED_OBJECTS=1 hadoop-2.7.3/lib/native/libhdfs.so
Segmentation fault (core dumped)

But I assure you that your container doesn't fully work at the moment. The real-world scenario: we use Hadoop Spark, which uses the Snappy codec (the default). With your version, libsnappy.so... is unpacked from a jar and Spark actually fails to load it. My fix gives it a fresh breath and the codec starts to work (while breaking everything else :).

The same thing happens with HBase: with your container, if you read the logs, you'll see couldn't load hadoop native libraries...

At the moment I'm thinking that there should be a link to the loader in /lib too. This is a mitigation, since I can't find such links on Ubuntu, Debian, etc... So either we mitigate, or something should be tuned in glibc-compat...
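Concretely, something like this on top of the current package (a sketch; /usr/glibc-compat/lib is where the package keeps the real loader, judging by the traces above):

# Keep the standard x86-64 loader location...
mkdir -p /lib64
ln -sf /usr/glibc-compat/lib/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2
# ...and additionally expose it in /lib for libraries that reference it there
ln -sf /usr/glibc-compat/lib/ld-linux-x86-64.so.2 /lib/ld-linux-x86-64.so.2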

frol commented 7 years ago

@dennybaa I used to run Spark 1.6.1 (+ Hadoop 2.6), driving it from PySpark + Jupyter Notebook, on my frolvlad/alpine-oraclejdk8 image (which is based on frolvlad/alpine-glibc). The only thing I did was ln -s /usr/local/lib/hadoop-native/* /usr/lib. The logs you refer to might have helped, but currently I don't see any evidence that /lib/ld-linux-x86-64.so.2 is needed.

dennybaa commented 7 years ago

@frol Not quite: I don't actually mean the Hadoop native libs, though yes, that has to be done too ^^^. Thanks for the reminder.

Just now I tried a container with glibc v2.23 and without the link in /lib (the original sgerrand glibc package). I start a Spark master in one container and open spark-shell in another; this guarantees that things get serialized between "machines" (the master and the shell HAVE TO be started in separate containers). And as soon as Snappy gets loaded from a jar, which I was talking about before, the following happens:

Caused by: java.lang.IllegalArgumentException: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.2-d8bcaa90-cf09-48ee-b52e-8fe58d09c73b-libsnappyjava.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/snappy-1.1.2-d8bcaa90-cf09-48ee-b52e-8fe58d09c73b-libsnappyjava.so)
    at org.apache.spark.io.SnappyCompressionCodec$.liftedTree1$1(CompressionCodec.scala:171)
    at org.apache.spark.io.SnappyCompressionCodec$.org$apache$spark$io$SnappyCompressionCodec$$version$lzycompute(CompressionCodec.scala:168)
    at org.apache.spark.io.SnappyCompressionCodec$.org$apache$spark$io$SnappyCompressionCodec$$version(CompressionCodec.scala:168)
    at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152)
    ... 62 more

My point is that software, namely Spark, fails when trying to load some .so from a jar because something is lacking in the glibc-compat configuration... And I hope we can work it out, since the glibc package is unusable at least for Java Spark, and who knows what else :(

frol commented 7 years ago

@dennybaa Do you use the OpenJDK from the Alpine repos or a glibc-compiled OracleJDK? I suggest you give the frolvlad/alpine-oraclejdk8 image a try: if you use an OpenJDK compiled against musl libc, you may experience a libc collision.
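A quick way to check which libc a native library was built against is to look at its NEEDED entries (readelf ships in the binutils package on Alpine; the Hadoop path is just the example from above):

apk add binutils
readelf -d hadoop-2.7.3/lib/native/libhdfs.so | grep NEEDED
# glibc-built libraries list libc.so.6 (here also ld-linux-x86-64.so.2),
# while musl-built ones reference libc.musl-x86_64.so.1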

dennybaa commented 7 years ago

@frol I use openjdk:8-jre-alpine, so yes, OpenJDK compiled for Alpine. Your oraclejdk8 worked, yay!!! Heh, I see your glibc-compiled Oracle Java works! Thank you for digging through this.

Though I need to use OpenJDK for my project... So what's your conclusion: does that link still help apps running on Alpine OpenJDK, should it exist, and why?

frol commented 7 years ago

@dennybaa I am glad to hear that it worked! Congrats!

So what's your conclusion, that link still helps apps running on alpine openjdk, should it be and why?

If that works for some cases, I would just add another symlink in /lib. The thing I didn't quite get, however: did /lib/ld-linux-x86-64.so.2 work for the musl-libc-based OpenJDK, or did it just make it one step further into a broken state?

dennybaa commented 7 years ago

@frol Yeah, thanks! Yes, putting the link at /lib/ld-linux-x86-64.so.2 for the musl-based OpenJDK makes Spark work on Alpine with glibc-compat. Originally I threw away the /lib64 link without understanding or checking; it just fixed my Spark, and I forgot to care about the rest :)

I will post a PR then, adding just this one link.

dennybaa commented 7 years ago

I guess this should be closed in favor of #38, since this one breaks glibc.

sgerrand commented 7 years ago

Fixed in #38