sequenceiq / hadoop-docker

Hadoop docker image
https://registry.hub.docker.com/u/sequenceiq/hadoop-docker/
Apache License 2.0

[WARN] Unable to load native-hadoop library for your platform #47

Open kasured opened 8 years ago

kasured commented 8 years ago

Even though the Dockerfile contains a "fix" for the native library warning, I still see the warning when running the image out of the box:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [2d6335dea9d7]

Host: 3.10.0-229.20.1.el7.x86_64, CentOS Linux release 7.2.1511 (Core)

bash-4.1# ldd /usr/local/hadoop/lib/native/libhadoop.so.1.0.0
/usr/local/hadoop/lib/native/libhadoop.so.1.0.0: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/local/hadoop/lib/native/libhadoop.so.1.0.0)
        linux-vdso.so.1 => (0x00007ffd058ab000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fdbce779000)
        libjvm.so => not found
        libc.so.6 => /lib64/libc.so.6 (0x00007fdbce3e4000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fdbceba3000)

bash-4.1# file /usr/local/hadoop/lib/native/libhadoop.so.1.0.0
/usr/local/hadoop/lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
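To read the glibc requirement out of output like the above without scanning it by eye, something along these lines may help. It is only a sketch: the function name max_glibc_required is made up for illustration, and it assumes GNU grep, sed and sort (for sort -V) are available in the container.

```shell
#!/usr/bin/env bash
# Print the highest GLIBC_x.y symbol version mentioned on stdin.
# The input can be ldd output or an objdump -T symbol listing.
# Nothing is printed when no GLIBC_ tag is present.
max_glibc_required() {
    grep -o 'GLIBC_[0-9][0-9.]*' \
        | sed -e 's/^GLIBC_//' -e 's/\.$//' \
        | sort -u -V \
        | tail -n 1
}

# Demo on the ldd error line quoted in this issue; prints 2.14
max_glibc_required <<'EOF'
/lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/local/hadoop/lib/native/libhadoop.so.1.0.0)
EOF
```

On a real container, and assuming binutils is installed, the same function could be fed with objdump -T /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 to list every versioned symbol the library needs, not only the first one ldd complains about.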

Hokan-Ashir commented 7 years ago

I have to add that this becomes an error if you try to use an I/O compression library such as Snappy or zlib. As the error says, libhadoop wasn't built with support for them. Moreover, libhadoop requires glibc 2.14+, as kasured's ldd output shows, and this Docker image uses CentOS 6.5, where you CAN'T upgrade glibc beyond 2.12.
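A quick way to check whether a given base image has a new enough glibc is to compare version strings component by component. Comparing them as decimal numbers goes wrong here, because 2.4 is older than 2.12, which is older than 2.14. The sketch below assumes GNU sort (for -V) and a glibc-based system where getconf GNU_LIBC_VERSION works; the function names are made up for illustration, and the required version is the one from the ldd error earlier in this thread.

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Print the installed glibc version, for example 2.12.
# getconf prints a line such as "glibc 2.12"; on non-glibc systems this prints nothing.
installed_glibc() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

required="2.14"   # from the GLIBC_2.14 ldd error in this issue
if version_ge "$(installed_glibc)" "$required"; then
    echo "system glibc is new enough for libhadoop"
else
    echo "system glibc is older than $required (or could not be detected)"
fi
```

Run inside the container, this shows up front whether libhadoop.so has any chance of loading on that base image.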

There are several possible solutions:

  1. Build glibc 2.14 separately and change the JAVA_LIBRARIES variable, then restart the Hadoop cluster (for example via stop-all.sh/start-all.sh, even though these scripts are deprecated) so that Hadoop picks up the new glibc. Alternatively, use LD_PRELOAD to achieve the same thing. Unfortunately, I tried to compile it following this tutorial (http://www.imperx.com/wp-content/uploads/Member/Cameras/Bobcat_Gen2/GEV%20Linux/Workaround_to_install_GLIBC_2.14_to_CentOS_6.7.pdf) and had no success.
  2. Build Hadoop itself from source, downloading the sources from the Apache site. I haven't tried this, but you may succeed using these tutorials: https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/NativeLibraries.html, http://www.linuxsecrets.com/blog/46performance-tips/2015/05/18/1485-compiling-apache-hadoop-64bit-version-with-compression-support-for-linux. This guide WON'T work: http://www.ericlin.me/enabling-snappy%EF%BC%8Dcompression-in-hadoop-2-4-under-centos-6-3
  3. Build your own Docker image with "FROM centos:centos7" to get glibc 2.14+. However, this route may give you A LOT of pain: you have to deal with supervisor (https://docs.docker.com/engine/admin/using_supervisord/). Unless you install it, you'll face the "Failed to get D-Bus connection" error (see https://github.com/docker/docker/issues/7459), because you can't run daemon processes such as sshd in a Docker container or during image creation. Note that the current Dockerfile in this repo has commented-out lines showing how to install supervisor. They contain a minor error, though. Replace
     RUN curl https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -o - | python
     with
     RUN curl https://bitbucket.org/pypa/setuptools/downloads/ez_setup.py -o - | python
     and replace
     ADD supervisord.conf /etc/supervisord.conf
     with
     RUN echo_supervisord_conf > /etc/supervisord.conf

Despite all this, I had no success building a stable, working Docker container. I had to comment out these lines:

RUN service sshd start && $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh && $HADOOP_PREFIX/sbin/start-dfs.sh && $HADOOP_PREFIX/bin/hdfs dfs -mkdir -p /user/root
RUN service sshd start && $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh && $HADOOP_PREFIX/sbin/start-dfs.sh && $HADOOP_PREFIX/bin/hdfs dfs -put $HADOOP_PREFIX/etc/hadoop/ input

and my container can't bring up DataNodes or NameNodes. BUT once you start a container from an image prepared this way, you can run yum -y install snappy snappy-devel and get Snappy support.
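To confirm that the yum install actually put the Snappy shared library where the dynamic linker can find it, the ldconfig -p listing can be searched. This is a minimal sketch: has_shared_lib and sample_listing are made-up names, and the sample listing is hypothetical data so that the snippet runs anywhere.

```shell
#!/usr/bin/env bash
# Succeed when the ldconfig -p listing on stdin contains lib<name>.so.
# $1 is the library base name, for example "snappy" or "z".
has_shared_lib() {
    grep -q "^[[:space:]]*lib$1\.so"
}

# Hypothetical sample in the format printed by ldconfig -p.
sample_listing() {
    printf '\tlibz.so.1 (libc6,x86-64) => /lib64/libz.so.1\n'
    printf '\tlibsnappy.so.1 (libc6,x86-64) => /usr/lib64/libsnappy.so.1\n'
}

for name in snappy z lzo2; do
    if sample_listing | has_shared_lib "$name"; then
        echo "$name: found"
    else
        echo "$name: missing"
    fi
done
```

In the container, the real check would be ldconfig -p | has_shared_lib snappy. Note that this only shows the OS library is installed. Whether Hadoop itself can load it is a separate question, which hadoop checknative -a (described in the Native Libraries guide linked above) reports.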

// Take a look at the HDP versions: they are heavy and huge, but they don't have small problems like this one.
// I have to admit that this Docker image suited me well until I ran into the native library support issue.

Hokan-Ashir commented 7 years ago

By the way, you can use a fork of this repo that supports all the native libraries (https://github.com/sfedyakov/hadoop-271-cluster). It also has no problems building the Docker image with supervisor, because it simply doesn't use it.