mesosphere / chronos-pkg

Ubuntu 12.04: running the chronos service or /usr/bin/chronos fails with "Failed to load native Mesos library" even though libmesos.so is present #13

Open goodwillcoding opened 8 years ago

goodwillcoding commented 8 years ago
/usr/bin/chronos 
+ cmd=(run_jar)
+ local cmd
+ [[ -s /etc/mesos/zk ]]
+ cmd+=(--zk_hosts "$(cut -d / -f 3 /etc/mesos/zk)" --master "$(cat /etc/mesos/zk)")
++ cut -d / -f 3 /etc/mesos/zk
++ cat /etc/mesos/zk
+ [[ -d /etc/chronos/conf ]]
+ read -u 9 -r -d '' path
++ cd /etc/chronos/conf
++ find . -type f -not -name '.*' -print0
+ local name=http_port
+ element_in --http_port
+ local e
+ return 1
+ case "$name" in
+ cmd+=("--$name" "$(< "$conf_dir/$name")")
+ read -u 9 -r -d '' path
+ logged chronos run_jar --zk_hosts localhost:2181 --master zk://localhost:2181/mesos --http_port 4400
+ local 'token=chronos[5563]'
+ shift
+ exec
+ exec
++ exec logger -p user.info -t 'chronos[5563]'
Aug  2 04:46:50 precise64 chronos[5563]: java -Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib -Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n -Xmx512m -cp /usr/bin/chronos org.apache.mesos.chronos.scheduler.Main --zk_hosts localhost:2181 --master zk://localhost:2181/mesos --http_port 4400
++ exec logger -p user.notice -t 'chronos[5563]'
Aug  2 04:46:50 precise64 chronos[5563]: + run_jar --zk_hosts localhost:2181 --master zk://localhost:2181/mesos --http_port 4400
Aug  2 04:46:50 precise64 chronos[5563]: + local 'log_format=%2$s %5$s%6$s%n'
Aug  2 04:46:50 precise64 chronos[5563]: ++ ulimit -n
Aug  2 04:46:50 precise64 chronos[5563]: + '[' 0 -eq 0 -a 1024 -lt 8192 ']'
Aug  2 04:46:50 precise64 chronos[5563]: + ulimit -n 8192
Aug  2 04:46:50 precise64 chronos[5563]: + export PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/vagrant_ruby/bin
Aug  2 04:46:50 precise64 chronos[5563]: + PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/vagrant_ruby/bin
Aug  2 04:46:50 precise64 chronos[5563]: + vm_opts=(-Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib -Djava.util.logging.SimpleFormatter.format="$log_format")
Aug  2 04:46:50 precise64 chronos[5563]: + local vm_opts
Aug  2 04:46:50 precise64 chronos[5563]: + for j_opt in '${JAVA_OPTS:-"-Xmx512m"}'
Aug  2 04:46:50 precise64 chronos[5563]: + vm_opts+=(${j_opt})
Aug  2 04:46:50 precise64 chronos[5563]: + exec java -Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib '-Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n' -Xmx512m -cp /usr/bin/chronos org.apache.mesos.chronos.scheduler.Main --zk_hosts localhost:2181 --master zk://localhost:2181/mesos --http_port 4400
Aug  2 04:46:50 precise64 chronos[5563]: [2015-08-02 04:46:50,924] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:26)
Aug  2 04:46:50 precise64 chronos[5563]: [2015-08-02 04:46:50,929] INFO Initializing chronos. (org.apache.mesos.chronos.scheduler.Main$:27)
Aug  2 04:46:50 precise64 chronos[5563]: [2015-08-02 04:46:50,935] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:28)
Aug  2 04:46:54 precise64 chronos[5563]: [2015-08-02 04:46:54,198] INFO Wiring up the application (org.apache.mesos.chronos.scheduler.config.MainModule:38)
Aug  2 04:46:54 precise64 chronos[5563]: Failed to load native Mesos library from /usr/local/lib:/usr/lib64:/usr/lib
Aug  2 04:46:54 precise64 chronos[5563]: [2015-08-02 04:46:54,760] INFO Shutting down services (org.apache.mesos.chronos.scheduler.Main$:42)
Aug  2 04:46:54 precise64 chronos[5563]: [2015-08-02 04:46:54,762] INFO Waiting for services to shut down (org.apache.mesos.chronos.scheduler.Main$:48)

libmesos.so is accounted for in the path /usr/lib

lrwxrwxrwx 1 root root 18 Jul 24 10:16 /usr/lib/libmesos.so -> libmesos-0.23.0.so

Tried setting MESOS_NATIVE_JAVA_LIBRARY to no avail.

Also tried running with java directly, same problem

java -Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib '-Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n' -Xmx512m -cp /usr/bin/chronos org.apache.mesos.chronos.scheduler.Main --zk_hosts localhost:2181 --master zk://localhost:2181/mesos --http_port 4400

The server is a fresh Ubuntu 12.04 amd64 box.

Correction, Java version:

java version "1.6.0_35"
OpenJDK Runtime Environment (IcedTea6 1.13.7) (6b35-1.13.7-1ubuntu0.12.04.2)

OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mo

goodwillcoding commented 8 years ago

I built a test class to load mesos

Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/lib/libmesos-0.23.0.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by /usr/lib/libmesos-0.23.0.so)

Which is strange since

ldd /usr/lib/libmesos-0.23.0.so | grep std
    libstdc++.so.6 => /usr/lib/mesos/libstdc++.so.6 (0x00007f25dcfb6000
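The test class itself is not shown in this thread. A minimal sketch of what such a class might look like, assuming it simply calls System.load on the absolute path of the library (the class name LoadMesosCheck and the default path are hypothetical, not taken from the original test):

```java
// Hypothetical reproduction class; the name and the default path are assumptions.
public class LoadMesosCheck {

    // Tries to load a native library by absolute path and reports the outcome.
    public static String tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return "ok";
        } catch (UnsatisfiedLinkError e) {
            // The message carries the dynamic linker's own error text.
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "/usr/lib/libmesos.so";
        System.out.println("java.library.path = "
                + System.getProperty("java.library.path"));
        System.out.println(path + " -> " + tryLoad(path));
    }
}
```

Run against /usr/lib/libmesos-0.23.0.so, a class like this should surface the underlying UnsatisfiedLinkError text instead of the generic "Failed to load native Mesos library" line.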

goodwillcoding commented 8 years ago

Putting LD_PRELOAD=/usr/lib/mesos/libstdc++.so.6 in front of the command makes it all start up. I'm still not sure why mesos is looking at the system libstdc++.
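If the same workaround has to be applied from a JVM-based launcher rather than from the shell, it could be sketched as below. The PreloadLauncher class and its withPreload method are hypothetical names used only for illustration, and inheritIO requires Java 7 or later.

```java
import java.util.Arrays;

// Hypothetical launcher; class and method names are illustrative only.
public class PreloadLauncher {

    // Builds a child process whose environment has LD_PRELOAD set.
    public static ProcessBuilder withPreload(String preloadLib, String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("LD_PRELOAD", preloadLib);
        return pb;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: PreloadLauncher <preload-lib> <command> [args...]");
            return;
        }
        String[] command = Arrays.copyOfRange(args, 1, args.length);
        ProcessBuilder pb = withPreload(args[0], command);
        pb.inheritIO(); // Java 7+: share stdin/stdout/stderr with the child
        System.exit(pb.start().waitFor());
    }
}
```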

goodwillcoding commented 8 years ago

Also this seems to be relevant: https://stackoverflow.com/questions/13952189/how-to-get-rid-of-ld-preload-when-using-jni

If a statically linked library is used, it sounds like it should be preloaded

goodwillcoding commented 8 years ago

I am still not sure if this is a mesos problem or a chronos packaging issue

goodwillcoding commented 8 years ago

bump! anyone?

stevenschlansker commented 8 years ago

It looks like your /usr/lib/mesos/libstdc++.so.6 is not in the library path searched by Java (which claims to be /usr/local/lib:/usr/lib64:/usr/lib). Maybe that is the problem?

goodwillcoding commented 8 years ago

bump ... anyone?

goodwillcoding commented 8 years ago

@stevenschlansker ... yeah I tried adding that path to -D and it makes no difference

goodwillcoding commented 8 years ago

-Djava.library.path, that is
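For context on why this made no difference: java.library.path only tells System.loadLibrary where to look for the library it was asked for by name. The dependencies of that library, such as libstdc++.so.6, are resolved by the operating system's dynamic linker, which does not consult java.library.path. The sketch below lists which files the JVM could pick up for a given library name. The LibraryPathCheck class is a hypothetical helper, not part of Chronos or Mesos.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
public class LibraryPathCheck {

    // Returns the files that System.loadLibrary(libName) could find in searchPath.
    public static List<File> candidates(String libName, String searchPath) {
        // Platform-specific file name, e.g. "libmesos.so" on Linux.
        String fileName = System.mapLibraryName(libName);
        List<File> found = new ArrayList<File>();
        for (String dir : searchPath.split(File.pathSeparator)) {
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String searchPath = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + searchPath);
        for (File f : candidates("mesos", searchPath)) {
            System.out.println("candidate: " + f.getAbsolutePath());
        }
    }
}
```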

grampelberg commented 8 years ago

@goodwillcoding it looks like a chronos packaging issue to me. Ubuntu 12.04 is pretty old; have you tried on 14.04?

lloesche commented 8 years ago

To give some background, Mesos 0.23+ only compiles with GCC 4.8+. To still support Ubuntu 12.04 LTS in Mesos 0.23 we compile it with a recent version of gcc, set LD_RUN_PATH=/usr/lib/mesos during linking and ship the libstdc++ compiler runtime library with the Ubuntu 12.04 mesos deb package.

The commit that is responsible for that in mesos-deb-packaging is: https://github.com/mesosphere/mesos-deb-packaging/commit/1185171304b2555270a25f1a682abade83b195b9

And in our build job you'll find:

cp -f /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so /usr/lib/mesos/libstdc++.so.6
./build_mesos --repo {{repo}} --nominal-version {{version}} --extra-libs /usr/lib/mesos/libstdc++.so.6 --build-version {{build_version}}.{{distro_tag}}

Now the bad news is that I'm not quite sure yet why the library isn't resolved correctly when libmesos is loaded via JNI. As you already found out yourself, ldd correctly prints the dependency. I will have to ask our Scala/Java devs to debug this issue further.

@pyronicide on Ubuntu 14.04 the problem doesn't exist because there we have native GCC 4.8+ support. It's 12.04 specific and I would expect to also see it in Marathon. I'll update here once I find out more.
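One possible explanation for the open JNI question, which nobody in this thread confirms: the dynamic linker normally does not search again for a library whose soname is already loaded in the process. If the JVM already has the system libstdc++.so.6 mapped before libmesos is loaded, the copy in /usr/lib/mesos would never be consulted, whereas ldd inspects libmesos in isolation. The sketch below is a Linux-only diagnostic that lists which libstdc++ files the running JVM has mapped. The LoadedLibs class is hypothetical, and whether this hypothesis applies here is an assumption.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical diagnostic; class and method names are illustrative only.
public class LoadedLibs {

    // Extracts the distinct file paths containing 'needle' from lines in
    // /proc/<pid>/maps format: address perms offset dev inode pathname
    public static Set<String> pathsContaining(List<String> mapsLines, String needle) {
        Set<String> paths = new LinkedHashSet<String>();
        for (String line : mapsLines) {
            String[] parts = line.trim().split("\\s+", 6);
            if (parts.length == 6 && parts[5].contains(needle)) {
                paths.add(parts[5]);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<String>();
        try {
            BufferedReader in = new BufferedReader(new FileReader("/proc/self/maps"));
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    lines.add(line);
                }
            } finally {
                in.close();
            }
        } catch (IOException e) {
            System.err.println("cannot read /proc/self/maps: " + e.getMessage());
            return;
        }
        // Lists what is mapped before any attempt to load libmesos.
        for (String path : pathsContaining(lines, "libstdc++")) {
            System.out.println(path);
        }
    }
}
```

If this prints /usr/lib/x86_64-linux-gnu/libstdc++.so.6 before libmesos is loaded, that would be consistent with the hypothesis and with LD_PRELOAD helping, though it would not prove it.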

goodwillcoding commented 8 years ago

@lloesche ... that was my conclusion as well ... so glad I was not way off

goodwillcoding commented 8 years ago

@lloesche thank you for looking at this