mesos / kafka

Apache Kafka on Apache Mesos
Apache License 2.0

Allow scheduler to pass MESOS_NATIVE_LIBRARY to the agents running jobs #124

Open SEJeff opened 8 years ago

SEJeff commented 8 years ago

Before starting the scheduler, I'm exporting $MESOS_NATIVE_LIBRARY as the path to the mesos library on the framework machine and all mesos agent machines.

This is not honored: when the framework tries to start a broker, the executor errors out with:

I0921 13:36:35.207269 32407 logging.cpp:172] INFO level logging started!
I0921 13:36:35.208478 32407 fetcher.cpp:214] Fetching URI 'http://framework.serv.er:7100/jar/kafka-mesos-0.9.1.4.jar'
I0921 13:36:35.208498 32407 fetcher.cpp:125] Fetching URI 'http://framework.serv.er:7100/jar/kafka-mesos-0.9.1.4.jar' with os::net
I0921 13:36:35.208509 32407 fetcher.cpp:135] Downloading 'http://framework.serv.er:7100/jar/kafka-mesos-0.9.1.4.jar' to '/data/mesos/slave/slaves/20150819-163543-2433515527-5050-5382-S5/frameworks/20150908-152034-2450292743-5050-11638-0001/executors/broker-1-0dc40de4-5903-4772-84c3-0744b7f8f351/runs/17d2a042-f51a-400f-9d59-251f0cb86c6c/kafka-mesos-0.9.1.4.jar'
I0921 13:36:35.819934 32407 fetcher.cpp:214] Fetching URI 'http://framework.serv.er:7100/kafka/kafka-0.8.2.1.tgz'
I0921 13:36:35.819967 32407 fetcher.cpp:125] Fetching URI 'http://framework.serv.er:7100/kafka/kafka-0.8.2.1.tgz' with os::net
I0921 13:36:35.819990 32407 fetcher.cpp:135] Downloading 'http://framework.serv.er:7100/kafka/kafka-0.8.2.1.tgz' to '/data/mesos/slave/slaves/20150819-163543-2433515527-5050-5382-S5/frameworks/20150908-152034-2450292743-5050-11638-0001/executors/broker-1-0dc40de4-5903-4772-84c3-0744b7f8f351/runs/17d2a042-f51a-400f-9d59-251f0cb86c6c/kafka-0.8.2.1.tgz'
I0921 13:36:36.572461 32407 fetcher.cpp:78] Extracted resource '/data/mesos/slave/slaves/20150819-163543-2433515527-5050-5382-S5/frameworks/20150908-152034-2450292743-5050-11638-0001/executors/broker-1-0dc40de4-5903-4772-84c3-0744b7f8f351/runs/17d2a042-f51a-400f-9d59-251f0cb86c6c/kafka-0.8.2.1.tgz' into '/data/mesos/slave/slaves/20150819-163543-2433515527-5050-5382-S5/frameworks/20150908-152034-2450292743-5050-11638-0001/executors/broker-1-0dc40de4-5903-4772-84c3-0744b7f8f351/runs/17d2a042-f51a-400f-9d59-251f0cb86c6c'
Failed to load native Mesos library from /usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
Exception in thread "main" java.lang.UnsatisfiedLinkError: no mesos in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1865)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.apache.mesos.MesosNativeLibrary.load(MesosNativeLibrary.java:54)
    at org.apache.mesos.MesosNativeLibrary.load(MesosNativeLibrary.java:79)
    at org.apache.mesos.MesosExecutorDriver.<clinit>(MesosExecutorDriver.java:49)
    at ly.stealth.mesos.kafka.Executor$.main(Executor.scala:138)
    at ly.stealth.mesos.kafka.Executor.main(Executor.scala)

It is looking for libmesos.so under /usr/lib64, but the mesos package I'm using doesn't install libmesos.so, only libmesos.so.$VERSION, i.e. libmesos.so.22.

The framework should honor the MESOS_NATIVE_LIBRARY env variable if available. My workaround was to create /usr/lib64/libmesos.so as a symlink to /usr/lib64/libmesos.so.22, but honoring the variable seems like a reasonable thing to support.
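
For readers following along, the failure and the symlink workaround described above can be modelled in a few lines. This is a minimal sketch and not the Mesos source: it assumes the loader uses the path in MESOS_NATIVE_LIBRARY when that variable is set in the executor's environment, and otherwise searches each java.library.path directory for the unversioned name libmesos.so. The function name `resolve_native_library` and its arguments are hypothetical.

```python
import os
import tempfile


def resolve_native_library(env, library_path, name="mesos"):
    """Return the file a loader would pick, or None if the lookup fails.

    Assumed model: an explicit MESOS_NATIVE_LIBRARY wins; otherwise only
    the unversioned file name (lib<name>.so) is searched for.
    """
    explicit = env.get("MESOS_NATIVE_LIBRARY")
    if explicit:
        return explicit
    filename = "lib" + name + ".so"
    for directory in library_path.split(":"):
        candidate = os.path.join(directory, filename)
        if os.path.exists(candidate):
            return candidate
    return None


# Reproduce the situation from the report in a throwaway directory.
with tempfile.TemporaryDirectory() as lib_dir:
    versioned = os.path.join(lib_dir, "libmesos.so.22")
    open(versioned, "w").close()  # the package ships only the versioned file

    # No env variable and no unversioned name: the lookup fails.
    before = resolve_native_library({}, lib_dir)

    # With the variable visible to the process, the versioned file is used.
    with_env = resolve_native_library({"MESOS_NATIVE_LIBRARY": versioned}, lib_dir)

    # The symlink workaround: libmesos.so pointing at libmesos.so.22.
    os.symlink(versioned, os.path.join(lib_dir, "libmesos.so"))
    after = resolve_native_library({}, lib_dir)
    after_points_at_versioned = os.path.realpath(after) == os.path.realpath(versioned)
```

In this model the variable only helps if it is present in the environment of the process doing the lookup, which is the executor on the agent and not the scheduler.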

dmitrypekar commented 8 years ago

Hi @SEJeff, thank you for your feedback.

This is probably out of the scope of the Scheduler. Mesos should be correctly installed on all slaves. In the environments I've tried, there is a symlink like /usr/lib/libmesos.so -> libmesos-0.25.0.so, so creating a symlink seems to be a correct fix and not a workaround.

Regarding your solution: MESOS_NATIVE_LIBRARY should be defined on each slave that runs the executor. If it is defined only on the host running the Scheduler, it will have no effect. Propagating the value of MESOS_NATIVE_LIBRARY from the Scheduler to the Executors also seems to be the wrong approach, because different slaves could have the library in different locations (for instance, when running different mesos versions, archs, OSes etc).

Please comment on this if I understood your situation incorrectly or made a mistake. If there is no reply, I will close the ticket in 10 days.

SEJeff commented 8 years ago

@dmitrypekar I had it defined on each slave in a file under /etc/mesos. However, there is no way for me to have the tasks reference this, so the only option left was the symlink.

Perhaps this feature request could be changed to a way to specify env vars that each broker gets? Would that be reasonable?
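
One way to picture the per-broker env var idea is sketched below. It is only an illustration under stated assumptions, not an existing kafka-mesos option: the `KEY=VALUE,KEY2=VALUE2` syntax and the names `parse_env_option` and `executor_environment` are hypothetical. In a real framework the resulting variables would most likely be attached to the executor's command environment; here they are plain dictionaries.

```python
def parse_env_option(option):
    """Parse 'KEY=VALUE,KEY2=VALUE2' into a dict.

    Raises ValueError for entries without a key or an '=' sign.
    Values may contain '=' but not ',' in this simple syntax.
    """
    variables = {}
    for entry in (part.strip() for part in option.split(",")):
        if not entry:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = entry.partition("=")
        if not sep or not key:
            raise ValueError("expected KEY=VALUE, got %r" % entry)
        variables[key] = value
    return variables


def executor_environment(agent_env, broker_env):
    """Overlay per-broker variables on top of the agent's own environment."""
    merged = dict(agent_env)   # copy, so the agent's mapping is not mutated
    merged.update(broker_env)  # per-broker settings take precedence
    return merged


broker_env = parse_env_option("MESOS_NATIVE_LIBRARY=/usr/lib64/libmesos.so.22")
env = executor_environment({"PATH": "/usr/bin"}, broker_env)
```

Because the variables are set per broker, this would also sidestep the objection that different slaves may keep the library in different places: a broker pinned to a given slave can carry the path that is valid there.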