mesos / kafka

Apache Kafka on Apache Mesos
Apache License 2.0

Brokers refuse to launch on freshly built cluster #199

Open justizin opened 8 years ago

justizin commented 8 years ago

I have a freshly built Mesos cluster on CentOS 6.7, using the latest Mesos, installed and configured via the community Chef cookbook.

As I have done in several other environments (though probably all Ubuntu Trusty/14.04), I fired up the kafka-mesos scheduler, ran create for multiple brokers, and ran the start command for all of these brokers. It's not clear whether CentOS is a factor or simply an incidental variable, but it would feel remiss not to mention it.

On the start command, kafka-mesos.sh hangs, eventually complaining of a timeout.

One thing I noticed in the output in my newer setup is that it fails to ever create a framework, and doesn't seem to be talking to the Mesos API at all. I know this Mesos environment is functional at some level, as I have no problem launching tasks with Marathon.

Output from ./kafka-mesos.sh scheduler:

Loading config defaults from kafka-mesos.properties
2016-04-05 13:54:25,088 [main] INFO ly.stealth.mesos.kafka.Scheduler$ - Starting Scheduler$:
debug: true, storage: zk:/mesos-kafka
mesos: master=zk01:5050,zk02:5050,zk03:5050, user=marathon, principal=<none>, secret=<none>
framework: name=kafka, role=*, timeout=30d
api: http://zk01:7000, bind-address: <all>, zk: zk01:2181,zk02:2181,zk03:2181/kafka, jre: <none>
2016-04-05 13:54:25,226 [main] INFO org.eclipse.jetty.server.Server - jetty-9.0.z-SNAPSHOTWrappedArray()
2016-04-05 13:54:25,260 [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Started WrappedArray(o.e.j.s.ServletContextHandler@3aefe5e5{/,null,AVAILABLE})
2016-04-05 13:54:25,275 [main] INFO org.eclipse.jetty.server.ServerConnector - Started WrappedArray(ServerConnector@71b1176b{HTTP/1.1}{0.0.0.0:7000})
2016-04-05 13:54:25,287 [main] INFO ly.stealth.mesos.kafka.HttpServer$ - started on port 7000
I0405 13:54:25.347257 9054 sched.cpp:222] Version: 0.27.1
I0405 13:54:25.350013 9099 sched.cpp:326] New master detected at master@10.100.1.158:5050
I0405 13:54:25.350318 9099 sched.cpp:336] No credentials provided. Attempting to register without authentication

I'm working to familiarize myself with this code and learn to better debug this situation, but at the raw level I'm not seeing anything obvious via tcpdump or other analysis.

FWIW, the output in my other environment looks pretty much just like this; it just doesn't stop at "No credentials provided. Attempting to register without authentication".

makayel commented 7 years ago

New master detected at master@10.100.1.158:5050

Is your elected master actually this one? I had the same issue with CentOS and Docker.

makayel commented 7 years ago

Try changing the scheduler's master option:

From master=zk01:5050,zk02:5050,zk03:5050

To master=zk://zk01:2181,zk02:2181,zk03:2181/mesos

Worked for me.
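
For anyone wanting to sanity-check their own setting before starting the scheduler, here is a minimal Python sketch of the change above. It is not part of kafka-mesos or Mesos; `classify_master` and `to_zk_url` are hypothetical helper names. It assumes the master address forms I know Mesos to document are a single `host:port`, a `zk://` URL, and a `file://` path, so a comma-separated list of `host:port` entries is flagged. The `/mesos` znode path is also an assumption; use whatever path your masters were started with.

```python
import re


def classify_master(value: str) -> str:
    """Roughly classify a scheduler 'master' setting.

    Hypothetical helper: returns 'zk', 'file', 'single', or 'unsupported'.
    'unsupported' only means the value is not one of the forms this
    sketch assumes Mesos accepts.
    """
    if value.startswith("zk://"):
        return "zk"
    if value.startswith("file://"):
        return "file"
    # Exactly one host:port (or ip:port), with no commas or slashes.
    if re.fullmatch(r"[^,/\s]+:\d+", value):
        return "single"
    return "unsupported"


def to_zk_url(zk_hosts, path="/mesos"):
    """Build a zk:// master URL from ZooKeeper host:port entries.

    The default path '/mesos' is an assumption taken from the
    suggestion in this thread.
    """
    return "zk://" + ",".join(zk_hosts) + path


if __name__ == "__main__":
    # The value from the original report is flagged.
    print(classify_master("zk01:5050,zk02:5050,zk03:5050"))
    # The suggested replacement, pointing at ZooKeeper (port 2181).
    print(to_zk_url(["zk01:2181", "zk02:2181", "zk03:2181"]))
```

With a `zk://` URL the scheduler looks up the currently elected master through ZooKeeper rather than being pointed at fixed master addresses, which fits the question above about whether the detected master is really the leader.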