mesos / storm

Storm on Mesos!
Apache License 2.0

Invalid host resolution (Nimbus, Docker, Marathon) #174

Open ilya-bystrov opened 7 years ago

ilya-bystrov commented 7 years ago

I run Storm on Mesos via Docker.

The Supervisor cannot be started because the Nimbus service resolves its hostname as localhost.

nimbus: [INFO] Started HTTP server from which config for the MesosSupervisor's may be fetched. URL: http://localhost:45762/generated-conf

After that, the fetcher failed.

mesos-slave: I1020 13:38:45.664760 32717 fetcher.cpp:134] Downloading resource from 'http://localhost:45762/generated-conf/storm.yaml' to '/var/lib/mesos/slaves/27978124-3a65-46d5-80c0-cc83871de3f2-S0/frameworks/27978124-3a65-46d5-80c0-cc83871de3f2-0002/executors/inout-1-1476891187/runs/52d08df3-3152-4c6b-b680-4c0d80584c5d/storm.yaml'
Failed to fetch 'http://localhost:45762/generated-conf/storm.yaml': Error downloading resource: Couldn't connect to server
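The fetch presumably fails because the fetcher runs on the Mesos agent, where localhost points at the agent itself rather than at the Nimbus container. A minimal sketch of a check for that situation follows; the `is_loopback_url` helper is hypothetical and not part of this project, and it only recognizes `localhost` and `127.x.x.x` (IPv6 loopback is not handled).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the URL's host part is a loopback name,
# i.e. an address that other machines cannot use to reach this service.
is_loopback_url() {
  hostport=${1#*://}        # strip the scheme, if any
  hostport=${hostport%%/*}  # strip the path
  host=${hostport%%:*}      # strip the port
  case "$host" in
    localhost|127.*) return 0 ;;
    *) return 1 ;;
  esac
}

# The URL from the Nimbus log above.
if is_loopback_url 'http://localhost:45762/generated-conf/storm.yaml'; then
  echo "advertised config URL is only reachable from the Nimbus host itself"
fi
```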

I also found the code responsible for host resolution: storm.mesos.util.MesosCommon#getNimbusHost

Is there a solution, or did I miss some configuration detail?

erikdw commented 7 years ago

@ilya-bystrov : try setting nimbus.host in storm.yaml to your Nimbus's IP. I improved that behavior in PR #167.
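As a rough sketch of that suggestion, the helper below writes a `nimbus.host` entry into a storm.yaml file. The `set_nimbus_host` function, the `conf/storm.yaml` path, and the `192.0.2.10` address are all placeholders chosen for illustration, not something this project ships.

```shell
#!/bin/sh
# Hypothetical helper: set nimbus.host in a storm.yaml file.
# Any existing top-level nimbus.host line is dropped and a new one appended.
set_nimbus_host() {
  conf_file=$1
  host=$2
  tmp_file="${conf_file}.tmp"
  [ -f "$conf_file" ] || : > "$conf_file"   # start from an empty file if missing
  # grep -v exits non-zero when nothing is left to print; that is fine here.
  grep -v '^nimbus\.host:' "$conf_file" > "$tmp_file" || true
  printf 'nimbus.host: "%s"\n' "$host" >> "$tmp_file"
  mv "$tmp_file" "$conf_file"
}

# Example (placeholder path and address):
#   set_nimbus_host conf/storm.yaml 192.0.2.10
```

The address has to be one that the Mesos agents can reach, since they fetch the generated config from it.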

rwd5213 commented 7 years ago

@erikdw I am having the same error even when I specify the Nimbus host both in the conf file and as an argument to the storm command. This is the command I am running:

./bin/storm jar -c nimbus.host=10.0.130.54 -c nimbus.thrift.port=12078 examples/storm-starter/storm-starter-topologies-0.10.1.jar storm.starter.WordCountTopology word-count

erikdw commented 7 years ago

@rwd5213 : that cmd you pasted isn't starting the nimbus, it's just submitting a topology jar to the Nimbus to be run in the cluster. You need to specify the nimbus.host in the configuration of the MesosNimbus when you start that (however you do, it's very specific to your own environment).

rwd5213 commented 7 years ago

@erikdw Sorry, I forgot to mention that I am running it in Marathon using the run-marathon-script. Doesn't that start Nimbus? If not, how would I specify the configuration and start the MesosNimbus?

erikdw commented 7 years ago

@rwd5213 : Ah. I have never used Marathon, as mentioned in this issue about the Marathon instructions needing some polish. But the Marathon instructions in this project talk about how to set some options for the nimbus:

It is also possible to add command line parameters to both the ui and nimbus through STORM_UI_OPTS and STORM_NIMBUS_OPTS respectively:

STORM_NIMBUS_OPTS="-c storm.local.dir=/my/mounted/volume -c topology.mesos.worker.cpu=1.5"

You should try setting the nimbus.host via those instructions.

erikdw commented 7 years ago

Oh... even though I've never used Marathon I have an inkling of its purpose and how it works -- I think it would put the Nimbus onto some arbitrary host in the cluster. So you wouldn't know a priori the IP/hostname to put into the nimbus.host config option. Hence I'm not sure how you'd get the Nimbus to advertise its actual IP out to the Mesos Executors in such a situation when running within Docker. Welcome any contributions to fix that issue...

rwd5213 commented 7 years ago

@erikdw I got it to work by setting the STORM_NIMBUS_OPTS env variable in the JSON for the Marathon application. However, you have to guess the IP of the machine that Marathon will put the container on. Restricting it to slave_public nodes helps narrow the guess. I also assume that once the application is running you could exec into the container and set the environment variable there, but I haven't tried that yet. I am not sure how to get it to advertise its actual IP either.

ilya-bystrov commented 7 years ago

Here is some additional info: storm-mesos-0.1.7-storm0.10.1-mesos0.28.2

@erikdw : I'm using Docker. I added "STORM_NIMBUS_OPTS=-c nimbus.host=some_node" to the environment in the storm-nimbus Marathon configuration, and it works fine when the nimbus service is deployed on some_node. But this doesn't solve the problem in general, because I then have to take care of the nimbus host assignment myself.
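For illustration, the setting described above might look like the following fragment of a Marathon app definition, printed here from a small shell function. The `marathon_env_fragment` helper is hypothetical. The `constraints` entry is an additional assumption, intended to pin the task to the same node so that the configured host matches where Nimbus actually runs; check it against the Marathon documentation for your version before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the parts of a Marathon app definition that
# (a) pass nimbus.host to Nimbus via STORM_NIMBUS_OPTS and
# (b) pin the task to that same node (assumed hostname CLUSTER constraint).
marathon_env_fragment() {
  nimbus_host=$1
  cat <<EOF
"env": {
  "STORM_NIMBUS_OPTS": "-c nimbus.host=${nimbus_host}"
},
"constraints": [["hostname", "CLUSTER", "${nimbus_host}"]]
EOF
}

# "some_node" is the placeholder host name used in the comment above.
marathon_env_fragment some_node
```

Pinning removes the guesswork but also gives up Marathon's ability to reschedule Nimbus onto another node, so it is a workaround rather than a general fix.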

I also tried passing a MESOS_NIMBUS_HOST environment variable in the Marathon configuration, but that doesn't work.

ilya-bystrov commented 7 years ago

@erikdw : About host assignment.

Actually, I found out that Marathon passes the hostname to the docker process as an environment variable:

docker -H unix:///var/run/docker.sock run ... -e HOST=actual_node ... -c ./bin/run-with-marathon.sh

So I think it would be enough to add the following line to run-with-marathon.sh:

MESOS_NIMBUS_HOST="$HOST"

But for now it seems storm-mesos-0.1.7 doesn't respect the MESOS_NIMBUS_HOST environment variable. I suppose there is a nimbus.host=localhost somewhere in the default configuration, and therefore nimbus.host can't be defined via MESOS_NIMBUS_HOST.
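Since STORM_NIMBUS_OPTS is reported to work in this thread while MESOS_NIMBUS_HOST is not, one possible variant of the idea is to feed `$HOST` into STORM_NIMBUS_OPTS instead. This is only a sketch of what could be added to a wrapper script, not the project's actual run-with-marathon.sh; the `nimbus_opts_for_host` function is hypothetical, and it assumes Marathon sets `HOST` as seen in the docker command line above.

```shell
#!/bin/sh
# Hypothetical helper: add "-c nimbus.host=<host>" to existing Nimbus options.
# An explicit nimbus.host already present in the options is left alone,
# and an empty host leaves the options unchanged.
nimbus_opts_for_host() {
  existing_opts=$1
  host=$2
  if [ -z "$host" ]; then
    printf '%s\n' "$existing_opts"
    return 0
  fi
  case "$existing_opts" in
    *nimbus.host=*) printf '%s\n' "$existing_opts" ;;
    "")             printf '%s\n' "-c nimbus.host=$host" ;;
    *)              printf '%s\n' "$existing_opts -c nimbus.host=$host" ;;
  esac
}

# Assumption: HOST is exported by Marathon into the container environment.
STORM_NIMBUS_OPTS=$(nimbus_opts_for_host "${STORM_NIMBUS_OPTS:-}" "${HOST:-}")
export STORM_NIMBUS_OPTS
```

Whether the value of `HOST` is resolvable from the other Mesos agents depends on the cluster's DNS setup, so this still needs to be verified per environment.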

DarinJ commented 7 years ago

When I've done this in the past, I used Mesos-DNS and set the nimbus hostname to nimbus.marathon.mesos (the task ID of nimbus was nimbus).
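A small sketch of that approach, assuming Mesos-DNS is running with its default `mesos` domain and that the name resolves from both the Nimbus container and the agents. The `mesos_dns_name` helper is hypothetical; passing the result through STORM_NIMBUS_OPTS is borrowed from the earlier comments in this thread rather than from the setup described here.

```shell
#!/bin/sh
# Hypothetical helper: build a Mesos-DNS style name of the form
# <task>.<framework>.<domain>, defaulting to the marathon framework
# and the mesos domain.
mesos_dns_name() {
  task=$1
  framework=${2:-marathon}
  domain=${3:-mesos}
  printf '%s.%s.%s\n' "$task" "$framework" "$domain"
}

# With a Marathon app whose task ID is "nimbus":
NIMBUS_DNS_NAME=$(mesos_dns_name nimbus)
STORM_NIMBUS_OPTS="-c nimbus.host=${NIMBUS_DNS_NAME}"
export STORM_NIMBUS_OPTS
echo "$STORM_NIMBUS_OPTS"
```

Because the name follows the task rather than a fixed machine, this avoids having to know in advance which node Marathon picks.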



erikdw commented 7 years ago

@ilya-bystrov : we (my team at my company) have added a ticket into our next sprint to try to get marathon working with this project, since we have never used it before. Hopefully that will allow us to answer some of these Qs.