cloudera / clusterdock

Apache License 2.0

Exception('Failed to start cluster.') #8

Closed kevintrannz closed 8 years ago

kevintrannz commented 8 years ago

Hi, we have the same issue and are wondering whether anyone is looking into it. Please review the comments at https://hub.docker.com/r/cloudera/clusterdock

clusterdock_run ./bin/start_cluster cdh
INFO:clusterdock.cluster:Successfully started node-2.cluster (IP address: 192.168.123.5).
INFO:clusterdock.cluster:Successfully started node-1.cluster (IP address: 192.168.123.4).
INFO:clusterdock.cluster:Started cluster in 6.69 seconds.
INFO:clusterdock.topologies.cdh.actions:Changing server_host to node-1.cluster in /etc/cloudera-scm-agent/config.ini...
INFO:clusterdock.topologies.cdh.actions:Restarting CM agents...
cloudera-scm-agent is already stopped
Starting cloudera-scm-agent: [ OK ]
Stopping cloudera-scm-agent: [ OK ]
Starting cloudera-scm-agent: [ OK ]
INFO:clusterdock.topologies.cdh.actions:Waiting for Cloudera Manager server to come online...
INFO:clusterdock.topologies.cdh.actions:Detected Cloudera Manager server after 73.14 seconds.
INFO:clusterdock.topologies.cdh.actions:CM server is now accessible at http://moby:32769
INFO:clusterdock.topologies.cdh.cm:Detected CM API v13.
INFO:clusterdock.topologies.cdh.cm_utils:Updating database configurations...
INFO:clusterdock.topologies.cdh.cm:Updating NameNode references in Hive metastore...
WARNING:clusterdock.topologies.cdh.cm:Failed to update NameNode references in Hive metastore (command returned : 'HiveUpdateLocationServiceCommand' (id: 233; active: False; success: False)).
INFO:clusterdock.topologies.cdh.actions:Deploying client configuration...
INFO:clusterdock.topologies.cdh.actions:Starting cluster...

Traceback (most recent call last):
  File "./bin/start_cluster", line 70, in <module>
    main()
  File "./bin/start_cluster", line 63, in main
    actions.start(args)
  File "/root/clusterdock/clusterdock/topologies/cdh/actions.py", line 146, in start
    raise Exception('Failed to start cluster.')

Thanks, Kevin.

dimaspivak commented 8 years ago

Hi Kevin,

Can you give me some more details about the machine on which you're trying to run clusterdock? CPUs and amount of available RAM would be useful to know.

Thanks, Dima
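
For anyone gathering the details requested above on a Linux host, here is a minimal sketch that reports the CPU count along with total memory, available memory, and swap. It assumes the kernel's /proc/meminfo is present; the helper names parse_meminfo and describe_resources are illustrative and not part of clusterdock.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def describe_resources(meminfo_text, cpu_count):
    """Return a one-line summary of CPUs, memory, and swap (hypothetical helper)."""
    info = parse_meminfo(meminfo_text)

    def gib(key):
        # /proc/meminfo values are in kB; convert to GiB.
        return info.get(key, 0) / (1024 * 1024)

    return (
        "CPUs: {}, MemTotal: {:.1f} GiB, MemAvailable: {:.1f} GiB, "
        "SwapTotal: {:.1f} GiB".format(
            cpu_count, gib("MemTotal"), gib("MemAvailable"), gib("SwapTotal")
        )
    )


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(describe_resources(f.read(), os.cpu_count()))
```

Pasting the printed line into the thread covers both numbers asked for, and the swap figure shows at a glance whether swap is enabled.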

kevintrannz commented 8 years ago

Hi Dima, I'm trying to run clusterdock with the following Docker version:

Client:
 Version: 1.12.1
 API version: 1.24
 Go version: go1.6.3
 Git commit: 23cf638
 Built: Thu Aug 18 05:33:38 2016
 OS/Arch: linux/amd64

Server:
 Version: 1.12.1
 API version: 1.24
 Go version: go1.6.3
 Git commit: 23cf638
 Built: Thu Aug 18 05:33:38 2016
 OS/Arch: linux/amd64

Thanks, Kevin.

dimaspivak commented 8 years ago

Hm, which Linux distribution are you running? And is this in a VM or directly on the host?

kevintrannz commented 8 years ago

clusterdock is running natively on Ubuntu 16.04 LTS (64-bit).

dimaspivak commented 8 years ago

Hm, interesting. The error you're seeing comes from Cloudera Manager, which clusterdock calls. The simplest way to get to the bottom of it is to log into CM at the hostname:port printed to the console during startup, look at the recently run commands, and see what the logs say about why they failed. Wanna give this a shot and report back? FWIW, errors like this tend to suggest insufficient available system resources.
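
If the startup output has scrolled away or was saved to a file, the CM address can be recovered from it. The following is a rough sketch that assumes the log contains the "CM server is now accessible at ..." line exactly as shown in the output earlier in this thread; find_cm_url is a hypothetical helper, not something clusterdock provides.

```python
import re

# Matches the line clusterdock logged in the output quoted in this thread.
CM_URL_PATTERN = re.compile(r"CM server is now accessible at (https?://\S+)")


def find_cm_url(log_text):
    """Return the first CM URL found in the log text, or None if absent."""
    match = CM_URL_PATTERN.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "INFO:clusterdock.topologies.cdh.actions:CM server is now accessible "
        "at http://moby:32769\n"
        "INFO:clusterdock.topologies.cdh.cm:Detected CM API v13.\n"
    )
    print(find_cm_url(sample))
```

Opening the returned URL in a browser leads to the CM login page, from which the recently run commands and their logs can be inspected.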

amarouni commented 8 years ago

@kevintvh @dimaspivak I had the same issue with the exact same setup (Ubuntu 16.04, ...) and, as @dimaspivak suggested, it was due to a lack of system resources. I found two quick workarounds: start only the required services, or enable swap and start the whole thing. HTH.