ConPaaS-team / conpaas

ConPaaS: integrated runtime environment for elastic cloud applications
http://www.conpaas.eu
BSD 3-Clause "New" or "Revised" License

Hadoop processes fail #49

Closed (anaion closed this issue 10 years ago)

anaion commented 10 years ago

None of the Hadoop processes seem to start successfully. The agent's log:

2014-07-10 09:34:22,833 INFO conpaas.core.agent 'mapreduce' agent started (uid=1, sid=1)
2014-07-10 09:34:22,843 INFO conpaas.core.ipop Not starting a VPN: IPOP_IP_ADDRESS not found
2014-07-10 09:34:25,021 INFO conpaas.core.agent Ganglia started successfully
2014-07-10 09:34:25,036 INFO conpaas.core.agent Init done: first_node=true, mgmt_server=
2014-07-10 09:34:28,807 INFO conpaas.core.agent called startup with "true" {u'ip': u'172.16.0.130', u'private_ip': u'172.16.0.5'}
2014-07-10 09:34:28,810 DEBUG conpaas.core.agent called _write_config
2014-07-10 09:34:30,135 INFO conpaas.core.agent Hadoop configuration written.
2014-07-10 09:34:59,890 INFO conpaas.core.agent Formatted namenode: ; None
2014-07-10 09:35:12,423 INFO conpaas.core.agent Started namenode: Starting Hadoop namenode daemon: starting namenode, logging to /usr/lib/hadoop-0.20/logs/hadoop-hadoop-namenode-server-b0ded817-f1ad-4312-9dc4-1bce14a8c938.novalocal.out
ERROR. Could not start Hadoop namenode daemon
; None
2014-07-10 09:35:32,575 INFO conpaas.core.agent Started secondarynamenode: Starting Hadoop secondarynamenode daemon: starting secondarynamenode, logging to /usr/lib/hadoop-0.20/logs/hadoop-hadoop-secondarynamenode-server-b0ded817-f1ad-4312-9dc4-1bce14a8c938.novalocal.out
ERROR. Could not start Hadoop secondarynamenode daemon
; None
2014-07-10 09:37:06,133 INFO conpaas.core.agent set hdfs writable: ; None
2014-07-10 09:37:20,047 INFO conpaas.core.agent Started jobtracker: Starting Hadoop jobtracker daemon: starting jobtracker, logging to /usr/lib/hadoop-0.20/logs/hadoop-hadoop-jobtracker-server-b0ded817-f1ad-4312-9dc4-1bce14a8c938.novalocal.out
ERROR. Could not start Hadoop jobtracker daemon
; None
2014-07-10 09:37:44,008 INFO conpaas.core.agent Started hue: Starting Hue for Hadoop : hue.
; None
2014-07-10 09:38:08,424 INFO conpaas.core.agent Started datanode: Starting Hadoop datanode daemon: starting datanode, logging to /usr/lib/hadoop-0.20/logs/hadoop-hadoop-datanode-server-b0ded817-f1ad-4312-9dc4-1bce14a8c938.novalocal.out
ERROR. Could not start Hadoop datanode daemon
; None
2014-07-10 09:38:42,599 INFO conpaas.core.agent Started tasktracker: Starting Hadoop tasktracker daemon: starting tasktracker, logging to /usr/lib/hadoop-0.20/logs/hadoop-hadoop-tasktracker-server-b0ded817-f1ad-4312-9dc4-1bce14a8c938.novalocal.out
ERROR. Could not start Hadoop tasktracker daemon
; None
2014-07-10 09:38:42,701 INFO conpaas.core.agent Agent is running
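
Note that the agent logs each "Started ..." message at INFO level even when the init script reports a failure on the following line, so the failures are easy to miss. A minimal sketch for listing the failed daemons from such a log, assuming the failure lines always have the form shown above; `failed_daemons` is a hypothetical helper, not part of ConPaaS:

```python
import re

# Matches the init-script failure line as it appears in the agent log above.
# Assumption: the wording is always "ERROR. Could not start Hadoop <name> daemon".
FAILURE_RE = re.compile(r"^ERROR\. Could not start Hadoop (\w+) daemon$")


def failed_daemons(log_text):
    """Return the names of daemons reported as failed, in log order."""
    failed = []
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            failed.append(match.group(1))
    return failed


if __name__ == "__main__":
    # A few lines copied from the log above.
    sample = "\n".join([
        "ERROR. Could not start Hadoop namenode daemon",
        "; None",
        "2014-07-10 09:37:44,008 INFO conpaas.core.agent Started hue: Starting Hue for Hadoop : hue.",
        "ERROR. Could not start Hadoop datanode daemon",
    ])
    print(failed_daemons(sample))  # ['namenode', 'datanode']
```

Run against the full log above, this would flag every Hadoop daemon except Hue.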

All the processes fail with the same error:

2014-07-10 10:24:54,660 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mapred (auth:SIMPLE) cause:java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

Ana

gpierre42 commented 10 years ago

@tschuett: could you please take a look at this? Thanks!

tcrivat commented 10 years ago

I was never able to reproduce this issue with the current ConPaaS version. The Hadoop service starts fine on both OpenNebula and Amazon EC2, and adding new nodes works as well (I added as many as 3-4 additional nodes). I tested by accessing the Hadoop Distributed File System, which works.

The only problem is on the Nutshell. Creating and starting the service works (provided that enough memory is allocated to the Nutshell VM, in this case 5 GB, as discussed in #80). However, starting a second agent fails: the agent starts, but the service remains in the ADAPTING state. This is a separate issue, unrelated to the one described above.

gpierre42 commented 10 years ago

In that case, I propose closing the issue.