GoogleCloudDataproc / bdutil

[DEPRECATED] Script used to manage Hadoop and Spark instances on Google Compute Engine
https://cloud.google.com/dataproc
Apache License 2.0

Deploy HDP 2.3 with bdutil #36

Open rmetzger opened 9 years ago

rmetzger commented 9 years ago

Hi, I'm trying to deploy HDP 2.3 with bdutil.

I've set these configuration values

AMBARI_REPO="http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.0.1/ambari.repo"
AMBARI_STACK_VERSION='2.3'

in ambari.conf.
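For reference, this is the kind of invocation that picks those overrides up; a minimal sketch assuming the stock HDP extension layout in this repo (treat the exact env file path, platforms/hdp/ambari_env.sh, as an assumption):

# Hypothetical deploy invocation: -e layers the HDP/Ambari env file
# (and with it the ambari.conf settings) on top of bdutil's defaults.
./bdutil -e platforms/hdp/ambari_env.sh deploy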

The deployment fails with:

Mon Jun 29 10:26:52 CEST 2015: Invoking on master: ./install-ambari-components.sh
.Mon Jun 29 10:26:53 CEST 2015: Waiting on async 'ssh' jobs to finish. Might take a while...
Mon Jun 29 10:26:56 CEST 2015: Exited 1 : gcloud --project=XXXXXXX --quiet --verbosity=info compute ssh hdp22-m --command=sudo su -l -c "cd ${PWD} && ./install-ambari-components.sh" 2>>install-ambari-components_deploy.stderr 1>>install-ambari-components_deploy.stdout --ssh-flag=-tt --ssh-flag=-oServerAliveInterval=60 --ssh-flag=-oServerAliveCountMax=3 --ssh-flag=-oConnectTimeout=30 --zone=europe-west1-d
Mon Jun 29 10:26:56 CEST 2015: Fetching on-VM logs from hdp22-m
INFO: Refreshing access_token
Warning: Permanently added '130.211.X.X' (RSA) to the list of known hosts.
Mon Jun 29 10:26:57 CEST 2015: Command failed: wait ${SUBPROC} on line 311.
Mon Jun 29 10:26:57 CEST 2015: Exit code of failed command: 1
Mon Jun 29 10:26:57 CEST 2015: Detailed debug info available in file: /tmp/bdutil-20150629-102300-GFV/debuginfo.txt
Mon Jun 29 10:26:57 CEST 2015: Check console output for error messages and/or retry your command.

/tmp/bdutil-20150629-102300-GFV/debuginfo.txt contains:

hdp22-m:        ==> install-ambari-components_deploy.stderr <==
hdp22-m:        Traceback (most recent call last):
hdp22-m:          File "<string>", line 1, in <module>
hdp22-m:        ImportError: No module named argparse
hdp22-m:        Traceback (most recent call last):
hdp22-m:          File "./create_blueprint.py", line 122, in <module>
hdp22-m:            main()
hdp22-m:          File "./create_blueprint.py", line 118, in main
hdp22-m:            args.custom_configuraton)
hdp22-m:          File "./create_blueprint.py", line 76, in create_blueprints
hdp22-m:            configuration_recommendation = json.load(conf_recommendation_file)
hdp22-m:          File "/usr/lib64/python2.6/json/__init__.py", line 267, in load
hdp22-m:            parse_constant=parse_constant, **kw)
hdp22-m:          File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
hdp22-m:            return _default_decoder.decode(s)
hdp22-m:          File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
hdp22-m:            obj, end = self.raw_decode(s, idx=_w(s, 0).end())
hdp22-m:          File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
hdp22-m:            raise ValueError("No JSON object could be decoded")
hdp22-m:        ValueError: No JSON object could be decoded
hdp22-m:        .
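The first traceback points at the stock Python on the master VM: argparse only joined the standard library in Python 2.7, and the CentOS 6 image here ships Python 2.6. The second traceback looks like a knock-on effect, with create_blueprint.py reading an empty or invalid recommendation file after the earlier call failed. A minimal workaround sketch, assuming a CentOS 6 master where the python-argparse package is available (base or EPEL repositories):

# Run on the master VM before retrying; the package name assumes CentOS 6.
sudo yum install -y python-argparse
python -c 'import argparse'   # sanity check: should exit silently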

Does the hdp module in bdutil support HDP 2.3?

desaiak commented 9 years ago

Were you able to get 2.3 running using bdutil? I tried going the Ambari 2.1 / HDP 2.3 route, and it worked with some manual work at the end.

Changed the following in ambari.conf:

AMBARI_VERSION='2.1.0-1409'
AMBARI_REPO="http://s3.amazonaws.com/dev.hortonworks.com/ambari/centos6/2.x/BUILDS/${AMBARI_VERSION}/ambaribn.repo"
AMBARI_STACK='HDP'
AMBARI_STACK_VERSION='2.3'

The install failed with:

Sun Jul 19 14:57:08 EDT 2015: Exited 1 : gcloud --project=iron-potion-771 --quiet --verbosity=info compute ssh hadoop-m --command=sudo su -l -c "cd ${PWD} && ./update-ambari-config.sh" 2>>update-ambari-config_deploy.stderr 1>>update-ambari-config_deploy.stdout --ssh-flag=-tt --ssh-flag=-oServerAliveInterval=60 --ssh-flag=-oServerAliveCountMax=3 --ssh-flag=-oConnectTimeout=30 --zone=us-central1-a
Sun Jul 19 14:57:08 EDT 2015: Fetching on-VM logs from hadoop-m
Warning: Permanently added '104.197.109.200' (RSA) to the list of known hosts.
Sun Jul 19 14:57:11 EDT 2015: Command failed: wait ${SUBPROC} on line 326.
Sun Jul 19 14:57:11 EDT 2015: Exit code of failed command: 1
Sun Jul 19 14:57:11 EDT 2015: Detailed debug info available in file: /tmp/bdutil-20150719-143917-aAi/debuginfo.txt
Sun Jul 19 14:57:11 EDT 2015: Check console output for error messages and/or retry your command.

The error file ended with this:

hadoop-m:        ambari_wait status: INPROGRESS
hadoop-m:        ambari_wait status: INPROGRESS
hadoop-m:        ambari_wait status: INPROGRESS
hadoop-m:        ambari_wait status: INPROGRESS
hadoop-m:        -test: Fatal internal error
hadoop-m:        java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
hadoop-m:                at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
hadoop-m:                at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2638)
hadoop-m:                at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
hadoop-m:                at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
hadoop-m:                at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
hadoop-m:                at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
hadoop-m:                at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
hadoop-m:                at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
hadoop-m:                at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
hadoop-m:                at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
hadoop-m:                at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
hadoop-m:                at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
hadoop-m:                at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
hadoop-m:                at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
hadoop-m:                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
hadoop-m:                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
hadoop-m:                at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
hadoop-m:        Caused by: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
hadoop-m:                at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
hadoop-m:                at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
hadoop-m:                ... 16 more
hadoop-m:        Error: There was a problem accessing your configuration bucket using the GCS
hadoop-m:        connector. Check configuration files. Also make sure have the GCS JSON API
hadoop-m:        enabled as described at https://developers.google.com/storage/docs/json_api/.
hadoop-m:        .

However, the JSON API was fine; the real problem was that hadoop-env.sh was missing the classpath update that adds the location of the GCS connector jar.
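For anyone debugging the same ClassNotFoundException: the gs:// scheme is mapped to the connector class via the fs.gs.impl property in core-site.xml, and that class then has to be reachable on the Hadoop classpath. A quick check, with paths that are typical HDP/bdutil defaults rather than guarantees:

# Both paths are assumptions; adjust to your cluster layout.
grep -B1 -A2 'fs.gs.impl' /etc/hadoop/conf/core-site.xml
ls /usr/local/lib/hadoop/lib/gcs-connector-*.jar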

Added the line below to the hadoop-env template in the Ambari config UI:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/lib/hadoop/lib/gcs-connector-1.4.1-hadoop2.jar
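After restarting the affected services from Ambari, the jar should show up on the runtime classpath; one way to confirm:

# Should print the gcs-connector jar if the hadoop-env change was picked up.
hadoop classpath | tr ':' '\n' | grep -i gcs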

Seems like everything is working but I guess I might need some more storage soon :)

hadoop fs -df -h gs://hadoop1213
15/07/19 19:48:22 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 1.4.1-hadoop2
Filesystem        Size  Used  Available  Use%
gs://hadoop1213/  8.0 E    0      8.0 E    0%

lkoankit commented 9 years ago

Were you able to run HDP 2.3 with the bdutil utility?

Can you please share the Ambari configuration to do so?