This is a BOSH release for building Big Data infrastructures with Apache Spark. It builds on the Insightfactory BOSH release to provide a standalone release for Apache Spark.
Apache Spark 1.6.0 with HDFS (Hadoop 2.6) - spark-boshrelease-7.yml
Apache Spark 1.5.2 with HDFS (Hadoop 2.6) - spark-boshrelease-6.yml
Apache Spark 1.4.1 with HDFS (Hadoop 2.6) <= spark-boshrelease-6.yml
Take a look at the manifests directory for sample deployment manifests. Edit the required variables, which are marked with ERB tags.
To build:
git clone https://github.com/murraju/spark-boshrelease
cd spark-boshrelease
bosh create release
bosh upload release
bosh deployment sample_manifest.yml
bosh -n deploy
Run bosh vms to list VMs and IPs.
Create a config/private.yml file with the following contents:
---
blobstore:
  s3:
    access_key_id: ACCESS
    secret_access_key: PRIVATE
    bucket_name: BUCKET
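As a rough sketch, the same file could be generated from environment variables so that credentials are not typed into the file by hand. The variable names S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY and S3_BUCKET_NAME are assumptions made for this example and are not part of this release. The layout mirrors the contents shown above, and the defaults are the same placeholders.

```shell
#!/bin/sh
# Sketch only: writes config/private.yml in the current directory.
# The S3_* variable names are hypothetical; unset variables fall back to the
# placeholder values used in this README.
set -eu

access_key="${S3_ACCESS_KEY_ID:-ACCESS}"
secret_key="${S3_SECRET_ACCESS_KEY:-PRIVATE}"
bucket="${S3_BUCKET_NAME:-BUCKET}"

mkdir -p config

# Unquoted heredoc delimiter so the variables above are expanded.
cat > config/private.yml <<EOF
---
blobstore:
  s3:
    access_key_id: ${access_key}
    secret_access_key: ${secret_key}
    bucket_name: ${bucket}
EOF

# The file holds credentials, so restrict it to the current user.
chmod 600 config/private.yml
echo "wrote config/private.yml"
```

Run it from the root of the release directory. Since the file contains secrets, it is usually best kept out of version control.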
bosh create release
# To test
git commit -m "updated spark release"
bosh create release --final
git commit -m "creating vN release"
git tag vN
git push origin master --tags
a. JDK 6 for PySpark
b. JDK 8 testing
This is a development release and a work in progress. Please log issues. This release has been tested against the AWS EC2 and OpenStack BOSH CPIs (Cloud Provider Interfaces).
© 2014, Murali Raju murali.raju@appliv.com @murraju