murraju / spark-boshrelease

Cloud Foundry BOSH release for Apache Spark
Apache License 2.0

BOSH release for Apache Spark

This is a BOSH release for building Big Data infrastructures with Apache Spark. It builds on the Insightfactory BOSH release to provide a standalone release for Apache Spark.

Supported Versions

Apache Spark 1.6.0 with HDFS (Hadoop 2.6) - spark-boshrelease-7.yml
Apache Spark 1.5.2 with HDFS (Hadoop 2.6) - spark-boshrelease-6.yml
Apache Spark 1.4.1 with HDFS (Hadoop 2.6) <= spark-boshrelease-6.yml

Dependencies

Zookeeper Bosh release

Usage

See the manifests directory for sample deployment manifests. Edit the required variables, which are marked with ERB tags.

To build and deploy:

  1. Run git clone https://github.com/murraju/spark-boshrelease
  2. cd spark-boshrelease
  3. Run bosh create release
  4. Run bosh upload release
  5. Run bosh deployment sample_manifest.yml
  6. Run bosh -n deploy
  7. Run bosh vms to list VMs and IPs.
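
The steps above can be collected into a small script. This is a minimal sketch, assuming the same BOSH CLI commands used in this README are on the PATH and that a director is already targeted. The `run` helper, the `build_and_deploy` function, and the `DRY_RUN` variable are hypothetical names introduced for illustration, and `sample_manifest.yml` stands for whichever manifest you edited.

```shell
#!/bin/sh
# Sketch only: wraps the build-and-deploy steps listed above.
# DRY_RUN, run and build_and_deploy are illustrative names, not part of BOSH.
set -eu

# 1 = only print each command, 0 = print and execute it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Echo the command so the log shows what was (or would be) executed.
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

build_and_deploy() {
  # First argument: path to the edited deployment manifest (assumed name by default).
  manifest="${1:-sample_manifest.yml}"

  run git clone https://github.com/murraju/spark-boshrelease
  run cd spark-boshrelease
  run bosh create release
  run bosh upload release
  run bosh deployment "$manifest"
  run bosh -n deploy
  run bosh vms
}
```

Calling `build_and_deploy sample_manifest.yml` with the default `DRY_RUN=1` only prints the seven commands, which makes it easy to review the sequence; setting `DRY_RUN=0` runs them in order and stops at the first failure because of `set -e`.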

Create new final release

Create a config/private.yml file with the following contents:

---
blobstore:
  s3:
    access_key_id:     ACCESS
    secret_access_key: PRIVATE
    bucket_name: BUCKET

bosh create release
# To test
git commit -m "updated spark release"
bosh create release --final
git commit -m "creating vN release"
git tag vN
git push origin master --tags
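
To avoid hand-editing credentials, config/private.yml could be generated from environment variables. The following is a minimal sketch under stated assumptions: `S3_ACCESS_KEY_ID`, `S3_SECRET_ACCESS_KEY`, and `S3_BUCKET` are hypothetical variable names (they do not come from BOSH or from this release), `write_private_yml` is an illustrative function name, and the values are written unquoted, so they are assumed not to contain characters that are special in YAML. The bucket key is spelled `bucket_name` here.

```shell
#!/bin/sh
# Sketch only: writes config/private.yml from environment variables.
# S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY and S3_BUCKET are illustrative names.
set -eu

write_private_yml() {
  # Fail early with a clear message if any value is missing or empty.
  : "${S3_ACCESS_KEY_ID:?set S3_ACCESS_KEY_ID}"
  : "${S3_SECRET_ACCESS_KEY:?set S3_SECRET_ACCESS_KEY}"
  : "${S3_BUCKET:?set S3_BUCKET}"

  mkdir -p config

  # The heredoc body and its terminator must start at column 0.
  cat > config/private.yml <<EOF
---
blobstore:
  s3:
    access_key_id: ${S3_ACCESS_KEY_ID}
    secret_access_key: ${S3_SECRET_ACCESS_KEY}
    bucket_name: ${S3_BUCKET}
EOF

  # The file holds secrets, so restrict it to the current user.
  chmod 600 config/private.yml
}
```

Run `write_private_yml` from the root of the release directory after setting the three variables. Because the file contains secrets, it should stay out of version control.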

TODO

a. JDK 6 for PySpark
b. JDK 8 testing

Disclaimer

This is a development release and a work in progress. Please log issues. This release has been tested against the AWS EC2 and OpenStack BOSH CPIs (Cloud Provider Interfaces).

Copyright and Credits

© 2014, Murali Raju murali.raju@appliv.com @murraju