KaveIO / AmbariKave

A small extension of Ambari to support KAVE services installed into a cluster
http://kave.io

AmbariKave

This repository has three parts:

We also endeavor to provide extensive wiki documentation.

Relationship to Ambari

AmbariKave extends Ambari by adding some more services. It does this by adding a stack to Ambari. Ambari is nicely extensible: adding a stack does not interfere with older stacks, nor can it interfere with already running services.

This means there are two general ways to install these services

Installation (on the 'ambari node' of your cluster, or one large machine)

If you are looking for the extensive documentation, including descriptions of disk/cpu/ram requirements, please look at the installation wiki

(NB: the repository server uses a semi-private password only as a means of avoiding robots and reducing DoS attacks. This password is intended to be widely known and is used here as an extension of the URL.)

Then to provision your cluster go to: http://YOUR_AMBARI_NODE:8080 or deploy using a blueprint, see https://cwiki.apache.org/confluence/display/AMBARI/Blueprints

Installation (patch) over existing Ambari

Update our patches

If you have the head checked out from git, you can update with:

Connect to your ambari/admin node

sudo where/I/checked/out/ambari/dev/pull-update.sh

pull-update also respects git branches, which can be given as a command-line argument, and it is linked into the way we do automated deployment and testing.

To update between released versions, stop the Ambari server and then install the new version over the old version. Installing a new version of the stack will not trigger an update of any running service; at present you need to update running services manually.

sudo ambari-server stop
wget http://repos:kaverepos@repos.kave.io/noarch/AmbariKave/3.5-Beta/ambarikave-installer-3.5-Beta.sh
sudo bash ambarikave-installer-3.5-Beta.sh

(NB: the repository server uses a semi-private password only as a means of avoiding robots and reducing DoS attacks. This password is intended to be widely known and is used here as an extension of the URL.)

Installation of a full cluster

If you are looking for the extensive documentation, including descriptions of disk/cpu/ram requirements, please look at the installation wiki

If you have taken the released version, go to http://YOUR_AMBARI_NODE:8080 or deploy using a blueprint, see https://cwiki.apache.org/confluence/display/AMBARI/Blueprints. If you have git access and are working from the git version, see the wiki.

We strongly recommend installing from a blueprint, but you must first carefully design the blueprint and/or test it on a separate test resource. The web interface is great for one-time custom installations; a blueprint is good for pre-tested, redeployable installations.
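As a rough illustration of a blueprint deployment, the sketch below registers a blueprint and then creates a cluster from it through the Ambari REST API. It is only a sketch under stated assumptions: the file names blueprint.json and cluster-template.json, the names my-blueprint and my-cluster, and the admin/admin credentials are placeholders, not values taken from this repository. By default it only prints the curl commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: register a blueprint and create a cluster from it via the Ambari REST API.
# All names and credentials below are placeholders (assumptions); adapt them to your cluster.
AMBARI_URL="${AMBARI_URL:-http://YOUR_AMBARI_NODE:8080}"
AMBARI_USER="${AMBARI_USER:-admin}"
AMBARI_PASS="${AMBARI_PASS:-admin}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the curl command, 0 = actually send it

# ambari_post API_PATH JSON_FILE
# POSTs the JSON file to $AMBARI_URL/api/v1/API_PATH.
ambari_post() {
  local api_path="$1" json_file="$2"
  local cmd=(curl -sS -u "$AMBARI_USER:$AMBARI_PASS"
             -H "X-Requested-By: ambari"
             -X POST -d "@$json_file"
             "$AMBARI_URL/api/v1/$api_path")
  if [ "$DRY_RUN" = "1" ]; then
    # Printed for inspection only; shell quoting is not preserved in this output.
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Step 1: register the blueprint under a name of your choice.
ambari_post "blueprints/my-blueprint" blueprint.json
# Step 2: create a cluster from a template that refers to that blueprint.
ambari_post "clusters/my-cluster" cluster-template.json
```

Run it once with the default DRY_RUN=1 to check the requests, then with DRY_RUN=0 against a test cluster before using it anywhere important.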

Installation Kerberization with FreeIPA

FreeIPA can provide all necessary keytabs for your kerberized cluster, using the kerberos.csv given by the Ambari wizard. Be careful: when the wizard offers the option to download the csv, you need to pause and carry out some steps on the command line before continuing.

You can follow the tutorial here: https://youtu.be/hL1yiMlgg0E


And/or follow these steps:

The createkeytabs.py script creates all necessary service and user principals and any missing local users or groups, creates temporary keytabs on the ambari node, copies them to the required places on the nodes, removes the local intermediate files, and tests that the new keytabs work for those services.

Deployment tools

See the deployment subdirectory, or the deployment tarball kept separately

Downloading deployment tools

yum -y install wget curl tar zip unzip gzip python
wget http://repos:kaverepos@repos.kave.io/noarch/AmbariKave/3.5-Beta/ambarikave-deployment-3.5-Beta.tar.gz
tar -xzf ambarikave-deployment-3.5-Beta.tar.gz

Or download the head from github. See the github readme on the deployment tools, the help written for each tool, or better yet, contact us if you'd like some advice on how to use anything here. Deployment readme

Internet during installation, firewalls and nearside cache/mirror options

Ideally all of your nodes will have access to the internet during installation in order to download software.

If this is not the case, you may be able to implement a near-side cache/mirror of all required software. This is not very easy, but once it is done you can keep it for later installations.

Setting up a local near-side cache for the KAVE tool stack itself is quite easy. First, either copy the entire repository website to your own internal Apache server, or copy the contents of the directories to your own shared directory visible from every node.

mkdir -p /my/shared/dir
cd /my/shared/dir
wget -r http://repos.kave.io/

Then create a /etc/kave/mirror file on each node with the new top-level directory to try first before looking for our website:

echo "/my/shared/dir" >> /etc/kave/mirror
echo "http://my/local/apache/mirror" >> /etc/kave/mirror

So long as the directory structure of the near-side cache is identical to our website, you can drop or replace any packages in it that you will never install, and update it as our repo server updates.
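To make the lookup order concrete, the sketch below lists the locations that would be tried for one file: each entry in the mirror file first, in order, then our website. This is an illustration under assumptions, not the code KAVE itself uses; the function name candidate_locations is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: illustrate the "mirror first, website last" lookup order.
# candidate_locations is a hypothetical helper, not part of AmbariKave.
KAVE_REPO_URL="http://repos.kave.io"   # default website, tried last

# candidate_locations REL_PATH [MIRROR_FILE]
# Prints one candidate location per line, in the order they would be tried.
candidate_locations() {
  local rel_path="$1" mirror_file="${2:-/etc/kave/mirror}" entry
  if [ -r "$mirror_file" ]; then
    # Read each non-empty line of the mirror file (also a last line without newline).
    while IFS= read -r entry || [ -n "$entry" ]; do
      [ -z "$entry" ] && continue
      printf '%s/%s\n' "${entry%/}" "$rel_path"
    done < "$mirror_file"
  fi
  printf '%s/%s\n' "$KAVE_REPO_URL" "$rel_path"
}

candidate_locations "noarch/AmbariKave/3.5-Beta/ambarikave-installer-3.5-Beta.sh"
```

Because each mirror entry is simply prefixed to the same relative path, this only works when the mirror keeps the same directory layout as the website.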

VPN Access

OpenVPN can be installed and set up on the desired node(s) by running the command below:

wget https://git.io/vpn -O openvpn-install.sh && bash openvpn-install.sh

This is an interactive OpenVPN installation and administration tool.

Versioning System

Read more here about kave versioning: https://github.com/KaveIO/AmbariKave/wiki/kave-versioning

Relationship with Ambari Stacks

KAVE extends an HDP stack, adding additional services. See the versioning diagram on our wiki for details.

The HDP stack number looks like X.Y, with a major and a minor version. KAVE also has a W.Z versioning scheme, but this is not 100% coupled to the HDP stack.

KAVE tags

A KAVE official version tag appears like:

The tag is split into four parts:

What constitutes a major version change?

A new major version is started whenever changes of the following type are made:

KAVE stack in Ambari

We currently name our stack within ambari to reflect both the version of the HDP stack we depend on, and the installed version of the KAVE.

This is the stack name you will see in blueprints and in the ambari web interface. In older KAVE versions we used a different approach, not including the KAVE stack tag.

Configuring Kave provided services over SSL