vyshane / cassandra-kubernetes

Easily launch a Cassandra cluster on Kubernetes
Apache License 2.0

Multi-node Cassandra Cluster on Kubernetes

Creating a Cluster

You will need to bring your own Kubernetes cluster. A quick and easy way to set up Kubernetes locally is via Docker Compose. Once you have Kubernetes up and running:

./start-cassandra.sh

This will create a Kubernetes pod containing a single Cassandra node. You can use the cassandra-status.sh convenience script to see that the node comes up:

./cassandra-status.sh 

  C* Node      Kubernetes Pod
  -------      --------------
               NAME              READY     STATUS    RESTARTS   AGE
  Up|Normal    cassandra-kxa18   1/1       Running   0          1m

Scaling the Cluster

To launch more Cassandra nodes and have them join the cluster, simply scale the Cassandra replication controller:

kubectl scale rc cassandra --replicas=2

A new pod is created...

./cassandra-status.sh

  C* Node      Kubernetes Pod
  -------      --------------
               NAME              READY     STATUS                                                     RESTARTS   AGE
               cassandra-cnvzm   0/1       Image: vyshane/cassandra is ready, container is creating   0          8s
  Up|Normal    cassandra-kxa18   1/1       Running                                                    0          6m

... and it automatically joins the cluster.

./cassandra-status.sh 

  C* Node      Kubernetes Pod
  -------      --------------
               NAME              READY     STATUS    RESTARTS   AGE
  Up|Joining   cassandra-cnvzm   1/1       Running   0          29s
  Up|Normal    cassandra-kxa18   1/1       Running   0          7m

Once the new node has finished joining, its state changes from Up|Joining to Up|Normal:

./cassandra-status.sh

  C* Node      Kubernetes Pod
  -------      --------------
               NAME              READY     STATUS    RESTARTS   AGE
  Up|Normal    cassandra-cnvzm   1/1       Running   0          1m
  Up|Normal    cassandra-kxa18   1/1       Running   0          8m

Connecting to Cassandra

You can connect to Cassandra from any pod in the Kubernetes cluster via the IP address of the Cassandra service. To obtain the IP address:

kubectl describe svc cassandra

If the Kubernetes DNS addon is active, you can also connect to the service through the cassandra hostname.
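
As a rough sketch of both approaches: the first function below prints only the service's cluster IP instead of the full describe output, and the second starts a throwaway client pod that connects through the cassandra hostname. The pod name cqlsh-client and the use of the official cassandra image as a client are assumptions for illustration, and the second function only works if the DNS addon is active.

```shell
# Print just the cluster IP of the cassandra service.
cassandra_service_ip() {
  kubectl get svc cassandra -o jsonpath='{.spec.clusterIP}'
}

# Start a temporary pod and open cqlsh against the service hostname.
# "cqlsh-client" is a hypothetical pod name; the pod is removed on exit.
open_cqlsh() {
  kubectl run cqlsh-client --rm -it --restart=Never --image=cassandra -- cqlsh cassandra
}
```

Without the DNS addon, pass the address printed by cassandra_service_ip to cqlsh in place of the hostname.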

Configuration Options

The following environment variables can be configured in the Cassandra replication controller definition:

env:
  - name: CASSANDRA_CLUSTER_NAME
    value: Cassandra
  - name: CASSANDRA_DC
    value: DC1
  - name: CASSANDRA_RACK
    value: Kubernetes Cluster
  - name: CASSANDRA_ENDPOINT_SNITCH
    value: GossipingPropertyFileSnitch
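
To check that a running node picked up these values, one option is to ask nodetool inside the pod. This is a small sketch; the function name is hypothetical, and the pod name you pass in is whatever cassandra-status.sh reports for your cluster.

```shell
# Show the cluster name and snitch as seen by the node running in pod $1.
show_cluster_settings() {
  kubectl exec "$1" -- nodetool describecluster
}
```

For example, show_cluster_settings cassandra-kxa18 should list the cluster name and snitch. The datacenter and rack appear in the output of nodetool status, which you can run through kubectl exec in the same way.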

Alternatives?

The Kubernetes project has a Cassandra example that uses a custom seed provider for seed discovery. The example makes use of a Cassandra Docker image from gcr.io/google_containers.

Why Did You Create this Project?

I wanted a solution based on the official Cassandra Docker image. My Docker image extends the official Cassandra image with the addition of dnsutils (for the dig command) and a custom entrypoint that configures seed nodes for the container. Seed node IP addresses are provided via DNS by a headless Kubernetes service.
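
The seed discovery idea can be sketched roughly as follows. This is not the actual entrypoint from my image, only an illustration of the technique: the headless service name cassandra-peers and the function names are hypothetical, and the sketch assumes the seed list is handed to Cassandra through the CASSANDRA_SEEDS variable that the official image's entrypoint reads.

```shell
# Join newline-separated IP addresses from stdin into one comma-separated line.
join_seeds() {
  paste -sd, -
}

# Resolve the headless service name in $1 and print its pod IPs as a seed list.
# A headless service has no cluster IP, so DNS returns one A record per pod.
discover_seeds() {
  dig +short "$1" | join_seeds
}
```

An entrypoint along these lines would then run something like export CASSANDRA_SEEDS="$(discover_seeds cassandra-peers)" before starting Cassandra. The very first node finds no peers yet, so a real script also needs to handle an empty result.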

Next Steps