Cassandra is a partitioned row store. Rows are organized into tables with a required primary key.
http://wiki.apache.org/cassandra/Partitioners[Partitioning] means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster.
http://wiki.apache.org/cassandra/DataModel[Row store] means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
For more information, see http://cassandra.apache.org/[the Apache Cassandra web site].
. Java >= 1.7 (OpenJDK and Oracle JVMs have been tested)
. Python 2.7 (for cqlsh)
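One way to confirm these prerequisites before continuing is a small shell check. This is only a sketch: the helper name check_cmd is made up for illustration, and it assumes the interpreters are installed under the command names java and python, which may differ on your system.

```shell
# check_cmd is a hypothetical helper: it reports whether a command is on PATH
# and returns non-zero when it is missing.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Print the installed versions so they can be compared with the list above.
if check_cmd java; then java -version; fi        # want 1.7 or newer
if check_cmd python; then python --version; fi   # want 2.7.x for cqlsh
```

The version numbers still have to be compared by eye; the sketch only verifies that the commands exist.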
This short guide will walk you through getting a basic one-node cluster up and running, and demonstrate some simple reads and writes.
First, we'll unpack our archive:
$ tar -zxvf apache-cassandra-$VERSION.tar.gz
$ cd apache-cassandra-$VERSION
After that, we start the server. Running the startup script with the -f argument causes Cassandra to remain in the foreground and log to standard output; it can be stopped with Ctrl-C.
$ bin/cassandra -f
Note for Windows users: to install Cassandra as a service, download http://commons.apache.org/daemon/procrun.html[Procrun], set the PRUNSRV environment variable to the full path of prunsrv (e.g., C:\procrun\prunsrv.exe), and run "bin\cassandra.bat install". Similarly, "uninstall" will remove the service.
Now let's try to read and write some data using the Cassandra Query Language:
$ bin/cqlsh
The command-line client is interactive, so if everything worked you should be sitting in front of a cqlsh> prompt, preceded by a short connection banner.
As the banner says, you can use 'help;' or '?' to see what CQL has to offer, and 'quit;' or 'exit;' when you've had enough fun. But let's try something slightly more interesting:
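As one possible example, the statements below create a keyspace and a table, insert a row, and read it back. The keyspace name schema1, the users table and its columns, and the file name quickstart.cql are illustrative assumptions. The same statements can be typed one at a time at the cqlsh> prompt, or saved to a file and run non-interactively with cqlsh's -f option, as sketched here:

```shell
# Save a few CQL statements to a scratch file (all names are illustrative).
cat > quickstart.cql <<'EOF'
CREATE KEYSPACE IF NOT EXISTS schema1
  WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE schema1;
CREATE TABLE IF NOT EXISTS users (
  user_id varchar PRIMARY KEY,
  first varchar,
  last varchar,
  age int
);
INSERT INTO users (user_id, first, last, age)
  VALUES ('jsmith', 'John', 'Smith', 42);
SELECT * FROM users;
EOF

# Run the file against the local node when cqlsh is available in this directory.
if [ -x bin/cqlsh ]; then
  bin/cqlsh -f quickstart.cql
else
  echo "bin/cqlsh not found; run this from the unpacked Cassandra directory" >&2
fi
```

If the statements succeed, the final SELECT should list the single jsmith row that was just inserted.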
If your session looks similar to what's above, congrats, your single-node cluster is operational!
For more on what commands are supported by CQL, see https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile[the CQL reference]. A reasonable way to think of it is as "SQL minus joins and subqueries, plus collections."
Wondering where to go from here?