mgp / iron-cushion

A benchmark and load test for CouchDB

Iron Cushion is a benchmark and load testing tool for CouchDB, developed by Ad Hoc Labs, Inc. It proceeds in two steps: First, documents are bulk inserted using CouchDB's Bulk Document API. Second, documents are individually created, read, updated, and deleted with random ordering of operations using CouchDB's Document API. Below we refer to the former as the "bulk insert step," and the latter as the "CRUD operations step." Statistics for both steps are recorded separately and displayed afterward.

It is written in Java (version 5.0 and higher), depends only on the Netty library, and is released under the MIT license.

Command Line Flags

Either json_document_schema_filename or xml_document_schema_filename must be provided. For details on the contents of these files, see "Document Generation" below.

Bulk Insert Flags

The following flags control the bulk insert step:

For example, if num_connections is 50, num_documents_per_bulk_insert is 1000, and num_bulk_insert_operations is 20, then after the bulk insert step there will be 50 x 1,000 x 20 = 1,000,000 documents in the database.
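
The arithmetic above can be checked with a short sketch. The class and method names are hypothetical and not part of Iron Cushion; only the three flag names come from the text.

```java
// Hypothetical helper, not part of Iron Cushion: mirrors the arithmetic above.
final class BulkInsertTotals {
    /** Number of documents in the database after the bulk insert step. */
    static long totalDocuments(int numConnections,
                               int numDocumentsPerBulkInsert,
                               int numBulkInsertOperations) {
        // Each connection performs numBulkInsertOperations bulk inserts,
        // and each bulk insert carries numDocumentsPerBulkInsert documents.
        // The cast to long avoids int overflow for large flag values.
        return (long) numConnections * numDocumentsPerBulkInsert * numBulkInsertOperations;
    }

    public static void main(String[] args) {
        System.out.println(totalDocuments(50, 1000, 20)); // prints 1000000
    }
}
```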

CRUD Flags

The following flags control the CRUD operations step:

For example, if create_weight is 2, read_weight is 3, update_weight is 2, and delete_weight is 1, then 2/8 of all CRUD operations will be create operations, 3/8 of all CRUD operations will be read operations, 2/8 of all CRUD operations will be update operations, and 1/8 of all CRUD operations will be delete operations. If num_crud_operations is 10000, this equals 2,500 create operations, 3,750 read operations, 2,500 update operations, and 1,250 delete operations per connection.
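
As a minimal sketch of how the weights translate into per-connection operation counts, the following reproduces the numbers above. The class and method names are hypothetical, and the use of integer division is an assumption: how Iron Cushion itself rounds when the weights do not divide num_crud_operations evenly is not stated here.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Iron Cushion.
final class CrudOperationSplit {
    /** Returns per-connection counts in the order create, read, update, delete. */
    static long[] operationsPerConnection(long numCrudOperations,
                                          int createWeight, int readWeight,
                                          int updateWeight, int deleteWeight) {
        long totalWeight = (long) createWeight + readWeight + updateWeight + deleteWeight;
        if (totalWeight <= 0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        // Assumption: fractional results are truncated by integer division.
        return new long[] {
            numCrudOperations * createWeight / totalWeight,
            numCrudOperations * readWeight / totalWeight,
            numCrudOperations * updateWeight / totalWeight,
            numCrudOperations * deleteWeight / totalWeight,
        };
    }

    public static void main(String[] args) {
        long[] counts = operationsPerConnection(10000, 2, 3, 2, 1);
        System.out.println(Arrays.toString(counts)); // prints [2500, 3750, 2500, 1250]
    }
}
```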

Every update or delete operation requires the _rev value of a document. Such a value comes from either reading the document from the database earlier, or from creating the document earlier and recording the returned value. Therefore the sum create_weight + read_weight must be greater than or equal to delete_weight. Additionally, if update_weight is greater than 0, then create_weight + read_weight must be greater than 0. If these inequalities don't hold, the flags fail validation. Finally, if delete_weight is large enough that the number of documents to be deleted exceeds the sum of the number of documents bulk inserted and the number of documents created by CRUD operations, the flags fail validation. To remedy this, bulk insert more documents, increase create_weight, or decrease delete_weight.
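
The validation rules above can be written down as a small sketch. This is not Iron Cushion's actual validation code: the class and method names are hypothetical, and passing document totals directly to the second check is an assumption made to keep the example short.

```java
// Hypothetical sketch of the flag checks described above; not Iron Cushion's code.
final class CrudFlagValidation {
    /** Checks the two inequalities on the weights. */
    static boolean weightsAreValid(int createWeight, int readWeight,
                                   int updateWeight, int deleteWeight) {
        int revSources = createWeight + readWeight; // operations that yield a _rev value
        if (revSources < deleteWeight) {
            return false; // create_weight + read_weight must be >= delete_weight
        }
        if (updateWeight > 0 && revSources <= 0) {
            return false; // updates need at least one source of _rev values
        }
        return true;
    }

    /** Checks that no more documents are deleted than will ever exist. */
    static boolean enoughDocumentsToDelete(long documentsBulkInserted,
                                           long documentsCreated,
                                           long documentsDeleted) {
        return documentsDeleted <= documentsBulkInserted + documentsCreated;
    }

    public static void main(String[] args) {
        System.out.println(weightsAreValid(2, 3, 2, 1));            // prints true
        System.out.println(weightsAreValid(0, 1, 0, 2));            // prints false
        System.out.println(enoughDocumentsToDelete(0, 2500, 3000)); // prints false
    }
}
```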

Document Generation

Note that while CouchDB is schemaless, Iron Cushion requires a schema to serve as a template for generated documents that are inserted during the bulk insert step, or inserted or updated during the CRUD operations step. This allows the user to easily control the complexity of the generated documents. A schema can be defined using either JSON or XML, but you will likely find the former easier.

Subject to the quality of the pseudo-random number generator, generated values adhere to the following rules:

JSON

The json_document_schema_filename command line flag specifies a file containing JSON that defines a schema for documents in the database.

A new document is generated from the schema by the following rules:

All other properties of the JSON, such as array lengths, value names in objects, null values, and nested types are preserved. The advantage of using a JSON schema is that any document stored in CouchDB is a valid schema for Iron Cushion. Iron Cushion will automatically remove the special _id and _rev values from the JSON schema provided.

An example can be found in iron-cushion/iron-cushion/data/example_schema.json. Its schema is equivalent to the one in iron-cushion/iron-cushion/data/example_schema.xml.

XML

The xml_document_schema_filename command line flag specifies a file containing XML that defines a schema for documents in the database. In the future, the XML file may allow adding attributes to these tags to specify properties like minimum and maximum values for generated integers, etc.

There are seven principal tags, and the outer-most tag must be <object>:

An example can be found in iron-cushion/iron-cushion/data/example_schema.xml. Its schema is equivalent to the one in iron-cushion/iron-cushion/data/example_schema.json.

Document Updates

To update a document during the CRUD operations step, Iron Cushion regenerates and replaces a randomly chosen value from the document's top level object. The updated document is sent to CouchDB using a PUT request.
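
The update step described above might look roughly like the following sketch, which works on a document held as a java.util.Map. Everything here is hypothetical: the class, the ValueRegenerator interface standing in for Iron Cushion's value generator, and the choice to never pick _id or _rev are all assumptions. Sending the PUT request is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of the update step; not Iron Cushion's code.
final class DocumentUpdateSketch {
    /** Hypothetical stand-in for whatever produces a fresh value of the same type. */
    interface ValueRegenerator {
        Object regenerate(Object oldValue, Random random);
    }

    /** Replaces one randomly chosen top-level value in place and returns its key. */
    static String updateRandomTopLevelValue(Map<String, Object> document,
                                            ValueRegenerator regenerator,
                                            Random random) {
        List<String> candidates = new ArrayList<String>();
        for (String key : document.keySet()) {
            // Assumption: CouchDB's special fields are never chosen for regeneration.
            if (!key.equals("_id") && !key.equals("_rev")) {
                candidates.add(key);
            }
        }
        if (candidates.isEmpty()) {
            return null; // nothing to update
        }
        String chosen = candidates.get(random.nextInt(candidates.size()));
        document.put(chosen, regenerator.regenerate(document.get(chosen), random));
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, Object> document = new LinkedHashMap<String, Object>();
        document.put("_rev", "1-abc");
        document.put("boolean1", Boolean.TRUE);
        document.put("integer1", Integer.valueOf(0));
        ValueRegenerator regenerator = new ValueRegenerator() {
            public Object regenerate(Object oldValue, Random random) {
                return "regenerated"; // placeholder value for the demo
            }
        };
        String key = updateRandomTopLevelValue(document, regenerator, new Random());
        System.out.println("updated " + key + ": " + document);
    }
}
```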

Example

The following schema, found in file iron-cushion/iron-cushion/data/example_schema.json:

{
    "array1": [
        [
            0.0, 
            0.0
        ], 
        {}, 
        true,
        null
    ], 
    "boolean1": true, 
    "integer1": 0, 
    "obj1": {
        "array2": [], 
        "obj2": {
            "boolean2": true
        }
    }, 
    "string1": ""
}

Can generate the following document:

{
  "array1": [
    [
      0.8474894, 
      0.30425853
    ], 
    {}, 
    false,
    null
  ], 
  "boolean1": true, 
  "integer1": 1929847379, 
  "obj1": {
    "array2": [], 
    "obj2": {
      "boolean2": false
    }
  }, 
  "string1": "928lR8eM7DcBSgR 598A8VxzeFE2 uKTF FqiMEmxdLJmDni"
}
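
The mapping from the schema to the generated document above can be sketched as a recursive walk that keeps the structure and replaces only the leaf values. This is an illustration, not Iron Cushion's generator: the class name is hypothetical, the schema is assumed to be already parsed into java.util.Map and java.util.List values, and the value ranges and string shape (1 to 5 words of 1 to 15 alphanumeric characters) are assumptions chosen to resemble the example output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of schema-driven generation; not Iron Cushion's code.
final class SchemaGenerationSketch {
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    /** Returns a new value with the same shape as schemaValue but random leaves. */
    static Object generate(Object schemaValue, Random random) {
        if (schemaValue == null) {
            return null; // null values are preserved
        }
        if (schemaValue instanceof Map) {
            // Objects keep their value names; each value is generated recursively.
            Map<String, Object> generated = new LinkedHashMap<String, Object>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) schemaValue).entrySet()) {
                generated.put(String.valueOf(entry.getKey()), generate(entry.getValue(), random));
            }
            return generated;
        }
        if (schemaValue instanceof List) {
            // Arrays keep their length; each element is generated recursively.
            List<Object> generated = new ArrayList<Object>();
            for (Object element : (List<?>) schemaValue) {
                generated.add(generate(element, random));
            }
            return generated;
        }
        if (schemaValue instanceof Boolean) {
            return Boolean.valueOf(random.nextBoolean());
        }
        if (schemaValue instanceof Integer || schemaValue instanceof Long) {
            return Integer.valueOf(random.nextInt(Integer.MAX_VALUE)); // assumed range
        }
        if (schemaValue instanceof Number) {
            return Float.valueOf(random.nextFloat()); // assumed range [0, 1)
        }
        if (schemaValue instanceof String) {
            return randomString(random);
        }
        throw new IllegalArgumentException("unsupported schema value: " + schemaValue.getClass());
    }

    /** Assumed string shape: 1 to 5 space-separated alphanumeric words. */
    static String randomString(Random random) {
        StringBuilder builder = new StringBuilder();
        int numWords = 1 + random.nextInt(5);
        for (int w = 0; w < numWords; w++) {
            if (w > 0) {
                builder.append(' ');
            }
            int wordLength = 1 + random.nextInt(15);
            for (int i = 0; i < wordLength; i++) {
                builder.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> schema = new LinkedHashMap<String, Object>();
        schema.put("boolean1", Boolean.TRUE);
        schema.put("integer1", Integer.valueOf(0));
        schema.put("string1", "");
        System.out.println(generate(schema, new Random()));
    }
}
```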

Running the Benchmark

If you don't want to go through the hassle of running javac yourself, simply copy the files iron-cushion/iron-cushion/dist/IronCushion-0.1.jar and iron-cushion/iron-cushion/lib/netty-3.3.1.Final.jar to a directory outside of the iron-cushion project. Pull up the command line, cd into that directory, and run the following:

java -cp IronCushion-0.1.jar:netty-3.3.1.Final.jar co.adhoclabs.ironcushion.Benchmark [flags]

Where [flags] is replaced with the Iron Cushion command line flags of your choosing.

Understanding the Results

The following flags specify using 100 connections, collectively bulk inserting 2,000,000 documents, followed by performing 20,000 create operations, 20,000 read operations, 30,000 update operations, and 30,000 delete operations.

--num_connections=100
--num_documents_per_bulk_insert=1000
--num_bulk_insert_operations=20
--num_crud_operations=1000
--create_weight=2
--read_weight=2
--update_weight=3
--delete_weight=3

Running the benchmark program with these flags on my 1.83 GHz Intel Core Duo MacBook, with CouchDB running on my 2.83 GHz Intel Core 2 quad-core desktop and the two machines connected over my 100 Mbit home LAN, I get the results below.

Bulk Insert Results

BULK INSERT BENCHMARK RESULTS:
  timeTaken=249.182 secs
  totalJsonBytesSent=374,240,177 bytes
  totalJsonBytesReceived=138,823,936 bytes
  localProcessing={min=1.363 secs, max=2.906 secs, median=1.800 secs, sd=0.323 secs}
  sendData={min=9.066 secs, max=29.611 secs, median=19.002 secs, sd=4.287 secs}
  remoteProcessing={min=171.507 secs, max=214.598 secs, median=203.845 secs, sd=10.918 secs}
  receiveData={min=4.933 secs, max=23.565 secs, median=11.875 secs, sd=3.856 secs}
  remoteProcessingRate=10,003.030 docs/sec
  localInsertRate=8,676.693 docs/sec

CRUD Results

CRUD BENCHMARK RESULTS:
  timeTaken=84.654 secs
  totalJsonBytesSent=10,646,425 bytes
  totalJsonBytesReceived=10,704,882 bytes
  localProcessing={min=0.002 secs, max=0.062 secs, median=0.016 secs, sd=0.010 secs}
  sendData={min=0.000 secs, max=0.035 secs, median=0.002 secs, sd=0.005 secs}
  remoteCreateProcessing={min=20.070 secs, max=22.376 secs, median=21.113 secs, sd=0.464 secs}
  remoteReadProcessing={min=1.816 secs, max=2.566 secs, median=2.236 secs, sd=0.135 secs}
  remoteUpdateProcessing={min=29.215 secs, max=31.504 secs, median=30.425 secs, sd=0.520 secs}
  remoteDeleteProcessing={min=29.357 secs, max=31.609 secs, median=30.602 secs, sd=0.503 secs}
  remoteCreateProcessingRate=949.141 docs/sec
  remoteReadProcessingRate=9,015.862 docs/sec
  remoteUpdateProcessingRate=980.172 docs/sec
  remoteDeleteProcessingRate=980.154 docs/sec