How can you measure how well your MongoDB instance (or another database with a similar interface) performs? The usual answer is to benchmark it: use a benchmark tool to generate queries whose contents are drawn from some random distribution.
But sometimes you are not satisfied with the randomly generated queries, since you're not confident in how much these queries resemble your real workload.
The difficulty compounds when one MongoDB instance may host completely different types of databases that each have their own unique and complicated access patterns.
That is the reason we came up with Flashback, a MongoDB benchmark framework that allows us to benchmark with "real" queries. It comprises a set of scripts that fall into two categories: recording the operations (ops) that a MongoDB instance performs, and replaying the recorded ops.
The two parts are not tied to each other and can be used independently for different purposes.
How do you know which ops are performed by MongoDB? There are a lot of ways to do this. But in Flashback, we record the ops by enabling MongoDB's profiling.
By setting the profile level to 2 (profile all ops), we'll be able to fetch the ops information detailed enough for future replay -- except for insert ops.
MongoDB does not log insertion details in the profile DB. However, if a MongoDB instance is working in a "replica set", we can capture insert information by reading the oplog.
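The recording idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Flashback's actual recorder: `enable_full_profiling` and `merge_ops` are hypothetical helper names, `db` is assumed to be a pymongo `Database` object, and both op sources are assumed to have been normalized so that each entry carries a comparable `ts` value.

```python
import heapq


def enable_full_profiling(db):
    """Ask the server to profile every operation on this database (level 2)."""
    # Assumption: `db` is a pymongo Database. "profile" is the MongoDB command
    # behind the shell helper db.setProfilingLevel().
    return db.command("profile", 2)


def merge_ops(profile_ops, oplog_inserts):
    """Merge two time-sorted op sequences into one time-ordered list.

    Assumption: every entry is a dict whose "ts" values are comparable
    across both sources (for example, seconds since the epoch).
    """
    return list(heapq.merge(profile_ops, oplog_inserts, key=lambda op: op["ts"]))
```

Merging by timestamp keeps the inserts taken from the oplog in their correct position relative to the queries and updates taken from the profile data, which matters if the ops are later replayed in order.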
Thus, we record the ops with the following steps:
With the ops being recorded, we also have a replayer to replay them in different ways:
The replay module is written in Go because Python does not handle concurrent, CPU-intensive tasks well.
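To make the difference between replay styles concrete, here is a small Python sketch (the real replayer is the Go tool; this is not its code). It assumes that the "real" style preserves the original gaps between recorded ops and that the "stress" style sends ops as fast as possible. The function name `replay_delays` and the `ts` field (in seconds) are hypothetical.

```python
def replay_delays(ops, style):
    """Return how long to wait (in seconds) before sending each op."""
    if style == "stress":
        # Assumed meaning: never wait, send every op immediately.
        return [0.0] * len(ops)
    if style == "real":
        # Assumed meaning: reproduce the recorded gap between consecutive ops.
        delays = []
        previous_ts = None
        for op in ops:
            if previous_ts is None:
                delays.append(0.0)
            else:
                delays.append(max(0.0, op["ts"] - previous_ts))
            previous_ts = op["ts"]
        return delays
    raise ValueError("unknown style: {!r}".format(style))
```

A replayer built this way would sleep for each returned delay before dispatching the corresponding op, so the same ops file can serve both as a faithful reproduction of the workload and as a throughput test.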
cp config.py.example config.py
In config.py, modify the settings based on your needs. Here are some notes:
duration_secs indicates the length of the recording.
After configuration, simply run python record.py.
$ go get github.com/ParsePlatform/flashback/cmd/flashback
Required options:
flashback \
--style=[real|stress] \
--ops_filename=<file_name> # Operations file, such as generated by the Record tool
To use a specific host/port and/or authentication, specify a mongodb:// URL:
flashback \
--url=mongodb://myuser:mypass@mongodb01.example.com:27017
...
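If the username or password contains reserved characters such as @, : or /, they need to be percent-encoded before being placed in a mongodb:// URL. The following Python sketch shows one way to build such a URL. The helper name `build_mongodb_url` and the sample password are made up for illustration, and it assumes the driver used by flashback decodes credentials according to the standard connection string format.

```python
from urllib.parse import quote


def build_mongodb_url(user, password, host, port=27017):
    """Build a mongodb:// URL, percent-encoding the credentials."""
    # safe="" makes quote() escape every reserved character, including "/".
    return "mongodb://{}:{}@{}:{}".format(
        quote(user, safe=""), quote(password, safe=""), host, port
    )


print(build_mongodb_url("myuser", "p@ss:w/rd", "mongodb01.example.com"))
# prints mongodb://myuser:p%40ss%3Aw%2Frd@mongodb01.example.com:27017
```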
For a full list of options:
flashback --help
pcap_converter is an experimental way to build a recorded ops file from a pcap of mongo traffic.
Note: 'getmore' operations are not yet supported by pcap_converter.
$ go get github.com/ParsePlatform/flashback/cmd/pcap_converter
$ tcpdump -i lo0 -w some_mongo_cap.pcap 'tcp and dst port 27017'
$ pcap_converter -f some_mongo_cap.pcap -o ops_filename.bson