A utility for benchmarking bleve performance under various configurations and workloads.
The bleve-bench utility works by reading wikipedia articles from a previously generated file with a single article per line. This minimizes time spent reading articles.
The tool performs a number of operations at each level, then prints summary statistics about performance over that level. For example, with a batch size of 100 and a level size of 1000, it will index documents both individually and in batches of 100 (so both code paths are measured), execute a query for the term "water" (repeating it the configured number of times), and print one line of summary statistics after every 1000 documents.
The total execution time can be a useful metric, but this is not an attempt to load as many articles as possible as fast as possible. Rather, this tool is useful for seeing how performance changes as the number of indexed documents grows.
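The reporting cadence can be illustrated with a small sketch (this mirrors the cadence described above, not bleve-bench's actual loop): given a total document count and a report level, summary lines are emitted at these document counts.

```go
package main

import "fmt"

// reportLevels returns the document counts at which a summary line would
// be printed, given a total document count and a report level. This is an
// illustrative sketch, not bleve-bench's actual code.
func reportLevels(count, level int) []int {
	var levels []int
	for n := level; n <= count; n += level {
		levels = append(levels, n)
	}
	return levels
}

func main() {
	fmt.Println(reportLevels(3000, 1000)) // [1000 2000 3000]
}
```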
After each level, one line of CSV output is printed with the following columns:

elapsed,docs,avg_single_doc_ms,avg_batched_doc_ms,query_water_matches,first_query_water_ms,avg_repeated5_query_water_ms
To generate the line file:

make wikilinefile

This will download the wikipedia dataset if you don't have it, build the linefile utility, and then run it on the dataset. NOTE: the download is large and may take a long time (this only happens the first time).
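The linefile utility's exact transformation isn't shown here, but the core idea — flattening each article's text onto a single line so it fits the one-article-per-line format — can be sketched as follows (the function name is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// toLine flattens multi-line article text onto a single line so articles
// can be stored one per line. This is a hypothetical sketch of the idea,
// not the actual linefile utility.
func toLine(article string) string {
	// strings.Fields splits on any run of whitespace, including newlines.
	return strings.Join(strings.Fields(article), " ")
}

func main() {
	fmt.Println(toLine("First paragraph.\n\nSecond\tparagraph."))
	// First paragraph. Second paragraph.
}
```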
Build
go build
To build it with support for the optional C-based storage engines:
go build -tags 'leveldb'
Run the benchmark with all defaults:
./bleve-bench
Usage of ./bleve-bench:
-batch=100: batch size
-config="": configuration file to use
-count=100000: total number of documents to process
-cpuprofile="": write cpu profile to file
-level=1000: report level
-memprofile="": write memory profile every level
-qrepeat=5: query repeat
-source="tmp/enwiki.txt": wikipedia line file
-target="bench.bleve": target index filename
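A flag set like the usage output above can be declared with Go's standard flag package. The sketch below mirrors the flags and defaults listed; it is not bleve-bench's actual source:

```go
package main

import (
	"flag"
	"fmt"
)

// newBenchFlags declares a flag set mirroring the usage output above.
// A sketch of how such flags are typically defined, not the tool's code.
func newBenchFlags() (fs *flag.FlagSet, batch, count *int) {
	fs = flag.NewFlagSet("bleve-bench", flag.ContinueOnError)
	batch = fs.Int("batch", 100, "batch size")
	fs.String("config", "", "configuration file to use")
	count = fs.Int("count", 100000, "total number of documents to process")
	fs.String("cpuprofile", "", "write cpu profile to file")
	fs.Int("level", 1000, "report level")
	fs.String("memprofile", "", "write memory profile every level")
	fs.Int("qrepeat", 5, "query repeat")
	fs.String("source", "tmp/enwiki.txt", "wikipedia line file")
	fs.String("target", "bench.bleve", "target index filename")
	return fs, batch, count
}

func main() {
	fs, batch, count := newBenchFlags()
	if err := fs.Parse([]string{"-batch", "500", "-count", "3000"}); err != nil {
		panic(err)
	}
	fmt.Println(*batch, *count) // 500 3000
}
```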
Load 100,000 articles with all the defaults:
./bleve-bench
Load 3000 articles using the LevelDB backend and dump a CPU profile at the end:
./bleve-bench -config configs/leveldb.json -count 3000 -cpuprofile=leveldb.profile
Load 3000 articles using the LevelDB backend and dump a memory profile after every level:
./bleve-bench -config configs/leveldb.json -count 3000 -memprofile=leveldb-mem.profile
What kind of conclusions can we draw? Here is a chart produced by using this utility to load 100k wikipedia documents into bleve with the LevelDB backend.
This shows several important things about how the LevelDB backend behaves as the index grows.
Here is the same test run with the BoltDB backend.
This too shows several important things about how the BoltDB backend behaves.