facebook / CacheLib

Pluggable in-process caching engine to build and scale high performance services
https://www.cachelib.org
Apache License 2.0

Binary request generator/replayer #307

Closed byrnedj closed 1 week ago

byrnedj commented 4 months ago

This is the binary trace replayer/generator that we have been using to achieve max CPU utilization for the kvcache traces in cachebench. With this generator, we can achieve a throughput of over 20 million ops/sec using the kvcache workload in cachebench. By comparison, the CSV replay generator reaches only ~1.6 million ops/sec because of dynamic allocations and parsing overhead.

We avoid allocations by mmap'ing the request data into memory and using a Request pointer to point to the request data rather than allocating a new request wrapper for each request.
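As a rough illustration of the idea (not the code in this patch, which is C++), the sketch below maps a file of fixed-size records and decodes each one in place by offset, with no text parsing. The 16-byte record layout and the names `RECORD`, `write_trace`, `read_trace`, and `demo` are assumptions made up for this example. Python still builds a small tuple per record, so this shows only the access pattern, not the allocation savings.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout (NOT the format used by this PR): little-endian
# uint64 key id, uint32 value size, uint8 op code, 3 padding bytes = 16 bytes.
RECORD = struct.Struct("<QIB3x")


def write_trace(path, requests):
    """Pack (key_id, value_size, op) tuples into fixed-size binary records."""
    with open(path, "wb") as f:
        for req in requests:
            f.write(RECORD.pack(*req))


def read_trace(path):
    """Map the file and decode each record directly from the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Each record is located by offset arithmetic alone; nothing is
            # split, tokenized, or converted from text.
            return [
                RECORD.unpack_from(mm, offset)
                for offset in range(0, len(mm), RECORD.size)
            ]


def demo():
    """Round-trip three sample requests through a temporary binary file."""
    requests = [(1, 100, 0), (2, 4096, 1), (1, 100, 0)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "trace.bin")
        write_trace(path, requests)
        return read_trace(path)
```

Because every record has the same size, request `i` always lives at byte offset `i * RECORD.size`, which is what makes pointing directly into the mapped data possible.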

To generate a binary request file from an existing kvcache trace (using the "replay" generator):

  1. Specify the kvcache trace name using the regular traceFileNames or traceFileName option. Specify other properties such as ampFactor too.
  2. In the replayGeneratorConfig, specify binaryFileName: "mybinaryfile.bin" as a config option
  3. Run cachebench and wait for the binary file to be generated
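Conceptually, the generation step pays the CSV parsing cost once and writes out fixed-size records that can later be replayed without parsing. A minimal sketch of such a one-time conversion follows. The column names (`key`, `op`, `size`), the op codes, and the record layout are hypothetical and do not describe the actual kvcache trace format or the file format written by this patch.

```python
import csv
import io
import struct

# Hypothetical 16-byte record: uint64 key id, uint32 value size,
# uint8 op code, 3 padding bytes (little-endian, no implicit alignment).
RECORD = struct.Struct("<QIB3x")

# Hypothetical mapping from textual op names to numeric op codes.
OP_CODES = {"GET": 0, "SET": 1, "DELETE": 2}


def csv_to_binary(csv_text):
    """Parse a CSV trace once and return it as packed fixed-size records."""
    out = bytearray()
    for row in csv.DictReader(io.StringIO(csv_text)):
        out += RECORD.pack(int(row["key"]), int(row["size"]), OP_CODES[row["op"]])
    return bytes(out)


sample_csv = "key,op,size\n1,GET,100\n2,SET,4096\n"
binary_trace = csv_to_binary(sample_csv)
```

All string handling happens in `csv_to_binary`; a replayer that consumes `binary_trace` only needs offset arithmetic and fixed-width integer reads.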

To run a binary request trace, specify the following:

  1. Set generator to "binary-replay"
  2. Set traceFileName: "mybinaryfile.bin" and set ampSizeFactor (if desired)

In summary, this patch offers much lower trace-replay overhead. It does assume the kvcache trace format and kvcache replay generator behavior. Additional features:

The limitations are:

facebook-github-bot commented 2 months ago

@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

therealgymmy commented 1 month ago

@byrnedj: tested it out internally and verified a 10x throughput improvement. The binary trace currently does not repeat if the number of specified operations exceeds the trace length. Is this intended?

facebook-github-bot commented 1 month ago

@byrnedj has updated the pull request. You must reimport the pull request before landing.

byrnedj commented 1 month ago

I just added that functionality to the latest version.
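For readers following along, one common way to make a fixed-length trace repeat is to wrap the record index with a modulo. The sketch below is illustrative only and is not the code added in this PR. `replay_indices` is a hypothetical helper name.

```python
def replay_indices(num_records, num_ops):
    """Yield num_ops record indices, wrapping to 0 at the end of the trace."""
    if num_records <= 0:
        raise ValueError("trace must contain at least one record")
    for i in range(num_ops):
        # When num_ops exceeds num_records, the index cycles back to the
        # first record instead of running past the end of the trace.
        yield i % num_records


# A 3-record trace replayed for 7 operations visits 0, 1, 2, 0, 1, 2, 0.
wrapped = list(replay_indices(3, 7))
```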

therealgymmy commented 1 month ago

Thanks, let me re-import it.

facebook-github-bot commented 1 month ago

@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 week ago

@therealgymmy merged this pull request in facebook/CacheLib@253107481b6cff7e5d70fc54fce075bca2c463dc.