Questions about trace files when running cachebench

rainjuns commented 4 months ago

Hello, thank you for managing the great project!

I found that CacheLib provides several traces here, and I have a few questions about testing them with cachebench.

Do trace files include set operations caused by misses on get operations?
- I found that CacheLib has an option (enableLookAside) for performing set operations on get misses.
- However, I wonder whether such behavior is already captured in the trace files.
If a trace contains many get operations before the corresponding set operations, is the miss ratio reported by CacheLib still accurate?
- Depending on how the trace files were collected, they may capture only get operations, which would cause many misses and inflate the miss ratio.
- In addition, if enableLookAside is turned on, many set operations will be generated for the same key-value pair.
- In production, get operations for the same key might be queued while waiting for the response to the first miss.
- Please refer to the following trace lines (from the first file of kvcache/202206); key 1665497896 will generate a lot of misses:
```
key,op,size,op_count,key_size
1668757755,SET,82,1,40
1668757755,GET,0,1,40
1668757805,SET,208,1,63
1668757805,GET,0,1,63
1665498006,GET,104,2,64
1666258101,GET,81,2,23
1665497896,GET,169,18,78
1665702915,SET,109,1,40
1665702915,GET,0,1,40
1665497896,GET,169,18,78
```
For requests with the same key, what is the difference between (1) a trace line with op_count larger than 1 and (2) multiple trace lines with op_count=1?
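
To make the second question concrete, here is a rough Python sketch (not part of cachebench) that tallies GET requests arriving before any SET for the same key in the sample above. It assumes op_count is the number of requests a row stands for; the function name `gets_without_prior_set` is made up for illustration.

```python
import csv
import io
from collections import Counter

# The trace lines quoted above, copied verbatim.
SAMPLE = """key,op,size,op_count,key_size
1668757755,SET,82,1,40
1668757755,GET,0,1,40
1668757805,SET,208,1,63
1668757805,GET,0,1,63
1665498006,GET,104,2,64
1666258101,GET,81,2,23
1665497896,GET,169,18,78
1665702915,SET,109,1,40
1665702915,GET,0,1,40
1665497896,GET,169,18,78
"""


def gets_without_prior_set(trace_text):
    """Count GET requests per key that appear before any SET for that key.

    Assumption: op_count is the number of requests represented by the row.
    """
    keys_with_set = set()
    cold_gets = Counter()
    for row in csv.DictReader(io.StringIO(trace_text)):
        if row["op"] == "SET":
            keys_with_set.add(row["key"])
        elif row["op"] == "GET" and row["key"] not in keys_with_set:
            cold_gets[row["key"]] += int(row["op_count"])
    return cold_gets


cold = gets_without_prior_set(SAMPLE)
for key, count in cold.most_common():
    print(key, count)
```

In this sample, key 1665497896 accounts for most of the GETs that have no earlier SET, which is the case I am asking about.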

therealgymmy commented 4 months ago

1). Yes, the traces include "SET" operations that are triggered by misses on get operations in our systems. There are some exceptions in the KV traces. Notably, some clients do a "SET" first and then a "GET" (after some minutes or hours). These clients are basically prefetching data. They're rare in the traces compared to the regular set-after-a-miss cache workloads.

2). enableLookAside should only be used when you filter out all the "SET" operations from the original trace. This is useful when your cache size is drastically different from the cache configuration the trace was collected under, as it lets CacheBench behave like an actual cache instead of just replaying the original SET operations. (E.g., the original cache at a 90% hit rate would issue far fewer sets than a smaller cache at a 50% hit rate receiving the same GET workload.)
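
As a rough illustration of point 2, here is a toy Python model, not CacheBench's implementation: it drops the SET rows and replays only the GETs against a small LRU cache where every miss triggers one set. The workload is synthetic, and the names `drop_sets` and `replay_lookaside` are hypothetical.

```python
from collections import OrderedDict


def drop_sets(rows):
    """Keep only GET rows, i.e. filter the original SETs out of the trace."""
    return [(key, op) for key, op in rows if op != "SET"]


def replay_lookaside(rows, capacity):
    """Replay GETs against a toy LRU cache; each miss triggers one set.

    Returns (hits, sets). This models look-aside behavior only loosely.
    """
    cache = OrderedDict()
    hits = sets = 0
    for key, _op in rows:
        if key in cache:
            cache.move_to_end(key)  # mark as most recently used
            hits += 1
        else:
            cache[key] = True  # set-after-miss
            sets += 1
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits, sets


# Synthetic workload (not taken from the real traces): 10 original SETs,
# followed by 100 GETs that cycle over the same 10 keys.
original = [(k, "SET") for k in range(10)] + [(i % 10, "GET") for i in range(100)]
gets = drop_sets(original)

large = replay_lookaside(gets, capacity=10)
small = replay_lookaside(gets, capacity=5)
print("capacity 10: hits=%d sets=%d" % large)
print("capacity 5:  hits=%d sets=%d" % small)
```

The same GET stream produces far more sets on the smaller cache, which is why replaying the SETs recorded at the original cache size would not be representative.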

3). op_count = the number of requests we saw for this key in this "second" when we originally collected the traces. Each row in our trace represents one second's worth of requests per key per operation.
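
A minimal Python sketch of that reading of op_count, assuming the column layout from the sample in the question (`TraceRow` and `expand` are hypothetical names, not CacheBench APIs): a row with op_count=18 stands for 18 requests to the same key within one second, while a second row for the same key and operation covers a different second.

```python
from typing import Iterator, NamedTuple


class TraceRow(NamedTuple):
    key: str
    op: str
    size: int
    op_count: int
    key_size: int


def expand(row: TraceRow) -> Iterator[TraceRow]:
    """Yield one single-request row per request counted in op_count."""
    for _ in range(row.op_count):
        yield row._replace(op_count=1)


# One of the rows from the sample in the question.
row = TraceRow("1665497896", "GET", 169, 18, 78)
requests = list(expand(row))
print(len(requests))
```

The columns shown in the sample carry no timestamp, so how the expanded requests are spaced within the second is a choice left to whoever replays them.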

rainjuns commented 4 months ago

@therealgymmy Thank you for the response. I appreciate it!

facebook / CacheLib

Questions about trace files when running cachebench #306