basiliscos / cpp-bredis

Boost::ASIO low-level redis client (connector)
MIT License

Unclear how to use drop_result. #22

Open ovanes opened 5 years ago

ovanes commented 5 years ago

The documentation states that it's possible to use drop_result as part of the parsing policy: https://github.com/basiliscos/cpp-bredis#parse_result_titerator-policy

However, it's pretty unclear how to use it with the Connection object.

I read through the source code, and Connection seems to have keep_result hard-coded as its policy type. How would I use drop_result with a Connection object?

Do I understand the intention of drop_result correctly: it causes the parser to verify that the response isn't an error, while the payload itself is dropped?

basiliscos commented 5 years ago

Thank you for your query.

Indeed, it seems the drop_result policy wasn't properly exposed in the API.

What is your use-case for discarding the result?

basiliscos commented 5 years ago

PS. I will try to fix it over the weekend.

ovanes commented 5 years ago

Thanks for the quick reply! My use case is pretty simple: I'd like to perf-test Redis in a similar way to redis-benchmark, but with more customized control. Obviously, I'd like the client to lose as little time on parsing as possible, i.e. just verify that the response that came back is "valid" (its type is retrievable) and drop the remaining data. After my investigation, bredis looks like a perfect fit.

basiliscos commented 5 years ago

Could you, please, try PR https://github.com/basiliscos/cpp-bredis/pull/24 ?

How to use it is basically:

```cpp
using Policy = r::parsing_policy::drop_result;
...
c.async_read(rx_buff, read_callback, count, Policy{});
```

The last two parameters are mandatory for your purpose. `count` can be equal to 1, as in the defaults.

Also, please share your benchmark results.

I'm still not completely convinced that the PR should be merged, but I see another use case for the drop_result policy: a background ping/Redis keep-alive thread.

ovanes commented 5 years ago

Thanks a lot for the incredible devotion and quick implementation. I gave the implementation a try, and here are my thoughts:

Generally, I'd like to skip the entire result payload but still see what the "high-level result" was. This allows drawing a conclusion about the response, e.g. nil_t -> miss in the case of a get command. Right now it's not even possible to tell whether the command being processed was of the correct type. I just tried "HGETALL" on a simple string-typed key/value. As a response I received an instance of the specialized positive_parse_result with the field consumed containing 68. I understand that Redis encoded a WRONGTYPE error there, but there is no way to see that the response was an error at all. IMO the response type should be part of the specialized positive_parse_result instance.

basiliscos commented 5 years ago

What you are asking for is a kind of "partial result drop", whereas the drop_result policy discards the result completely, i.e. you either do not care about it at all or are completely sure about it (as with ping).

HGETALL, in your example, returns a non-trivial result (https://redis.io/commands/hgetall).

So, for your purposes, I'd suggest not extracting the results, but scanning the existing markers, like this:

```cpp
// Visitor that returns true for every marker type except error_t.
template <typename Iterator>
class not_error : public boost::static_visitor<bool> {
  public:
    template <typename T> bool operator()(const T &value) const {
        return true;
    }
    bool operator()(const markers::error_t<Iterator> &value) const {
        return false;
    }
};
...

c.async_read(rx_buff, [&](const auto &error_code, result_t &&r) {
    auto success = boost::apply_visitor(not_error<Iterator>(), r.result);
    if (!success) std::abort();
});
```

The markers will still be allocated, but they are quite light-weight.

I'll think about the possibility of injecting a custom on-the-fly parsing policy, but that will surely be non-trivial.

ovanes commented 5 years ago

Ivan thanks a lot for your explanation.

This is exactly what I was looking for. Over the next few days I'll gather some data to give you insights into the performance benefits (if there are any). I'll post them in this thread.

Regarding HGETALL: I probably explained it the wrong way. I used this command with a string-typed Redis key/value. This was a test to understand how error_t is reported when using the drop_result policy.

basiliscos commented 5 years ago

@ovanes any news so far?

I have updated the performance testing against Redis, and here are my results:

| bredis (commands/s) | bredis(*) (commands/s) | redox (commands/s) |
| ------------------- | ---------------------- | ------------------ |
| 1.59325e+06         | 2.50826e+06            | 0.999375e+06       |

where (*) are the results with the drop_result policy.

ovanes commented 5 years ago

Sorry for the delay.

Below are my findings:

**Test Setup**

**Test Results**

Notes:

```shell
# pipelining -> adopt connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100 --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 2
```

* A proprietary tool written in C++ and using bredis had a similar setup, but additionally a customizable parsing policy.

**Actual Results:**
* 1 connection / loopback / 30 seconds / pipeline size: 1

|                      | TPS   | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ----- | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 29568 | 0.00             | 24      | 29       | 32       | 34       | 45       | 52         |             | 135       |
| bredis: header parse | 29144 | -1.43            | 22      | 30       | 33       | 35       | 45       | 53         |             | 129       |
| bredis: full parse   | 28323 | -4.21            | 24      | 30       | 33       | 35       | 46       | 55         |             | 143       |
| redis benchmark      | 35845 | 21.23            |         |          |          |          |          |            |             | 0         |

* 10 connections / loopback / 30 seconds / pipeline size: 1

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 97985  | 0.00             | 48      | 96       | 99       | 111      | 126      | 139        |             | 213       |
| bredis: header parse | 93941  | -4.13            | 43      | 101      | 103      | 115      | 129      | 144        |             | 1583      |
| bredis: full parse   | 87556  | -10.64           | 45      | 107      | 110      | 122      | 136      | 150        |             | 253       |
| redis benchmark      | 100055 | 2.11             |         |          |          |          |          |            |             | 0         |

* 50 connections / loopback / 30 seconds / pipeline size: 1 

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 93396  | 0.00             | 501     | 512      | 531      | 560      | 588      | 610        |             | 656       |
| bredis: header parse | 92817  | -0.62            | 504     | 515      | 535      | 563      | 589      | 626        |             | 677       |
| bredis: full parse   | 85235  | -8.74            | 544     | 560      | 583      | 613      | 643      | 672        |             | 746       |
| redis benchmark      | 101672 | 8.86             |         |          |          |          |          |            |             | 1000      |

* 100 connections / loopback / 30 seconds / pipeline size: 1

|                      | TPS   | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ----- | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 91197 | 0.00             | 1028    | 1060     | 1092     | 1133     | 1167     | 1197       |             | 1435      |
| bredis: header parse | 87045 | -4.55            | 1069    | 1106     | 1145     | 1192     | 1232     | 1270       |             | 1368      |
| bredis: full parse   | 82596 | -9.43            | 1121    | 1171     | 1206     | 1253     | 1297     | 1329       |             | 1417      |
| redis benchmark      | 99661 | 9.28             |         |          |          |          |          |            | 1000        | 1000      |

* 200 connections / loopback / 30 seconds / pipeline size: 1    

|                      | TPS   | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ----- | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 87715 | 0.00             | 2154    | 2231     | 2274     | 2332     | 2393     | 2507       |             | 2615      |
| bredis: header parse | 85981 | -1.98            | 2181    | 2266     | 2321     | 2390     | 2456     | 2507       |             | 2579      |
| bredis: full parse   | 81932 | -6.59            | 2290    | 2387     | 2437     | 2495     | 2555     | 2647       |             | 2812      |
| redis benchmark      | 95858 | 9.28             |         |          |          |          |          |            | 2000        | 2000      |

* 400 connections / loopback / 30 seconds / pipeline size: 1    

|                      | TPS   | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ----- | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 84914 | 0.00             | 4469    | 4632     | 4700     | 4798     | 4907     | 5070       |             | 6876      |
| bredis: header parse | 83089 | -2.15            | 4518    | 4719     | 4802     | 4917     | 5052     | 5366       |             | 7928      |
| bredis: full parse   | 81146 | -4.44            | 4627    | 4823     | 4918     | 5042     | 5177     | 5470       |             | 7294      |
| redis benchmark      | 94638 | 11.45            | 1000    |          |          |          |          |            | 3000        | 4000      |

* 1 connection / loopback / 30 seconds / pipeline size: 32

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 305413 | 0.00             | 88      | 1450     | 1450     | 1450     | 1450     | 1450       |             | 1450      |
| bredis: header parse | 294051 | -3.72            | 90      | 100      | 104      | 115      | 122      | 147        |             | 447       |
| bredis: full parse   | 264363 | -13.44           | 93      | 101      | 105      | 116      | 122      | 148        |             | 209       |
| redis benchmark      | 360880 | 18.16            |         |          |          |          |          |            |             | 0         |

* 10 connections / loopback / 30 seconds / pipeline size: 32

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 526654 | 0.00             | 358     | 908      | 908      | 908      | 908      | 908        |             | 908       |
| bredis: header parse | 527584 | 0.18             | 379     | 473      | 599      | 724      | 762      | 818        |             | 948       |
| bredis: full parse   | 524862 | -0.34            | 333     | 413      | 592      | 772      | 810      | 851        |             | 928       |
| redis benchmark      | 519379 | -1.38            |         |          |          |          |          |            | 1000        | 1000      |

* 50 connections / loopback / 30 seconds / pipeline size: 32

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 532309 | 0.00             | 1392    | 4611     | 4611     | 4611     | 4611     | 4611       |             | 4611      |
| bredis: header parse | 529194 | -0.59            | 1458    | 2377     | 3017     | 3667     | 3838     | 3929       |             | 4625      |
| bredis: full parse   | 525472 | -1.28            | 1339    | 2188     | 2994     | 3910     | 4141     | 4254       |             | 4964      |
| redis benchmark      | 532453 | 0.03             | 1000    |          |          |          |          |            | 4000        | 4000      |

* 100 connections / loopback / 30 seconds / pipeline size: 32   

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 537409 | 0.00             | 3733    | 8504     | 8504     | 8504     | 8504     | 8504       |             | 8504      |
| bredis: header parse | 535626 | -0.33            | 3615    | 4714     | 5950     | 7331     | 7713     | 7869       |             | 8535      |
| bredis: full parse   | 532882 | -0.84            | 2735    | 4340     | 5901     | 7730     | 8198     | 8406       |             | 9325      |
| redis benchmark      | 536394 | -0.19            |         |          | 6000     |          |          |            | 7000        | 8000      |

* 200 connections / loopback / 30 seconds / pipeline size: 32   

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 535314 | 0.00             | 8006    | 16723    | 16723    | 16723    | 16723    | 16723      |             | 16723     |
| bredis: header parse | 529886 | -1.01            | 7775    | 9559     | 11874    | 14911    | 15559    | 16032      |             | 17020     |
| bredis: full parse   | 532803 | -0.47            | 6123    | 8648     | 11653    | 15613    | 16464    | 16941      |             | 18322     |
| redis benchmark      | 537634 | 0.43             |         |          | 12000    |          |          |            | 14000       | 19000     |

* 400 connections / loopback / 30 seconds / pipeline size: 32   

|                      | TPS    | Diff to drop [%] | p0 [us] | p10 [us] | p50 [us] | p90 [us] | p99 [us] | p99.9 [us] | p99.99 [us] | p100 [us] |
| -------------------- | ------ | ---------------- | ------- | -------- | -------- | -------- | -------- | ---------- | ----------- | --------- |
| bredis: drop         | 521525 | 0.00             | 17603   | 34790    | 34790    | 34790    | 34790    | 34790      |             | 34790     |
| bredis: header parse | 518396 | -0.60            | 16911   | 19850    | 24502    | 30372    | 32328    | 33816      |             | 36015     |
| bredis: full parse   | 514251 | -1.39            | 14166   | 17718    | 24240    | 32288    | 34516    | 35760      |             | 37683     |
| redis benchmark      | 529913 | 1.61             | 4000    | 21000    | 23000    | 26000    | 27000    | 32000      | 40000       | 40000     |

#### Some Notes

When running the tests without pipelining and with a low number of connections, it is clearly observable that Redis CPU utilization stays under 90%, which lets the performance and efficiency of the benchmarking tools compete. With a higher number of connections or bigger pipelines, Redis CPU utilization reaches 100%. At that point there is no real competition (or only a very minimal one) between benchmarking tools; it's more a question of which tool is lucky enough to get a faster response from Redis.

Maybe it'd be a good idea to have a benchmark test that repeatedly reads the same key. Doing so would put that key into the cache and make Redis serve it in the fastest possible way. Finally, it can be even more advantageous to avoid real TCP sockets and use Unix domain sockets instead, which can result in much better throughput and lower latency.

basiliscos commented 5 years ago

@ovanes Thanks a lot for sharing the results. Let's keep that page, as it might be interesting for other people.

I also have a few ideas on how to improve performance further.

ovanes commented 5 years ago

I put more thoughts into the test result interpretation...

basiliscos commented 5 years ago

Yes, please, go ahead.

In the current implementation, it performs double parsing: a first pass to determine the end of the expected reply (i.e. with the drop policy), and a second pass to deliver the reply to client code.

It's also interesting how you got the numbers for the redis benchmark row.

ovanes commented 5 years ago

@basiliscos Unfortunately, I don't fully understand that question:

> It's also interesting how you got the numbers for the redis benchmark row.

IMO the redis-benchmark commands that were used are in the test description above:

```shell
# no-pipelining -> adopt connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100 --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 1

# pipelining -> adopt connection count
docker run -ti --rm --net=host --cap-add=sys_nice --ulimit rtprio=100 --cpuset-cpus 4,5 redis:5.0.4 redis-benchmark -t get -r 1000000 -n 10000000 -h 127.0.0.1 -c 1 -P 2
```

Just replace the values of the -c and -P parameters with the corresponding number of connections and commands in the pipeline. redis-benchmark is a highly optimized tool; I had to make a lot of tweaks to get a close comparison (initially my tests performed about 30% to 40% worse). I might have a few more optimization ideas, but IMO they would improve TPS by only 1% to 2%.