tempesta-tech / tempesta-test

Test suite for Tempesta FW

Multi-threaded Deproxy #107

Open vankoven opened 5 years ago

vankoven commented 5 years ago

Deproxy is written entirely in Python and has poor performance. As far as I remember, 20-30 RPS was its maximum (while the wrk+nginx chain can produce 10-12 KRPS). It's not a load generator, but it validates received messages very strictly. The problem is that we have to develop two kinds of tests: deproxy tests with a single message chain, and workload tests using wrk. These two kinds of tests are completely different.

Issues we have faced with Deproxy during the past year:

Issues in wrk tests:

As for me, we should rework Deproxy to make it possible to serve at least 5-10 KRPS, so we can test all the possible multi-threaded effects in Tempesta. The tool MUST do full HTTP message validation as the current Deproxy does. It's not very hard to make it reasonably fast: wrk already has a full (but simple) HTTP parser inside, and so does nginx, while the validation doesn't require advanced HTTP parsers; pattern matching has worked in Deproxy for a long time. Each Deproxy client and server must be a separate process that takes a script as input (like wrk), writes all sent and received messages to its own log file, makes some assertions, and returns an error code (0 = test passed, non-zero = failed). The test framework must analyse only those return codes.
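A minimal sketch of the framework side of that contract, assuming Python; the `mock_client` binary name and its flags are hypothetical, only the return-code convention comes from the description above:

```python
import subprocess

def run_mock_client(script_path, log_path):
    # Hypothetical invocation: the binary name and flags are illustrative.
    proc = subprocess.run(
        ["./mock_client", "--script", script_path, "--log", log_path])
    # The contract: 0 = test passed, non-zero = failed. The framework
    # inspects only the return code; all the sent and received messages
    # live in the per-process log file.
    assert proc.returncode == 0, "mock client failed, see %s" % log_path
```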

That will allow us to build both functional and workload tests from a single piece of code; the only difference is the concurrent connections parameter.

TfwBomber was a tool intended to replace wrk and provide more fuzzing capabilities, but it doesn't handle the server side. It reuses a lot of Tempesta code, so it can be useful.

krizhanovsky commented 5 years ago

Is it possible to get a x300 (from 30 RPS to 10 KRPS) performance improvement just by moving to a multiprocess scheme? Probably, if we really need to combine Deproxy flexibility with wrk performance, it's better to move to TfwBomber and develop its server side or, even better, just implement https://github.com/tempesta-tech/tempesta/issues/471 .

The second concern is about the particular multiprocess model. Probably we won't get much more performance from just one client and one server process in Deproxy. wrk uses many threads to load a server, so we should do the same: spawn many client and, correspondingly, server processes. Since massive process management can be expensive, Deproxy in this model must have a setting that specifies how many processes to spawn in each particular case.

All in all, I'm not against a multiprocess Deproxy, but it's doubtful that we can reach good performance with small effort, and if significant effort is required, then why not develop Tempesta-native features (TfwBomber + server mode) instead?

vankoven commented 5 years ago

> Is it possible to get a x300 (from 30 RPS to 10 KRPS) performance improvement just by moving to a multiprocess scheme?

Of course not.

> if we really need to combine Deproxy flexibility with wrk performance, it's better to move to TfwBomber and develop its server side

I'm convinced that we need a tool that can generate severe traffic load and at the same time provide deep data consistency checks. It's crucial to verify that preemption is disabled where required, that AVX registers are saved and restored, that no memory access errors happen when multiple messages are split or combined, that in-place crypto operations and zero-copy modifications work, etc. The only way to be 100% sure that Tempesta works correctly and doesn't garble the data passing through it is to test it under load. Wrk-based tests are great, but they confirm only two things about message integrity: HTTP messages are correctly framed (but not necessarily that all the headers are correct!) and the response code is 200. Those checks are too generic for me. And we have no integrity checks on the server side: nginx may receive broken messages and close the connection, and such situations won't be treated as errors.
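As a sketch of the difference, here is the kind of deep check such a tool could run per response; the `expected`/`received` objects with `status`, `headers`, and `body` fields are hypothetical:

```python
def check_response(expected, received):
    """Deep integrity check: correct framing and a 200 status are not
    enough; every header and the body must survive the proxy intact."""
    if received.status != expected.status:
        return False
    for name, value in expected.headers.items():
        # A single garbled or dropped header fails the test.
        if received.headers.get(name) != value:
            return False
    return received.body == expected.body
```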

I think I've misled you with the issue name, Multi-threaded Deproxy. I expect to see a tool very similar to wrk, with its Lua scripting capabilities, for both client and server sides, but with all the disadvantages solved. I just couldn't find a better name for the issue.

> Probably we won't get much more performance from just one client and one server process in Deproxy

We won't. We have a concurrent_connections option in the tests configuration file; we should add a concurrent_clients option there. For developers we will use really low values: one client, one connection. In this case the test behaviour will be like the current functional tests: a single request-response sequence per test. On the CI the behaviour will be different: a lot of client processes will be spawned, each with multiple concurrent connections, and each connection will run its own request-response sequence, so there will be a lot of clients making the same requests.
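A hypothetical sketch of that single code path in Python; `client_main` and the exact option plumbing are illustrative, only the two option names come from the discussion above:

```python
import multiprocessing

def run_test(concurrent_clients=1, concurrent_connections=1):
    # One process per client, as in the multiprocess model above;
    # client_main is the hypothetical mock client entry point.
    procs = [multiprocessing.Process(target=client_main,
                                     args=(concurrent_connections,))
             for _ in range(concurrent_clients)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # The framework checks only the exit codes (0 = passed).
    return all(p.exitcode == 0 for p in procs)

# Developer run: a single request-response sequence, like current tests.
#   run_test(concurrent_clients=1, concurrent_connections=1)
# CI run: the very same test, but as a workload.
#   run_test(concurrent_clients=32, concurrent_connections=64)
```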

With that approach, all the tests will use the same tooling, and each test will describe a list of requests and responses with a couple of lines of code per message.

> why not develop Tempesta-native features (TfwBomber + server mode) instead?

I'm not against it; moreover, I tried to say that the current tools have disadvantages and don't make me 100% convinced that Tempesta has no issues even if all the tests pass, so something more specialized is required. I still call it Deproxy, since its job is to validate proxy server behaviour. Whether it would be better to patch wrk, finish TfwBomber, or do something else, I don't know.

krizhanovsky commented 5 years ago

We definitely must not throw out the current wrk client and Nginx backend: we need them for the sake of diversity in interoperability testing, see #111.

krizhanovsky commented 5 years ago

From https://github.com/tempesta-tech/tempesta-test/pull/96/files#diff-ccd1b8962bc69bb1058314347d328c25R163 :

Mock Client

A separate process started by the test framework. It has the following arguments:

The client has two structures to hold the required information: client_data for global client data and conn_data for per-connection data.
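A minimal sketch of those two structures, assuming Python dataclasses; the conn_data field names come from the algorithm below, while the client_data fields and all defaults are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class ClientData:
    """Global client data (fields here are illustrative)."""
    script: str                  # script providing gen_request_func()/response_cb()
    interface: str = "0.0.0.0"   # local address to bind connections to
    connections: int = 1         # number of connections to open
    requests_per_conn: int = 1   # requests to send per connection

@dataclass
class ConnData:
    """Per-connection state; names follow the algorithm below."""
    request_id: int = 0          # id of the next request to generate
    request_queue: list = field(default_factory=list)  # sent, not yet replied
    pipeline: bool = False       # queue several requests before sending if True
    req_timeout: float = 5.0     # seconds to wait for responses
    repeats: int = 0             # reopen the connection this many more times
```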

First, the client waits on a POSIX barrier until all the mock servers are ready. Then, for every connection, it runs the following algorithm:

  1. Open a new connection and bind it to the interface. Set conn_data.request_id to 0.

  2. Call gen_request_func() from the script to generate the request.

  3. Copy the generated request to the send buffer. Push the request to conn_data.request_queue and increment conn_data.request_id.

  4. If conn_data.pipeline is not set, go to step 5; otherwise go to step 2.

  5. Send the current buffer to Tempesta. Start the conn_data.req_timeout timer.

  6. Receive as much as possible. If conn_data.req_timeout has expired, go to err.

  7. Try to parse a response from the buffer. If a full response is received, go to 8; else go to 6.

  8. For the complete response, run response_cb() from the script. If an error happens, go to err. Remove the request from conn_data.request_queue. If there are unreplied requests, go to 6; else stop the conn_data.req_timeout timer.

  9. If there are unsent requests, go to 2.

  10. Close the connection. If conn_data.repeats is not exhausted, go to 1.

  11. Exit.

err. Build an error message, close the connection, and exit.

See #116 for gen_request_func() and response_cb().
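A condensed Python sketch of steps 1-11 for a single non-pipelined connection; parse_response() and the Tempesta address are assumptions, and all error handling is reduced to the exit code:

```python
import socket

def run_connection(client_data, conn_data, gen_request_func, response_cb):
    while True:
        # 1. Open a new connection bound to the local interface.
        sock = socket.create_connection(
            ("127.0.0.1", 80),                      # Tempesta address: assumed
            timeout=conn_data.req_timeout,          # 5. the request timer
            source_address=(client_data.interface, 0))
        conn_data.request_id = 0
        try:
            for _ in range(client_data.requests_per_conn):
                # 2-3. Generate a request and queue it (4. pipelining omitted).
                request = gen_request_func(conn_data.request_id)
                conn_data.request_queue.append(request)
                conn_data.request_id += 1
                sock.sendall(request)               # 5. send to Tempesta
                buf, response = b"", None
                while response is None:
                    data = sock.recv(4096)          # 6. raises socket.timeout
                    if not data:
                        return 1                    # err: server closed early
                    buf += data
                    response = parse_response(buf)  # 7. hypothetical parser
                # 8. Validate against the oldest queued request.
                if not response_cb(conn_data.request_queue.pop(0), response):
                    return 1                        # err: non-zero = failed
        except socket.timeout:
            return 1                                # err: req_timeout expired
        finally:
            sock.close()                            # 10. close the connection
        if conn_data.repeats <= 0:
            return 0                                # 11. exit, 0 = passed
        conn_data.repeats -= 1
```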

Mock Server

Follows the same concept as the mock client. Arguments:

The server is stopped by a TERM signal from the framework.

The script contains a queue of expected requests, server_data.requests[]. If all requests have already been received but a new request arrives, then the client is working in repeated mode, so conn_data.expected_request_num must be reset to 0.
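A sketch of that matching loop, again in Python; parse_request() and build_response() are assumed helpers, and only one connection is handled for brevity:

```python
import socket
import sys

def run_mock_server(server_data, listen_addr=("127.0.0.1", 8000)):
    srv = socket.create_server(listen_addr)   # Python 3.8+
    conn, _ = srv.accept()
    expected_request_num = 0                  # conn_data.expected_request_num
    buf = b""
    while True:
        data = conn.recv(4096)
        if not data:
            break                             # client closed the connection
        buf += data
        request, buf = parse_request(buf)     # hypothetical parser
        if request is None:
            continue                          # need more data
        if expected_request_num == len(server_data.requests):
            # All expected requests were seen, so the client is working
            # in repeated mode: reset the index to 0.
            expected_request_num = 0
        if request != server_data.requests[expected_request_num]:
            sys.exit(1)                       # integrity check failed
        expected_request_num += 1
        conn.sendall(build_response(request)) # hypothetical responder
```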