infinyon / fluvio

Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
https://www.fluvio.io/
Apache License 2.0

Measure performance of SmartStream consumer for various scenarios #1619

Closed nicholastmosher closed 3 years ago

nicholastmosher commented 3 years ago

We want to investigate long pauses during certain SmartStream consumer scenarios. Specifically the following:

Scenario/Variables:

nicholastmosher commented 3 years ago

Hey team! Please add your planning poker estimate with ZenHub @morenol @nacardin @sehz @simlay @tjtelan

nicholastmosher commented 3 years ago

Testing using the following (included in zip):

https://app.zenhub.com/files/205473061/cf7be662-5362-497f-b4a3-518f24ec2911/download

nicholastmosher commented 3 years ago

Scenario: Local Cluster, CLI (release mode)

Commands used:

To start the local cluster:

cargo build --bin fluvio --bin fluvio-run --release
target/release/fluvio cluster start --local

To produce words to topic:

target/release/fluvio produce consumer-test -f consumer-test/words.txt

To consume words with SmartStream, I made the following script, test-script.sh:

#!/usr/bin/env bash

head -n 61476 <(target/release/fluvio consume consumer-test -B -d --filter=consumer-test/fluvio_wasm_filter.wasm)

The reason I used head here is because there is a bug where -d does not work with SmartStream filters (#1504). To get around that, I use head to read a fixed number of lines: the number of records I know come out of this stream when filtered, which I discovered through practical application of binary search.
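For context, a SmartStream filter is a Rust function compiled to WebAssembly that decides, per record, whether the record passes downstream. The actual filter inside fluvio_wasm_filter.wasm isn't shown in this thread, so the following is purely an illustrative sketch of the predicate logic; in a real module, this function would be wrapped with the fluvio-smartstream crate's filter attribute and built for the wasm32 target.

```rust
// Hypothetical predicate: keep records whose payload contains the byte 'a'.
// In a real SmartStream module this logic would live inside a function
// annotated with the fluvio-smartstream filter attribute and be compiled
// to WebAssembly; here it is written as a plain function so it stands alone.
fn keep_record(value: &[u8]) -> bool {
    value.contains(&b'a')
}

fn main() {
    // Stand-in for a stream of records (the real data comes from words.txt).
    let records: [&[u8]; 3] = [b"apple", b"sky", b"banana"];
    let kept = records.iter().filter(|r| keep_record(r)).count();
    println!("{}", kept); // prints 2: "apple" and "banana" pass the filter
}
```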

For measurements, I ran hyperfine ./test-script.sh which gave the following output:

$ hyperfine ./test-script.sh
Benchmark #1: ./test-script.sh
  Time (mean ± σ):     379.2 ms ±  11.4 ms    [User: 32.1 ms, System: 58.5 ms]
  Range (min … max):   364.5 ms … 399.8 ms    10 runs
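For reference, the mean and σ that hyperfine reports are the average and sample standard deviation of the per-run wall-clock times. A small sketch of that computation (the run times below are made up for illustration, not the measurements above):

```rust
// Illustrative only: mean and sample standard deviation over per-run
// wall-clock times, as reported by benchmarking tools like hyperfine.
fn mean_and_stddev(times_ms: &[f64]) -> (f64, f64) {
    let n = times_ms.len() as f64;
    let mean = times_ms.iter().sum::<f64>() / n;
    // Sample variance: divide by n - 1 (Bessel's correction).
    let var = times_ms.iter().map(|t| (t - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var.sqrt())
}

fn main() {
    let runs = [370.0, 380.0, 390.0]; // fabricated example times, in ms
    let (mean, sd) = mean_and_stddev(&runs);
    println!("{:.1} ms ± {:.1} ms", mean, sd); // prints "380.0 ms ± 10.0 ms"
}
```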
nicholastmosher commented 3 years ago

Scenario: Local Cluster, fluvio-client-wasm (release mode)

Set up cluster:

# In fluvio
$ cargo build --release --bin fluvio --bin fluvio-run
$ target/release/fluvio cluster start --local

Set up the WebSocket proxy:

# In fluvio-client-wasm
$ RUST_LOG=debug cargo run --manifest-path ./fluvio-websocket-proxy/Cargo.toml --target $(rustup show | grep 'Default host' | sed 's/Default host: //g')

Set up topic data:

$ fluvio topic create consumer-test
$ fluvio produce consumer-test -f consumer-test/words.txt  # words.txt from the test zip

Run wasm test:

$ wasm-pack test --firefox -- --release

Open Firefox and, in the console, note the measured test run time.

Measurements:

Stats (in ms):

nicholastmosher commented 3 years ago

From these results, I'm seeing that fluvio-client-wasm is certainly slower than the CLI, but not slow enough to cause the problems we were seeing before. I'm re-running the fluvio-client-wasm test in debug mode to see whether it exhibits timing closer to the problems we have seen.

nicholastmosher commented 3 years ago

Scenario: Same as above, but fluvio-client-wasm is in debug mode:

Measurements:

Stats (in ms):

Notes:

nicholastmosher commented 3 years ago

Closing this, since the scope of this was to measure baseline statistics. I'll link to this in follow-up issues focused on fixing delays where needed.