Run benchmarking test with the aquatic bencher

josecelano commented 6 months ago

Relates to: https://github.com/greatest-ape/aquatic/pull/191

I've added the Torrust Tracker to the Aquatic Bencher.

The Aquatic Bencher is a Rust crate to run load tests for UDP trackers.

Now we can run the same benchmarking test for all these UDP trackers:

- aquatic
- opentracker
- chihaya
- torrust-tracker

I'm now running the test on my machine, although it's not working for opentracker yet (I don't know why). I will post the final results.

For the time being, these are the results with 2 cores:

## Tracker cores: 2 (cpus: 0-1,16-17)
### aquatic_udp run (socket workers: 2) (load test workers: 8, cpus: 8-15,24-31)
- Average responses per second: 712,660
- Average tracker CPU utilization: 191%
- Peak tracker RSS: 195.2 MiB
### aquatic_udp run (socket workers: 2) (load test workers: 12, cpus: 4-15,20-31)
- Average responses per second: 773,195
- Average tracker CPU utilization: 191%
- Peak tracker RSS: 196 MiB
### aquatic_udp (io_uring) run (socket workers: 2) (load test workers: 8, cpus: 8-15,24-31)
- Average responses per second: 779,341
- Average tracker CPU utilization: 191%
- Peak tracker RSS: 238.5 MiB
### aquatic_udp (io_uring) run (socket workers: 2) (load test workers: 12, cpus: 4-15,20-31)
- Average responses per second: 772,850
- Average tracker CPU utilization: 190%
- Peak tracker RSS: 236.7 MiB
### opentracker run (workers: 2) (load test workers: 8, cpus: 8-15,24-31)
- Average responses per second: 0
- Average tracker CPU utilization: 0%
- Peak tracker RSS: 1.9 MiB
### opentracker run (workers: 2) (load test workers: 12, cpus: 4-15,20-31)
- Average responses per second: 0
- Average tracker CPU utilization: 0%
- Peak tracker RSS: 1.9 MiB
### chihaya run () (load test workers: 8, cpus: 8-15,24-31)
- Average responses per second: 189,764
- Average tracker CPU utilization: 369%
- Peak tracker RSS: 8.4 GiB
### chihaya run () (load test workers: 12, cpus: 4-15,20-31)
- Average responses per second: 192,124
- Average tracker CPU utilization: 380%
- Peak tracker RSS: 7.1 GiB
### torrust-tracker run () (load test workers: 8, cpus: 8-15,24-31)
- Average responses per second: 424,511
- Average tracker CPU utilization: 375%
- Peak tracker RSS: 188.7 MiB
### torrust-tracker run () (load test workers: 12, cpus: 4-15,20-31)
- Average responses per second: 416,752
- Average tracker CPU utilization: 375%
- Peak tracker RSS: 188.9 MiB

With: load test workers: 8, cpus: 8-15,24-31:

| tracker | average responses per second |
|---|---|
| aquatic (io_uring) | 779,341 |
| aquatic | 712,660 |
| torrust-tracker | 424,511 |
| chihaya | 189,764 |

With: load test workers: 12, cpus: 4-15,20-31:

| tracker | average responses per second |
|---|---|
| aquatic (io_uring) | 772,850 |
| aquatic | 773,195 |
| torrust-tracker | 416,752 |
| chihaya | 192,124 |

@da2ce7 @mickvandijke I'm not using the DashMap repository implementation.

josecelano commented 6 months ago

The result here with 2 cores is a little bit worse than mine.

greatest-ape commented 6 months ago

I think the virtual server I was using for that benchmark has quite a bit lower per-core performance than a good recent desktop CPU, which could explain the discrepancy. The general pattern seems to hold, though, with e.g. aquatic performing around 4 times better than chihaya with 2 cores.
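
As a rough check on the "around 4 times" figure, the ratios can be recomputed from the 2-core numbers posted earlier in this thread (8 load test workers). This is only a sketch: the `speedup` helper is hypothetical and not part of the bencher.

```rust
/// Hypothetical helper: ratio between two throughput figures
/// (average responses per second).
fn speedup(tracker_rps: u64, baseline_rps: u64) -> f64 {
    tracker_rps as f64 / baseline_rps as f64
}

fn main() {
    // Figures copied from the 2-core run with 8 load test workers.
    let chihaya: u64 = 189_764;
    let results: [(&str, u64); 3] = [
        ("aquatic (io_uring)", 779_341),
        ("aquatic", 712_660),
        ("torrust-tracker", 424_511),
    ];

    for (name, rps) in results {
        println!("{name}: {:.2}x chihaya", speedup(rps, chihaya));
    }
}
```

On these numbers, aquatic comes out between roughly 3.8 and 4.1 times chihaya, depending on whether io_uring is used.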

josecelano commented 6 months ago

I'm getting some errors trying to run a complete test:

2024-03-19-bencher.txt.

For example:

panic: too many concurrent operations on a single file or socket (max 1048575)

Couldn't send packet: Os { code: 111, kind: ConnectionRefused, message: "Connection refused" }

greatest-ape commented 6 months ago

Yeah, chihaya tends to crash under heavy load. There is an issue opened about it somewhere, probably in the chihaya or Go repos. In my recollection it is amazingly an actual Go runtime limitation that they didn’t want to fix.

Another thing of interest here is that the default CpuMode gives an unfair advantage to chihaya and torrust since they open one worker per thread, while the benchmark config for aquatic and opentracker opens one per core. You can see this by looking at the average CPU utilization stats for the lower core counts. This could be solved by adding entries to test with double the current worker count too for aquatic/opentracker.
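
One rough way to see the effect in the posted numbers is to divide throughput by the average CPU utilization. This is a minimal sketch, assuming utilization scales linearly with the number of busy hardware threads; the `rps_per_cpu` helper is hypothetical, and the normalization does not replace rerunning the benchmark with adjusted worker counts.

```rust
/// Hypothetical helper: responses per second per fully used hardware thread,
/// where 100% utilization corresponds to one busy thread.
fn rps_per_cpu(avg_rps: u64, cpu_utilization_percent: f64) -> f64 {
    avg_rps as f64 / (cpu_utilization_percent / 100.0)
}

fn main() {
    // (tracker, average responses per second, average CPU utilization in %)
    // taken from the 2-core run with 8 load test workers.
    let runs: [(&str, u64, f64); 3] = [
        ("aquatic_udp", 712_660, 191.0),
        ("torrust-tracker", 424_511, 375.0),
        ("chihaya", 189_764, 369.0),
    ];

    for (name, rps, cpu) in runs {
        println!("{name}: {:.0} responses/s per 100% CPU", rps_per_cpu(rps, cpu));
    }
}
```

aquatic_udp used about two threads' worth of CPU (191%), while torrust-tracker and chihaya used close to four (375% and 369%), so the raw responses-per-second figures compare different amounts of CPU time.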

The reason why I haven’t yet is that the current setup is meant to enable somewhat fair testing on virtual machines where hyperthreads don’t really correspond to real hyperthreads, but for that to work, SubsequentOnePerPair mode must be used.
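
To illustrate the difference between one worker per core and one worker per thread, here is a sketch that lists the logical CPUs behind a number of physical cores. It assumes that the two hyperthreads of core `i` are CPUs `i` and `i + 16`, which is inferred from the "cpus: 0-1,16-17" line in the results above. The `cpus_for_cores` helper is hypothetical and does not reproduce the bencher's `CpuMode` logic.

```rust
/// Hypothetical helper: logical CPUs covering `count` physical cores starting
/// at `first_core`. `sibling_offset` is the assumed distance between the two
/// hyperthreads of one core (16 on the machine used in this thread).
fn cpus_for_cores(
    first_core: usize,
    count: usize,
    sibling_offset: usize,
    both_threads: bool,
) -> Vec<usize> {
    // First hyperthread of each core.
    let mut cpus: Vec<usize> = (first_core..first_core + count).collect();
    if both_threads {
        // Second hyperthread of each core.
        cpus.extend((first_core..first_core + count).map(|core| core + sibling_offset));
    }
    cpus
}

fn main() {
    let per_core = cpus_for_cores(0, 2, 16, false);
    let per_thread = cpus_for_cores(0, 2, 16, true);

    println!("one worker per core: {} workers on {:?}", per_core.len(), per_core);
    println!("one worker per thread: {} workers on {:?}", per_thread.len(), per_thread);
}
```

With 2 tracker cores, one worker per core gives 2 workers, while one worker per thread gives 4. That matches the roughly 191% versus 375% CPU utilization in the results above.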

josecelano commented 6 months ago

> Yeah, chihaya tends to crash under heavy load. There is an issue opened about it somewhere, probably in the chihaya or Go repos. In my recollection it is amazingly an actual Go runtime limitation that they didn’t want to fix.
>
> Another thing of interest here is that the default CpuMode gives an unfair advantage to chihaya and torrust since they open one worker per thread, while the benchmark config for aquatic and opentracker opens one per core. You can see this by looking at the average CPU utilization stats for the lower core counts. This could be solved by adding entries to test with double the current worker count too for aquatic/opentracker.
>
> The reason why I haven’t yet is that the current setup is meant to enable somewhat fair testing on virtual machines where hyperthreads don’t really correspond to real hyperthreads, but for that to work, SubsequentOnePerPair mode must be used.

Hi @greatest-ape, thank you for your feedback. Does that mean that ideally the test should be run on a virtual machine to be accurate?

greatest-ape commented 6 months ago

In its current exact state, yes, but I can make some minor modifications to make it work well locally too :-)

greatest-ape commented 5 months ago

I’ve done those changes now.

josecelano commented 1 month ago

We have to update the configuration in https://github.com/greatest-ape/aquatic/blob/master/crates/bencher/src/protocols/udp.rs#L448-L508

The env var name and the configuration file have changed.

josecelano commented 1 month ago

I've opened a PR on the Aquatic repo to update the tracker config in the Bencher.

josecelano commented 1 month ago

PR merged.

torrust / torrust-tracker

Run benchmarking test with the aquatic bencher #735