refraction-networking / water

WebAssembly Transport Executables Runtime
Apache License 2.0

bug: v0 benchmark sending huge packets up to 32KB #12

Closed gaukas closed 8 months ago

gaukas commented 10 months ago

In #11, the benchmarks for the v0 Dialer/Listener were rewritten, and a larger test suite was added to make the tests more thorough and detailed.

However, the measured throughput somehow came out even better than that of raw TCP connections created with net.Dial() and net.Listener.Accept().

goos: linux
goarch: amd64
pkg: github.com/gaukas/water/transport/v0
cpu: 12th Gen Intel(R) Core(TM) i5-1240P
BenchmarkDialerOutbound
BenchmarkDialerOutbound-16        931250              1207 ns/op         848.10 MB/s
BenchmarkListenerInbound
BenchmarkListenerInbound-16       806343              1510 ns/op         678.06 MB/s
BenchmarkTCPReference
BenchmarkTCPReference-16          533594              1880 ns/op         544.66 MB/s 

This does not look right. Even if the result is accurate, we need an explanation for why a WebAssembly-based transport would be faster than native TCP sockets.
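As a sanity check on the figures above, the per-iteration payload can be recovered from the two printed columns, since Go reports MB/s as bytes processed divided by elapsed time (with 1 MB = 1e6 bytes). The sketch below is only arithmetic on the posted numbers; the helper name `impliedBytesPerOp` is made up for illustration, and the conclusion that the benchmarks set roughly 1024 bytes per op is an inference, not something stated in this thread.

```go
package main

import "fmt"

// impliedBytesPerOp recovers the payload size per iteration from the two
// numbers printed by `go test -bench`: ns/op and MB/s (1 MB = 1e6 bytes).
// Hypothetical helper, used here only to cross-check the posted results.
func impliedBytesPerOp(nsPerOp, mbPerSec float64) float64 {
	// MB/s * 1e6 = bytes per second; ns/op * 1e-9 = seconds per op.
	return mbPerSec * 1e6 * nsPerOp * 1e-9
}

func main() {
	rows := []struct {
		name    string
		nsPerOp float64
		mbps    float64
	}{
		{"BenchmarkDialerOutbound", 1207, 848.10},
		{"BenchmarkListenerInbound", 1510, 678.06},
		{"BenchmarkTCPReference", 1880, 544.66},
	}
	for _, r := range rows {
		// ns/op is rounded in the report, so the result is approximate.
		fmt.Printf("%-26s ~%.0f bytes/op\n", r.name, impliedBytesPerOp(r.nsPerOp, r.mbps))
	}
}
```

All three rows work out to about 1024 bytes per op, so the MB/s gap comes entirely from the difference in ns/op for the same nominal payload, which is what needs explaining.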

Additionally, we will require more test suites, plus more sample WATMs in testdata/v0 to demonstrate:

gaukas commented 10 months ago

The problem is that the v0 benchmark violates TCP packet segmentation and TCP_NODELAY, which could potentially be a bug.

A pcap capture shows that, in the benchmark, the v0 WATM tries to send huge TCP packets, and because both peers are on a local network this actually worked. Because WATMv0 is at least polynomial-time slower than the stdlib, it was actually able to do a much larger read/write (8192B) than the standard-library peer, which in effect concatenated more than one packet into a single operation.
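The coalescing effect described above can be reproduced without WATER at all. The following is a minimal sketch, assuming plain loopback TCP and a reader that is artificially delayed: the writer issues several small writes, and the late reader then tends to pick up many of them in one Read call. The function name `slowReaderDemo` and the chunk sizes are made up for illustration; the exact Read sizes depend on timing and the OS, so no particular output is promised.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// slowReaderDemo sends `writes` chunks of `chunk` bytes over loopback TCP
// while the reader waits `delay` before its first Read. It returns the number
// of bytes returned by each Read call until everything has arrived.
func slowReaderDemo(writes, chunk int, delay time.Duration) ([]int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer ln.Close()
	if tl, ok := ln.(*net.TCPListener); ok {
		// Avoid blocking forever in Accept if the dial below fails.
		_ = tl.SetDeadline(time.Now().Add(5 * time.Second))
	}

	errc := make(chan error, 1)
	go func() {
		c, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			errc <- err
			return
		}
		defer c.Close()
		buf := make([]byte, chunk)
		for i := 0; i < writes; i++ {
			if _, err := c.Write(buf); err != nil {
				errc <- err
				return
			}
		}
		errc <- nil
	}()

	conn, err := ln.Accept()
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	_ = conn.SetReadDeadline(time.Now().Add(delay + 5*time.Second))

	time.Sleep(delay) // simulate a peer that is slow to come back and read

	var sizes []int
	buf := make([]byte, writes*chunk)
	for total := 0; total < writes*chunk; {
		n, err := conn.Read(buf)
		if err != nil {
			return sizes, err
		}
		sizes = append(sizes, n)
		total += n
	}
	return sizes, <-errc
}

func main() {
	sizes, err := slowReaderDemo(8, 1024, 50*time.Millisecond)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// With a delayed reader, expect far fewer Reads than the 8 writes issued.
	fmt.Println("bytes per Read call:", sizes)
}
```

A benchmark that counts operations rather than bytes on the wire will look faster when one side batches like this, which matches the observation that the slower peer ended up with larger reads and writes.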

gaukas commented 8 months ago

Great news: with the pure Go implementation, this problem is gone, as we no longer need buffered writes and all writes go directly to the raw network socket.
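To illustrate the difference between buffered and direct writes in general terms, here is a small sketch using a recording io.Writer in place of the socket. It is not WATER's code: `recorder`, `send`, `directSizes`, and `bufferedSizes` are hypothetical names, and bufio.Writer merely stands in for whatever buffering layer the v0 path used.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// recorder stands in for the raw socket and logs the size of every Write
// that actually reaches it.
type recorder struct{ sizes []int }

func (r *recorder) Write(p []byte) (int, error) {
	r.sizes = append(r.sizes, len(p))
	return len(p), nil
}

// send issues n application-level writes of chunk bytes each to w.
func send(w io.Writer, n, chunk int) error {
	msg := make([]byte, chunk)
	for i := 0; i < n; i++ {
		if _, err := w.Write(msg); err != nil {
			return err
		}
	}
	return nil
}

// directSizes writes straight to the recorder: one underlying Write per call.
func directSizes(n, chunk int) []int {
	r := &recorder{}
	_ = send(r, n, chunk)
	return r.sizes
}

// bufferedSizes writes through a bufio.Writer, which holds data back until
// its buffer fills or Flush is called, merging small writes into larger ones.
func bufferedSizes(n, chunk, bufSize int) []int {
	r := &recorder{}
	bw := bufio.NewWriterSize(r, bufSize)
	_ = send(bw, n, chunk)
	_ = bw.Flush()
	return r.sizes
}

func main() {
	fmt.Println("direct:  ", directSizes(8, 1024))         // [1024 1024 1024 1024 1024 1024 1024 1024]
	fmt.Println("buffered:", bufferedSizes(8, 1024, 8192)) // [8192]
}
```

With direct writes, each application write maps to one write on the underlying connection, so the benchmark and the TCP reference are doing comparable work per operation.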