hermit-os / hermit-rs

Hermit for Rust.
Apache License 2.0

Lost TCP packets #602

Closed CarlWachter closed 4 months ago

CarlWachter commented 4 months ago

Description

When running the netbench benchmark, I noticed that TCP packets appear to be "lost" when Hermit acts as the server. That is, when the client sends exactly the amount of data the server expects, the server fails with an error saying that it was expecting more data.

This behaviour only occurs with one configuration: Hermit server, Linux client. If the roles are swapped, everything runs as expected. In my opinion, this rules out an issue in the netbench code, since the Linux and Hermit versions are built from the same code.

When the client sends more than the server expects, e.g. the server expects 6000 bytes x 1000 but the client sends 6000 bytes x 1010, the server runs normally (and the client crashes, since the server closes before the client is finished).

I suspect that some sort of input buffer in the kernel is not being read, as all packets seem to have been sent successfully. It is also worth noting that the amount of "extra" data that needs to be sent is always 54240 bytes, regardless of the expected amount of data, which again points to some sort of buffer issue.
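As a rough check of the numbers above, the following sketch works through the arithmetic for the 6000 bytes x 1000 case. The helper names `total_bytes` and `rounds_to_cover` are made up for this illustration and are not part of netbench, and the 54240-byte figure is taken as exact.

```rust
/// Total payload of a run: `bytes` per round times `rounds`.
fn total_bytes(bytes: u64, rounds: u64) -> u64 {
    bytes * rounds
}

/// Smallest number of whole rounds whose payload is at least `gap` bytes.
fn rounds_to_cover(gap: u64, bytes: u64) -> u64 {
    (gap + bytes - 1) / bytes
}

fn main() {
    let expected = total_bytes(6000, 1000); // what the server waits for
    let sent = total_bytes(6000, 1010); // what the client sends in the working run
    let surplus = sent - expected;
    let gap = 54_240; // figure reported above, assumed exact

    println!("server expects {expected} bytes, client sends {sent} bytes");
    println!("surplus {surplus} bytes, reported gap {gap} bytes");
    println!(
        "extra rounds needed at 6000 bytes each: {}",
        rounds_to_cover(gap, 6000)
    );
}
```

If the gap really is 54240 bytes, nine extra rounds (54000 bytes) would fall just short, so ten extra rounds is the smallest surplus that lets the server finish. That matches the 1010-round run working.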

How to reproduce issue

Build netbench with cargo build --manifest-path benches/netbench/Cargo.toml --bin server-bw --release, then execute:

sleep 10 && cargo run --manifest-path ./Cargo.toml --bin client-bw --release \
    --target x86_64-unknown-linux-gnu \
    -- --nonblocking --address 127.0.0.1 --bytes 6000 --rounds 1000 &
sudo qemu-system-x86_64 -display none -serial stdio \
    -kernel hermit-loader-x86_64 -cpu host -enable-kvm \
    -device isa-debug-exit,iobase=0xf4,iosize=0x04 \
    -initrd target/x86_64-unknown-hermit/release/server-bw \
    -smp 1 -m 1024M \
    -netdev user,id=u1,hostfwd=tcp::7878-:7878,hostfwd=udp::9975-:9975,net=192.168.76.0/24,dhcpstart=192.168.76.9 \
    -device virtio-net-pci,netdev=u1,disable-legacy=on,packed=on,mq=on \
    -append '-- --address 10.0.5.3 --bytes 6000 --rounds 1000'

This will lead to:

thread 'main' panicked at benches/netbench/src/rust-tcp-bw/server.rs:24:37:
called `Result::unwrap()` on an `Err` value: Error { kind: UnexpectedEof, message: "failed to fill whole buffer" }
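
The message "failed to fill whole buffer" is what `std::io::Read::read_exact` returns when the stream reaches end-of-file before the buffer is full. The sketch below reproduces that error on an in-memory stream that is 54240 bytes short, and adds a counting loop that could help measure how many bytes actually arrive. `receive_exact` and `count_until_eof` are hypothetical stand-ins written for this illustration, not the actual netbench server code, and a `Cursor` takes the place of the real TCP stream.

```rust
use std::io::{self, Cursor, ErrorKind, Read};

/// Hypothetical stand-in for the server loop: reads `rounds` buffers of
/// `bytes` each and fails if the stream ends early.
fn receive_exact<R: Read>(stream: &mut R, bytes: usize, rounds: usize) -> io::Result<()> {
    let mut buf = vec![0u8; bytes];
    for _ in 0..rounds {
        stream.read_exact(&mut buf)?;
    }
    Ok(())
}

/// Diagnostic alternative: read until end-of-file and count the bytes seen.
fn count_until_eof<R: Read>(stream: &mut R) -> io::Result<u64> {
    let mut buf = [0u8; 4096];
    let mut total = 0u64;
    loop {
        match stream.read(&mut buf) {
            Ok(0) => return Ok(total), // end of stream
            Ok(n) => total += n as u64,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    // Simulated stream that is 54240 bytes short of 6000 bytes x 1000.
    let short = vec![0u8; 6000 * 1000 - 54_240];

    let err = receive_exact(&mut Cursor::new(&short[..]), 6000, 1000).unwrap_err();
    println!("read_exact failed: kind = {:?}, message = {}", err.kind(), err);

    let received = count_until_eof(&mut Cursor::new(&short[..])).unwrap();
    println!("bytes received before EOF: {received}");
}
```

Logging a byte count like this on the Hermit side, instead of unwrapping `read_exact`, might show whether the shortfall is the same 54240 bytes at the application level. That is only a suggestion for narrowing the problem down.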