quic-interop / quic-network-simulator

ns3-based network simulator for QUIC testing

Use `tcpdump` to capture pcaps #133

Closed larseggert closed 2 months ago

larseggert commented 3 months ago

Fixes #61

marten-seemann commented 3 months ago

I remember playing around with tcpdump when we started the project, and I saw missing packets at the end of the transfer. I might have done something wrong back then, or the bug might have been fixed in the 5 years since, so it's still worth trying out now.

larseggert commented 3 months ago

So this seems to work locally, but we should try CI. The unbuffered write hopefully prevents lost packets at the end, but I can't guarantee it. Hence I thought to use this as a fallback, because it should at least reduce the problem of false negatives when there is no ns3 log at all.
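
For readers unfamiliar with the option, a minimal sketch of what an unbuffered capture could look like. It assumes the "unbuffered write" refers to tcpdump's `-U` flag, which writes each packet to the `-w` savefile as it is captured instead of waiting for the output buffer to fill. The helper name `build_capture_cmd`, the interface name, and the output path are hypothetical and not taken from this PR.

```shell
#!/bin/sh
# Sketch only: assemble a packet-buffered tcpdump command line.
# build_capture_cmd, eth0 and the pcap path are illustrative assumptions.

# build_capture_cmd IFACE OUTFILE
# Prints a tcpdump invocation that uses:
#   -U  packet-buffered output (each packet is written to the savefile immediately)
#   -i  the interface to capture on
#   -w  the pcap file to write raw packets to
build_capture_cmd() {
    printf 'tcpdump -U -i %s -w %s\n' "$1" "$2"
}

# Example (not executed here):
#   $(build_capture_cmd eth0 /logs/trace.pcap) &
```

Packet-buffered output narrows the window in which packets sit in a userspace buffer, but it does not by itself guarantee that the final packets are written if the process is killed abruptly.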

larseggert commented 3 months ago

Running locally, it seems the ns3 pcaps can be empty, correct, or non-empty but incomplete. So the tcpdump pcaps seem to always be the better ones to use.

Should I remove the ns3 pcaps entirely?

larseggert commented 3 months ago

@marten-seemann is this waiting on anything?

marten-seemann commented 3 months ago

> Running locally, it seems the ns3 pcaps can be empty, correct, or non-empty but incomplete. So the tcpdump pcaps seem to always be the better ones to use.
>
> Should I remove the ns3 pcaps entirely?

Sounds like it's worth a try. I suggest building the quic-network-simulator image, pushing it to some Docker registry that you control, and then we can do a run of the interop runner using that image, to test if things will actually work.

larseggert commented 3 months ago

https://hub.docker.com/layers/larseggert/quic-network-simulator/tcpdump/images/sha256-7139255cabc4c86e13464b0936fea7918b826f23d3cba8811303bd2280d29c66?context=explore

larseggert commented 3 months ago

And https://hub.docker.com/layers/larseggert/quic-network-simulator/tcpdump/images/sha256-128176f413185e2c427500144e30e0b065f0d6074ef9a09d6f8739636c81be8f?context=explore for an amd64 one.

nhorman commented 3 months ago

If it helps, I tested locally with this PR, and capture files seem to get saved correctly.

I did make one alteration. In the `run.sh` script I added this:

`trap "kill -SIGUSR2 $PID; sleep 1; kill -SIGTERM $PID" TERM`

The tcpdump man page was unclear about whether SIGTERM forces a flush of the capture buffer, so I added a SIGUSR2 signal, as the man page did indicate that would do so. I'm unsure whether it's required.

Edit: backing out the SIGUSR2 change didn't seem to have an impact; dumps are still created.
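
The flush-then-terminate sequence described above can be sketched as a small helper. This is an illustration under assumptions, not the actual `run.sh`: the function name `stop_capture` is hypothetical, and the portable signal names `USR2` and `TERM` are used in place of the `SIG`-prefixed forms.

```shell
#!/bin/sh
# Sketch only: flush, then terminate, a background capture process.
# stop_capture is a hypothetical helper, not part of the real run.sh.

# stop_capture PID
stop_capture() {
    pid="$1"
    # With -w, tcpdump flushes its packet buffer to the savefile on SIGUSR2.
    kill -USR2 "$pid" 2>/dev/null || true
    # Give the process a moment to finish writing.
    sleep 1
    # Ask it to exit; ignore the error if it is already gone.
    kill -TERM "$pid" 2>/dev/null || true
    # Reap the child so no zombie is left behind.
    wait "$pid" 2>/dev/null || true
}

# Example wiring (not executed here), assuming tcpdump was started in the
# background and its PID saved in $PID:
#   trap 'stop_capture "$PID"' TERM
```

Single-quoting the trap body defers expansion of `$PID` until the signal arrives, whereas the double-quoted form expands it when the trap is installed; either works as long as `PID` is already set at that point.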

larseggert commented 2 months ago

@marten-seemann are you still waiting on anything from me?

larseggert commented 2 months ago

We might also be able to drop the promisc bit in the script, but it's not causing harm.