Closed m-stein closed 5 months ago
@chelmuth I've tried to test this one as good as possible and would recommend it for the release: 28024b7203 Revert "nic_router: incremental L4 checksum updates"
@m-stein To answer our open question from yesterday: Linux TCP/IP statistics can be accessed by the follwing command lines.
netstat --statistics
nstat -sz
nstat -sz |& grep -i err
@m-stein do you see any chance to address this issue after the Sculpt 24.04 release (or at least update with your current findings)?
My plan is to finish the work at the File Vault next week. Should I prioritize this issue next?
Let's discuss the list of open issues offline for prioritization.
If you tackle this issue I'd appreciate a unit/stand-alone test of the implementation, which should be not much effort. Wireshark/pcap may be handy.
I played around with tshark, tcpdump, trafgen, randpkt and the trace_recorder_pcapng test this morning. I managed to write a short test that detects bad checksums captured in a nic_router scenario and learned in principle how to generate random but valid traffic. Now I'm a bit uncertain in which direction to proceed. One way would be to set up a real-world scenario (with nic_router) that generates diverse but reproducible traffic that we capture. Alternatively, we could generate random, valid packets with lx tooling and write a test component that updates the packet checksums both completely and incrementally with our algorithms (no nic_router). I assume that reading pcap into session packets shouldn't be too complicated. Any opinions?
In my opinion, a unit test of the algorithms suffices as this was missing in the past and let to a broken component. The issue was unnoticed for a while and only discovered in @skalk's TFTP redirection scenario. My intention is to avoid repeating this sloppiness.
@chelmuth Commit d96632ab29 implements the unit test.
@m-stein the run script reproducibly fails for SEED=2812065779
and SEED=991351508
like follows (current staging).
Error: number of checksums in bin/internet_checksum_dir/output.pcap (20001) differs from number of checked checksums (20000)
bin/internet_checksum_dir/tmp contains the following line (which is missing with other seeds).
[Checksum Status: Unverified]
The culprit seems to be the following packet in output.pcap.
Stream Control Transmission Protocol, Src Port: 22000 (22000), Dst Port: 38895 (38895)
Source port: 22000
Destination port: 38895
Verification tag: 0x37b485bf
Checksum: 0x838f40f9 [unverified]
[Checksum Status: Unverified]
SCTP?
tshark -o ip.check_checksum:TRUE -o tcp.check_checksum:TRUE -o udp.check_checksum:TRUE -O ip,udp,tcp,icmp -r bin/input.pcap
skips all non-listed protocols and works for the seeds above. @m-stein does this reflect your intentions?
@chelmuth This looks good to me! Thanks a lot! I also tried to better tailor the output but could not find a way.
What is the status of this issue?
The internet_checksum test attests that the current state works as intended. Currently, there are no concrete plans to revive incremental checksum update.
So, may I close it?
@skalk reported that he cannot fetch a file via TFTP within uboot on our iMX8 test board if the file is provided by his Sculpt 22.10 Debian 11 VM. However, the fetch works fine if the server is on a non-Sculpt machine or a machine with an older Sculpt version. We found that the UDP checksum of the initial TFTP request is broken when it arrives in the VM on Sculpt 22.10 while on Sculpt 22.04 the checksum is fine.
This leads to the conclusion that https://github.com/genodelabs/genode/commit/9a37ccfe291508c5199e10813117b16a5766ff44, although correct for most cases, has a corner case that is broken and that is triggered in this exact TFTP scenario.