appneta / tcpreplay

Pcap editing and replay tools for *NIX and Windows - Users please download source from
http://tcpreplay.appneta.com/wiki/installation.html#downloads

Test suite "bus error" on armhf #725

Closed (cbiedl closed 1 year ago)

cbiedl commented 2 years ago

This is what blocks tcpreplay from migrating to Debian testing, see https://bugs.debian.org/1009199

First of all, sorry for not bringing this up here earlier. I recall having already analyzed the situation, but as there is no bug report here, I must assume I never sent it. Too bad.

So, this is the story: On armhf (but not armel), the tcpreplay test suite fails with various bus errors, for example:

[tcprewrite] Seed IP test: /bin/bash: line 1: 767660 Bus error               ../src/tcprewrite -i ./test.pcap -o test.rewrite_seed1 -s 55 >> test.log 2>&1

Bisecting led to

af0d523122ebdfb459eec9b9e1dd7dfedddc82cb is the first bad commit
commit af0d523122ebdfb459eec9b9e1dd7dfedddc82cb
Author: Fred Klassen <fklassen@appneta.com>
Date:   Thu Jan 27 14:26:32 2022 -0800

    Bug #695 remove FORCE_ALIGN

... but I reckon simply reverting that commit is not a sane way to deal with it. Perhaps in Debian, but certainly not upstream. So I'm asking you to find a generic solution for this, for example by enhancing the autoconf check so that it never triggers on macOS.
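For readers unfamiliar with this failure mode: a bus error on ARM typically comes from dereferencing a multi-byte pointer that is not naturally aligned, which happens easily when header fields sit at odd offsets in a capture buffer. The sketch below is not tcpreplay code; `load_be32` is a hypothetical helper that shows the usual portable workaround, which is to copy the bytes with `memcpy` instead of casting the pointer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of tcpreplay): read a big-endian 32-bit
 * value from an arbitrary, possibly unaligned, position in a buffer. */
static uint32_t load_be32(const unsigned char *p)
{
    unsigned char b[4];

    /* memcpy has no alignment requirement, unlike *(const uint32_t *)p,
     * which may raise SIGBUS on CPUs that enforce strict alignment. */
    memcpy(b, p, sizeof(b));

    /* Assemble in network byte order, independent of host endianness. */
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

Calling `load_be32(buf + 1)` is safe even though `buf + 1` is an odd address, because the helper never dereferences a `uint32_t` pointer.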

cbiedl commented 2 years ago

Hm, wait: that was actually reverted later (in 084ec86), but the configure check does not do the right thing on armhf (from config.log):

configure:25049: checking for requires strict byte alignment
configure:25117: result: no

cbiedl commented 2 years ago

After (re-)adding arm* to the list of architectures at line 1720 of configure.ac, the bus errors no longer appear. However, one test still fails:

[tcprewrite] L7 fuzzing test: make[1]: *** [Makefile:1062: rewrite_l7fuzzing] Error 255

Test log shows:

Fatal Error: Error rewriting packets: From edit_packet.c:fix_ipv4_checksums() line 74:
Invalid packet: Expected IPv4 packet: got 0: pkt=9
ksum TCP with insufficient L4 data
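The "got 0" in that message suggests that the version nibble of the supposed IPv4 header was read as 0 rather than 4, as would happen if the layer 3 pointer landed on the wrong bytes. A minimal sketch of such a check follows; `ip_version` is a hypothetical helper, not the code in edit_packet.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the tcpreplay implementation): return the IP
 * version stored in the high nibble of the first header byte, or -1 if
 * there is no data to inspect. */
static int ip_version(const uint8_t *l3, size_t len)
{
    if (l3 == NULL || len < 1)
        return -1;
    return l3[0] >> 4;
}
```

A header that starts with 0x45 yields 4, while a pointer into zeroed bytes yields 0, which would match the "got 0" text in the log.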

If you have an idea about this, I'm happy to experiment more. I could not find the regression because, unfortunately, many commits since the 4.3.4 release fail to build. The error message is:

In file included from ./tcpedit_stub.h:27,
                 from plugins/dlt_en10mb/en10mb.c:28:
plugins/dlt_en10mb/en10mb.c: In function ‘dlt_en10mb_parse_opts’:
./tcpedit_stub.h:87:52: error: ‘INDEX_OPT_ENET_VLAN_PROTO’ undeclared (first use in this function); did you mean ‘INDEX_OPT_ENET_VLAN_PRI’?
   87 | #define         DESC(n) (tcpedit_tcpedit_optDesc_p[INDEX_OPT_## n])
      |                                                    ^~~~~~~~~~
./tcpedit_stub.h:89:41: note: in expansion of macro ‘DESC’
   89 | #define     HAVE_OPT(n) (! UNUSED_OPT(& DESC(n)))
      |                                         ^~~~
plugins/dlt_en10mb/en10mb.c:379:17: note: in expansion of macro ‘HAVE_OPT’
  379 |             if (HAVE_OPT(ENET_VLAN_PROTO)) {
      |                 ^~~~~~~~

fklassen commented 1 year ago

I get my armhf image from here and cannot reproduce it.

My approach will be to restore FORCE_ALIGN for non-macOS arm and get fuzzing working on armhf.
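In C terms, the condition "non-macOS arm" could be expressed with predefined compiler macros, as in the sketch below. The real decision is made in configure.ac, so this is only an illustration. `FORCE_ALIGN_SKETCH` and `force_align_enabled` are made-up names, while `__arm__` and `__APPLE__` are the usual GCC/Clang predefined macros. Note that `__arm__` only covers 32-bit ARM targets.

```c
/* Illustration only: tcpreplay decides this in configure.ac, not here. */
#if defined(__arm__) && !defined(__APPLE__)
#define FORCE_ALIGN_SKETCH 1   /* 32-bit ARM, not an Apple platform */
#else
#define FORCE_ALIGN_SKETCH 0   /* everything else: use packet data as-is */
#endif

/* Hypothetical accessor so callers can branch at run time if needed. */
static int force_align_enabled(void)
{
    return FORCE_ALIGN_SKETCH;
}
```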

root@debian-armhf:/home/user/src/tcpreplay-4.4.1# uname -a
Linux debian-armhf 3.2.0-4-vexpress #1 SMP Debian 3.2.51-1 armv7l GNU/Linux
root@debian-armhf:/home/user/src/tcpreplay-4.4.1# make test
echo Making test in ./test
Making test in ./test
cd ./test && make test
make[1]: Entering directory `/home/user/src/tcpreplay-4.4.1/test'
NOTICE: Tests must be run as root
Sending traffic on 'eth0' and 'eth0'
[tcpprep] Auto/Router mode test:        OK
[tcpprep] Auto/Bridge mode test:        OK
[tcpprep] Auto/Client mode test:        OK
[tcpprep] Auto/Server mode test:        OK
[tcpprep] Auto/First mode test:         OK
[tcpprep] CIDR mode test:           OK
[tcpprep] Regex mode test:          OK
[tcpprep] Port mode test:           OK
[tcpprep] MAC mode test:            OK
[tcpprep] Comment mode test:            OK
[tcpprep] Print info mode test:         OK
[tcpprep] Print comment mode test:      OK
[tcpprep] Config mode test:             OK
[tcpprep] MAC reverse mode test:        OK
[tcpprep] CIDR reverse mode test:       OK
[tcpprep] Regex reverse mode test:      OK
[tcpprep] exclude packets test:         OK
[tcpprep] include packets test:         OK
[tcpprep] include source test:          OK
[tcpprep] include destination test:         OK
[tcpreplay] Basic test:             OK
[tcpreplay] Cache test:             OK
[tcpreplay] Packets/sec test:           OK
[tcpreplay] Mbps test:              OK
[tcpreplay] Topspeed test:          OK
[tcpreplay] Config file/VLAN add test:      OK
[tcpreplay] Multiplier test:            OK
[tcpreplay] Packets/sec Multiplier test:    OK
[tcpreplay] Precache test:          OK
[tcpreplay] Statistics test:            OK
[tcpreplay] Dual file test:             OK
[tcpreplay] Maximum sleep test:         OK
[tcprewrite] Portmap test:          OK
[tcprewrite] Portmap range test:        OK
[tcprewrite] Endpoint test:             OK
[tcprewrite] Pseudo NAT test:           OK
[tcprewrite] Truncate test:             OK
[tcprewrite] Pad test:              OK
[tcprewrite] Seed IP test:          OK
[tcprewrite] Src/Dst MAC test:          OK
[tcprewrite] Layer2 test:           OK
[tcprewrite] Config/VLAN Add test:      OK
[tcprewrite] Skip bcast test:           OK
[tcprewrite] DLT User test:             OK
[tcprewrite] DLT Cisco HDLC test:       OK
[tcprewrite] VLAN 802.1ad test:         OK
[tcprewrite] VLAN Delete test:          OK
[tcprewrite] Remove EFCS:           OK
[tcprewrite] Force TTL:             OK
[tcprewrite] Increase TTL:          OK
[tcprewrite] Reduce TTL:            OK
[tcprewrite] TOS test:              OK
[tcprewrite] MTU Truncate test:         OK
[tcprewrite] Substitute Src/Dst MAC test:   OK
[tcprewrite] Seeded MAC test:           OK
[tcprewrite] Seeded Keep MAC test:      OK
[tcprewrite] L7 fuzzing test:           OK
[tcprewrite] TCP sequence test:         OK
[tcprewrite] Fix checksum test:         OK
[tcprewrite] Fix length and pad test:       OK
[tcprewrite] Fix length and truncate test:  OK
[tcprewrite] Fix length and delete test:    OK
make[1]: Leaving directory `/home/user/src/tcpreplay-4.4.1/test'

fklassen commented 1 year ago

PR #742

This fix restores forced alignment for ARM processors and fixes L7 fuzzing for FORCE_ALIGN builds. Although I was not able to reproduce the original issues, it appears that this will fix them.
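The general technique behind a forced-alignment build can be sketched as follows: before touching headers through typed pointers, check whether the pointer is suitably aligned and, if not, copy the data into an aligned scratch buffer. This is an illustration under stated assumptions, not the code from PR #742; `is_aligned` and `align_packet` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if p is a multiple of align (a power of 2). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: return a 4-byte-aligned view of the packet.
 * If pkt is already aligned it is returned unchanged; otherwise the data
 * is copied into the caller-supplied scratch buffer, which is typed as
 * uint32_t so that it is suitably aligned. scratch_len is in bytes.
 * Returns NULL if the scratch buffer is missing or too small. */
static const uint8_t *align_packet(const uint8_t *pkt, size_t len,
                                   uint32_t *scratch, size_t scratch_len)
{
    if (is_aligned(pkt, sizeof(uint32_t)))
        return pkt;
    if (scratch == NULL || scratch_len < len)
        return NULL;
    memcpy(scratch, pkt, len);
    return (const uint8_t *)scratch;
}
```

Which pointer actually needs aligning depends on the link-layer header length (a 14-byte Ethernet header shifts the IP header off a 4-byte boundary), a detail this sketch deliberately ignores.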

Verification

cbiedl commented 1 year ago

Thanks, successfully tested 4.4.2-beta1 on armhf and mipsel in Debian unstable.

About armhf, I reckon (but don't quote me on that) this depends on the actual CPU. Native armhf CPUs like the Cortex-A7 do fine, but the Cortex-A72 in a Raspberry Pi 4 (in an armhf chroot) enforces strict alignment. Unfortunately I cannot verify this, but I have heard talk along those lines.

fklassen commented 1 year ago

Very interesting. I have never been able to verify any alignment issues in my test environment, so this is good news. Maybe I should get the Cortex-A72 chroot environment.

I am waiting for news on #724 to determine if it is a bug or a feature. If it is a feature, I will push it out to 4.5 and release 4.4.2. In any case, the fix should be GA within the next couple of weeks.