lf-edge / eden

Eden is where EVE and Adam get tried and tested:
https://projecteve.dev
Apache License 2.0

Networking test flakiness #950

Closed (uncleDecart closed 2 months ago)

uncleDecart commented 9 months ago

This is a tracker for collecting all failed runs of the Networking test suite in GitHub Actions.

How to contribute?

If you see a failure in a GitHub Actions run, please add it to this table.

How to find the Eden version?

Open the workflow file of the run and see which eden tag is invoked; here, for example, it is 0.9.3.

How to find the EVE version?

In the setup job of the failed workflow, find the eve image parameter in the inputs. For example, here it is evebuild/pr:3698. If the value contains evebuild/pr, put PR in the EVE version column.
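
The rule above can be sketched as a small shell helper. This is only an illustration: the function name eve_version_for_table is hypothetical and not part of eden, and the second example image (lfedge/eve:11.7.0) is an assumed form of a release image parameter, not one taken from a specific run.

```shell
#!/bin/sh
# Hypothetical helper (not part of eden): map the workflow's EVE image
# parameter to the value recorded in the "EVE version" column.
eve_version_for_table() {
    image="$1"
    case "$image" in
        *evebuild/pr*) echo "PR" ;;            # PR builds are recorded as "PR"
        *:*)           echo "${image##*:}" ;;  # otherwise keep the tag after the last ':'
        *)             echo "$image" ;;        # no tag separator: pass through unchanged
    esac
}

eve_version_for_table "evebuild/pr:3698"    # prints PR
eve_version_for_table "lfedge/eve:11.7.0"   # prints 11.7.0 (assumed image form)
```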

Link to run | Eden version | EVE version | Test
1           | 0.9.5        | PR          | switch_net_vlans
2           | 0.9.3        | PR          | ?
3           | 0.9.5        | 11.7.0      | ?

How to run the suite locally

Clean the previous build and add a default config with debug-level logging

make clean && make build-tests
./eden config add default
./eden config set default --key eve.log-level --value debug

(Optional) Set the EVE tag to the version you're testing

./eden config set default --key eve.tag --value 11.3.0

Set up and run a test

./eden setup
./dist/bin/eden+ports.sh 2223:2223
./eden start
./eden eve onboard
./eden test ./tests/workflow -s networking.tests.txt -v debug
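
Since the failures tracked here are intermittent, it may help to repeat the test command several times and count how often it fails. The following is a minimal sketch under assumptions: count_failures is a hypothetical helper, not an eden command, and whether the EVE instance needs to be reset between runs is not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not part of eden): run a command N times and report
# how many runs failed, to estimate how flaky a test is.
count_failures() {
    runs="$1"; shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
}

# Example with the test command from the steps above (needs a set-up eden):
# count_failures 5 ./eden test ./tests/workflow -s networking.tests.txt -v debug
```

To keep the debug output for later inspection, the redirect to /dev/null could be replaced with a per-run log file.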
milan-zededa commented 8 months ago

switch_net_vlans fails because the shim VM of the application app2 crashes. This started happening between EVE versions 11.7.0 and 11.8.0.

content: [    1.120978][    T1] Run /init as init process
content: [    1.312358][  T354] 8021q: adding VLAN 0 to HW filter on device eth0
content: udhcpc: started, v1.35.0
content: udhcpc op deconfig interface eth0
content: udhcpc: broadcasting discover
content: udhcpc: broadcasting select for 10.1.0.4, server 10.1.0.1
content: udhcpc: lease of 10.1.0.4 obtained from 10.1.0.1, lease time 3600
content: udhcpc op bound interface eth0
content: [    1.700154][  T376] 8021q: adding VLAN 0 to HW filter on device eth1
content: udhcpc: started, v1.35.0
content: udhcpc op deconfig interface eth1
content: udhcpc: broadcasting discover
content: [    1.806495][   T16] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
content: udhcpc: broadcasting discover
content: udhcpc: broadcasting select for 10.2.100.158, server 10.2.100.1
content: udhcpc: lease of 10.2.100.158 obtained from 10.2.100.1, lease time 3600
content: udhcpc op bound interface eth1
content: Mount /mnt/modules as /lib/modules, result 0
content: [    4.942553][  T390] wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
content: [    4.942742][  T390] wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
content: Modprobe wireguard, result 0
content: Executing /mount_disk.sh
content: /init: /mnt/environment: line 4: syntax error: unterminated quoted string
content: [    4.990780][    T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200
content: [    4.991109][    T1] CPU: 0 PID: 1 Comm: init Not tainted 6.1.38-linuxkit-3e39cb4a2fc4 #1
content: [    4.991368][    T1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-dirty-20240123_213457-buildkitsandbox-Xen 04/01/2014
content: [    4.991765][    T1] Call Trace:
content: [    4.991944][    T1]  <TASK>
content: [    4.992026][    T1]  dump_stack_lvl+0x45/0x5e
content: [    4.992171][    T1]  panic+0x10f/0x2b2
content: [    4.992297][    T1]  do_exit+0x1c7/0x911
content: [    4.992397][    T1]  ? __fget_light+0x29/0x4d
content: [    4.992525][    T1]  do_group_exit+0x7a/0x7a
content: [    4.992789][    T1]  __x64_sys_exit_group+0x14/0x14
content: [    4.993056][    T1]  do_syscall_64+0x6a/0x84
content: [    4.993184][    T1]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
content: [    4.993581][    T1] RIP: 0033:0x7fea28babf2f
content: [    4.993705][    T1] Code: 88 04 00 48 01 c7 e9 03 39 00 00 64 48 8b 04 25 00 00 00 00 48 8b b0 a8 00 00 00 e9 c0 ff ff ff 48 63 ff b8 e7 00 00 00 0f 05 <ba> 3c 00 00 00 48 89 d0 0f 05 eb f9 48 83 ec 38 bf 06 00 00 00 e8
content: [    4.994081][    T1] RSP: 002b:00007ffd45613298 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
content: [    4.994240][    T1] RAX: ffffffffffffffda RBX: 00007fea28b8e004 RCX: 00007fea28babf2f
content: [    4.994398][    T1] RDX: 00007fea28c268c0 RSI: 0000000000000000 RDI: 0000000000000002
content: [    4.994574][    T1] RBP: 0000000000000001 R08: 0000000000002000 R09: 00007fea28c20bb1
content: [    4.994721][    T1] R10: 000000000000001a R11: 0000000000000202 R12: 0000564e5eca6944
content: [    4.994867][    T1] R13: 00007ffd45613570 R14: 00007fea28c25ac0 R15: 00007fea28c25b48
content: [    4.995018][    T1]  </TASK>
content: [    4.995233][    T1] Kernel Offset: 0x31000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
content: [    4.995449][    T1] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200 ]---
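
In the log above, the line just before the panic (the /mnt/environment syntax error) is the one that explains why init died. A rough sketch for pulling that line out of a saved console log follows. Both function names are hypothetical. The second helper assumes the usual Linux wait-status encoding, where the exit status sits in bits 8 to 15 of the exitcode printed by the kernel; under that assumption 0x00000200 corresponds to exit status 2.

```shell
#!/bin/sh
# Hypothetical helper: read a console log on stdin and print the last line
# seen before the first "Kernel panic" line. Prints nothing if no panic.
line_before_panic() {
    awk '/Kernel panic/ { if (prev != "") print prev; exit } { prev = $0 }'
}

# Hypothetical helper: extract the process exit status from the exitcode
# value in the panic message (assumes status is stored in bits 8-15).
exit_status_from_panic_code() {
    echo $(( ($1 >> 8) & 0xff ))
}

exit_status_from_panic_code 0x00000200   # prints 2
# Usage with a saved log file (file name is an assumption):
# line_before_panic < console.log
```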
uncleDecart commented 2 months ago

Stabilised. I will open new issues if something occurs again.