Igalia / snabb

Snabb Switch: Fast open source packet processing
Apache License 2.0

Intermittent worker crashes #1194

Open dpino opened 5 years ago

dpino commented 5 years ago

This is a bug I've seen a few times before. I don't know how to reproduce it; it just happens occasionally.

When running a snabb executable I got the following error:

$ sudo ./snabb lwaftr run --cpu 11 --conf lwaftr.conf
lwaftr.conf: loading compiled configuration from lwaftr.o
lwaftr.conf: compiled configuration is up to date.
Binding data-plane PID 1283 to CPU 11.
Bound main process to NUMA node: 1 (CPU 6)
snabb[1283]: segfault at (nil) ip 0x4471f2 sp 0x7ffce75ad6e0 code 1 errno 0

The process that crashes is the worker: PID 1283 in the segfault line is the data-plane process that was bound to CPU 11.
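
For triage, the fields of the crash line can be extracted mechanically. The following is a minimal sketch, not part of Snabb: the regular expression is derived only from the single log line in this report, `parse_segfault` is a hypothetical helper name, and it assumes that `code` is the `si_code` of the SIGSEGV (on Linux, 1 is `SEGV_MAPERR` and 2 is `SEGV_ACCERR`).

```python
import re

# Pattern derived from the one log line in this report; other builds may differ.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>\(nil\)|0x[0-9a-fA-F]+) "
    r"ip (?P<ip>0x[0-9a-fA-F]+) sp (?P<sp>0x[0-9a-fA-F]+) "
    r"code (?P<code>\d+) errno (?P<errno>\d+)"
)

# Linux si_code values for SIGSEGV (assumption: "code" is si_code).
SEGV_CODES = {1: "SEGV_MAPERR", 2: "SEGV_ACCERR"}


def parse_segfault(line):
    """Return the fields of a segfault log line as a dict, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr = m.group("addr")
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        # "(nil)" is how a NULL pointer is printed with %p by glibc.
        "addr": 0 if addr == "(nil)" else int(addr, 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "code": SEGV_CODES.get(int(m.group("code")), "unknown"),
        "errno": int(m.group("errno")),
    }


if __name__ == "__main__":
    line = "snabb[1283]: segfault at (nil) ip 0x4471f2 sp 0x7ffce75ad6e0 code 1 errno 0"
    print(parse_segfault(line))
```

Under those assumptions the line describes an access to an unmapped address 0, that is, a NULL pointer dereference at instruction pointer 0x4471f2. If the binary is not position-independent and still carries symbols (both unverified here), `addr2line -e ./snabb 0x4471f2` may map that address to a source location.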

Before stumbling into this error, I was running the lwAFTR successfully for quite a long time (more than 20 min). After rebooting the system, the same executable worked OK again:

$ sudo ./snabb lwaftr run --cpu 11 --conf lwaftr.conf
lwaftr.conf: loading compiled configuration from lwaftr.o
lwaftr.conf: compiled configuration is up to date.
Binding data-plane PID 3179 to CPU 11.
Bound main process to NUMA node: 1 (CPU 6)
[mounting /var/run/snabb/hugetlbfs]
$ ls -l /var/run/snabb/
total 0
drwxr-xr-x 6 root root 420 Dec  7 15:15 3175
drwxr-xr-x 5 root root 180 Dec  7 15:15 3179
drwxr-xr-x 2 root root   0 Dec  7 15:15 hugetlbfs
drwxr-xr-x 3 root root  60 Dec  7 15:15 intel-mp

Since a reboot makes the same executable work again, the error seems to be caused by some system state becoming unstable after the lwAFTR has been running for some time.
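
One piece of system state that could be inspected the next time this happens is the runtime directory shown in the listing above. The sketch below is only a diagnostic idea, not a confirmed cause: it lists numeric entries under `/var/run/snabb` whose PID no longer exists. The helper names `pid_alive` and `stale_pid_dirs` are hypothetical, and the assumption that numeric directory names are PIDs comes only from the `ls` output in this report.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_pid_dirs(run_dir="/var/run/snabb", alive=pid_alive):
    """List numeric entries in run_dir whose PID is no longer running."""
    stale = []
    for name in sorted(os.listdir(run_dir)):
        # Non-numeric entries such as hugetlbfs or intel-mp are skipped.
        if name.isdigit() and int(name) > 0 and not alive(int(name)):
            stale.append(name)
    return stale


if __name__ == "__main__":
    if os.path.isdir("/var/run/snabb"):
        print(stale_pid_dirs())
```

Comparing the output right after a crash with the output after a clean start would show whether leftover per-process directories correlate with the failure.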