Closed OpenSource03 closed 2 years ago
Actually, the issue above was that I tried to run it as a non-root user, lol.
When I tried again on a freshly installed OVH Ryzen 7 with kernel 5.4.0-90-generic, I hit the following issue:
root@ns3169717:/home/ubuntu# xdpfw
Loaded XDP program in DRV/native mode.
Segmentation fault (core dumped)
Can we find the issue somehow?
Hey @OpenSource03,
Are you able to run gdb to see if you can retrieve a backtrace (you can usually install it on Debian/Ubuntu via apt install gdb)?
You should be able to run:
gdb xdpfw
Afterwards, you'll be in the GDB interactive terminal. From there, you should be able to execute run to start the program. Once/if the program seg faults, you should see the address it's seg faulting on along with some data. If you can, run bt to retrieve the backtrace and paste the results here.
Can you also provide the contents of /etc/xdpfw/xdpfw.conf?
Thank you!
Hello! Sorry for the delay.
(gdb) run
Starting program: /usr/bin/xdpfw
Loaded XDP program in DRV/native mode.
Program received signal SIGSEGV, Segmentation fault.
0x000000000041f93f in main ()
(gdb) bt
#0 0x000000000041f93f in main ()
interface = "enp1s0f0";
updatetime = 15;
filters = (
{
enabled = true,
action = 1
}
);
Could be due to OVH's very weird network setup?
Here's my /etc/netplan/50-cloud-init.yaml
network:
    version: 2
    ethernets:
        enp1s0f0:
            accept-ra: false
            addresses:
            - 2001:41da:800:142e::/56
            dhcp4: true
            gateway6: 2001:41da:800:14ff:ff:ff:ff:ff
            match:
                macaddress: l0:50:99:dt:ae:c3
            nameservers:
                addresses:
                - 2001:41d0:3:163::1
            routes:
            { . . . unimportant }
            set-name: enp1s0f0
^^ Some of the addresses above are faked for security reasons.
Could that be due to dhcp4 that it's using to retrieve the ipv4 address or something...?
Hey and no worries! I doubt the NetPlan config is the issue here and the XDP Firewall doesn't try to retrieve anything like IP addresses, gateway IPs, and so on.
I tried installing the XDP Firewall on a vanilla Ubuntu VM with the config you have. However, I did not get a seg fault.
I see it actually loads the XDP program itself using the DRV hook. How long does it take for the program to seg fault afterwards?
Hey, if you can, could you also change the line here:
https://github.com/gamemann/XDP-Firewall/blob/master/Makefile#L33
To:
$(CC) -O0 -g $(LDFLAGS) $(INCS) -o $(BUILDDIR)/$(XDPFWOUT) $(LIBBPFOBJS) $(OBJS) $(SRCDIR)/$(XDPFWSRC)
The -g flag adds debugging symbols and -O0 disables optimizations, so the backtrace maps more cleanly to source lines. Afterwards, do make && sudo make install and then run gdb again and retrieve the backtrace via bt.
Thank you!
Hey @OpenSource03,
I just wanted to follow up to see if you've read my previous replies regarding the segmentation fault issue.
Thank you!
Hello there.
I'm incredibly sorry for the delay, completely forgot. Take a look at the logs below:
root@ns3169717:/home/ubuntu/XDP-Firewall# gdb xdpfw
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from xdpfw...
(gdb) run
Starting program: /usr/bin/xdpfw
Loaded XDP program in DRV/native mode.
Program received signal SIGSEGV, Segmentation fault.
0x000000000041f93f in main (argc=1, argv=0x7fffffffe628) at src/xdpfw.c:505
505 allowed += stats[i].allowed;
(gdb) bt
#0 0x000000000041f93f in main (argc=1, argv=0x7fffffffe628)
at src/xdpfw.c:505
(gdb)
Regards
If you want and have time, I can send you details of the machine so you can test it there.
Should you need that, just let me know about your email/discord or any other communication method you may prefer.
Hey @OpenSource03! No worries on the delay!
This is very strange behavior. I wasn't able to replicate the seg fault itself in my test environment, but I did push a commit (see it referenced above) that has the stats array allocated with the max CPU limit (256) and then the for loop only loops through the CPU count itself. I also check if the index in the stats array is NULL and if it is, skip that CPU ID and log to stderr.
Would you be able to try this commit? You can do git pull to retrieve the latest commit. You shouldn't need to reset anything since I didn't make any changes to the Makefile, but if it complains, you can just do git reset --hard before git pull. If possible, please keep the flags I mentioned above (-g -O0) in the Makefile before compiling and use gdb to launch the program again.
Thank you!
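For readers following along, the approach described in the reply above could look roughly like the sketch below. It is not the code from the actual commit: the struct layout and the function name sum_cpu_stats are assumptions made for illustration, and only the 256 limit and the "buffer sized for the maximum, loop bounded by the CPU count" idea come from the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 256 /* upper bound mentioned in the reply above */

/* Hypothetical per-CPU counter layout; the real struct in xdpfw may differ. */
struct cpu_stats {
    uint64_t allowed;
    uint64_t dropped;
};

/*
 * Sums the first `cpus` entries of a buffer that is expected to hold
 * MAX_CPUS entries. The buffer is sized for the worst case, while the
 * loop only visits the CPUs that are reported as present.
 *
 * Returns 0 on success, or -1 if the arguments cannot be used safely.
 */
int sum_cpu_stats(const struct cpu_stats *stats, int cpus,
                  struct cpu_stats *total)
{
    if (stats == NULL || total == NULL || cpus < 0 || cpus > MAX_CPUS)
        return -1;

    total->allowed = 0;
    total->dropped = 0;

    for (int i = 0; i < cpus; i++) {
        total->allowed += stats[i].allowed;
        total->dropped += stats[i].dropped;
    }

    return 0;
}
```

A caller would declare struct cpu_stats stats[MAX_CPUS], fill it from the per-CPU map lookup, and pass the detected CPU count as cpus. Keeping the buffer at the fixed maximum means a lookup that writes more entries than the detected count still stays inside the array.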
Hi, take a look at this. Yes, I can run the app one time where I get the error above. Now, if I try to do so again, I'll get the error below. The only way to get it back "working" (or at least to the previous error state) is to reboot the machine.
libbpf: Kernel error message: XDP program already attached
Could not attach with DRV/native mode (Device or resource busy)(-16).
libbpf: Kernel error message: native and generic XDP can't be active at the same time
Could not attach with SKB/generic mode (File exists)(-17).
Error attaching XDP program :: File exists (17)
This also translates to the following in gdb:
(gdb) start
Temporary breakpoint 1 at 0x41eb79: file src/xdpfw.c, line 297.
Starting program: /usr/bin/xdpfw
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe628) at src/xdpfw.c:297
297 struct cmdline cmd =
(gdb) bt
#0 main (argc=1, argv=0x7fffffffe628) at src/xdpfw.c:297
The solution from #9 doesn't work in my case. Guess that it's because it's accessing XDP in native mode.
Also, is simply running git pull (which btw works) and make && make install enough to apply the update? Or is something else needed as well?
Also, I'm surprised that it's saying libbpf: Kernel error message: XDP program already attached as I only ran it in gdb, which said that it "killed" the process when I exited.
Hey @OpenSource03,
The program itself attaches the XDP/BPF program to the interface. It needs to be cleaned up manually via this line:
https://github.com/gamemann/XDP-Firewall/blob/master/src/xdpfw.c#L535
However, since the program is seg faulting for you before this line, it never gets to this point.
Thankfully, you do not need to restart the machine every time to detach the XDP program. Instead, you can use the following iproute2 command(s) as root to detach the XDP program from the interface.
# For unloading the XDP program in DRV mode (what it appears you'll want to run).
ip link set <interface> xdpdrv off
# For unloading the XDP program in SKB mode.
ip link set <interface> xdpgeneric off
Would you be able to use the run command in gdb instead of start? It appears to be making a temporary breakpoint for some reason.
Also, can you provide the output of the following commands?
# Kernel version.
uname -r
# OS details.
cat /etc/*-release
# Clang version.
clang --version
# LLC/LLVM version.
llc --version
As for testing on your server, we can do that if you'd like! My Discord tag is christian_#5073. I do want to see if I can replicate it on a VM first once I gather the details from above! Feel free to DM them to me on Discord after adding me if you aren't comfortable sharing them here.
Thank you!
Hello there.
The latest commit fixed the issue. I've seen the changes and those seem pretty minor, but let's guess that the use of MAX_CPUS may have been the solution...
Below you can see the log of the commands should you still need that:
ubuntu@ns3169717:~$ uname -r
5.4.0-90-generic
ubuntu@ns3169717:~$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.3 LTS"
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
ubuntu@ns3169717:~$ clang --version
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
ubuntu@ns3169717:~$ llc --version
LLVM (http://llvm.org/):
LLVM version 10.0.0
Optimized build.
Default target: x86_64-pc-linux-gnu
Host CPU: znver2
Registered Targets:
aarch64 - AArch64 (little endian)
aarch64_32 - AArch64 (little endian ILP32)
aarch64_be - AArch64 (big endian)
amdgcn - AMD GCN GPUs
arm - ARM
arm64 - ARM64 (little endian)
arm64_32 - ARM64 (little endian ILP32)
armeb - ARM (big endian)
avr - Atmel AVR Microcontroller
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
hexagon - Hexagon
lanai - Lanai
mips - MIPS (32-bit big endian)
mips64 - MIPS (64-bit big endian)
mips64el - MIPS (64-bit little endian)
mipsel - MIPS (32-bit little endian)
msp430 - MSP430 [experimental]
nvptx - NVIDIA PTX 32-bit
nvptx64 - NVIDIA PTX 64-bit
ppc32 - PowerPC 32
ppc64 - PowerPC 64
ppc64le - PowerPC 64 LE
r600 - AMD GPUs HD2XXX-HD6XXX
riscv32 - 32-bit RISC-V
riscv64 - 64-bit RISC-V
sparc - Sparc
sparcel - Sparc LE
sparcv9 - Sparc V9
systemz - SystemZ
thumb - Thumb
thumbeb - Thumb (big endian)
wasm32 - WebAssembly 32-bit
wasm64 - WebAssembly 64-bit
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
xcore - XCore
Thanks!
Hello!
After rebooting a dedicated server, I'm getting Error setting rlimit. Do you have any solutions?
Regards