gamemann / XDP-Firewall

A firewall that utilizes the Linux kernel's XDP hook. The XDP hook allows for very fast network processing on Linux systems. This is great for dropping malicious traffic from a (D)DoS attack. IPv6 is supported with this firewall! I hope this helps network engineers/programmers interested in utilizing XDP!
https://deaconn.net/
MIT License

Segmentation fault (core dumped) #10

Closed by OpenSource03 2 years ago

OpenSource03 commented 2 years ago

Hello!

After rebooting a dedicated server, I'm getting "Error setting rlimit." Do you have any solutions?

Regards

OpenSource03 commented 2 years ago

Actually, the issue above was that I tried to run it as a non-root user, lol.

When I tried again on a freshly installed OVH Ryzen 7 with 5.4.0-90-generic, I experienced the following issue:

root@ns3169717:/home/ubuntu# xdpfw
Loaded XDP program in DRV/native mode.
Segmentation fault (core dumped)

Can we find the issue somehow?

gamemann commented 2 years ago

Hey @OpenSource03,

Are you able to run gdb to see if you can retrieve a backtrace (you can install it on Debian/Ubuntu via apt install gdb usually)?

You should be able to run:

gdb xdpfw

Afterwards, you'll be in the GDB interactive terminal. From there, you should be able to execute run to start the program. Once/if the program seg faults, you should be able to see the address it's seg faulting on along with some data. If you can, run bt to retrieve the backtrace and paste the results here.

Can you also provide the contents of /etc/xdpfw/xdpfw.conf?

Thank you!

OpenSource03 commented 2 years ago

Hello! Sorry for the delay.

(gdb) run
Starting program: /usr/bin/xdpfw 
Loaded XDP program in DRV/native mode.

Program received signal SIGSEGV, Segmentation fault.
0x000000000041f93f in main ()

(gdb) bt
#0  0x000000000041f93f in main ()

/etc/xdpfw/xdpfw.conf:

interface = "enp1s0f0";
updatetime = 15;

filters = (
    {
        enabled = true,
        action = 1
    }
);

OpenSource03 commented 2 years ago

Could be due to OVH's very weird network setup? Here's my /etc/netplan/50-cloud-init.yaml

network:
    version: 2
    ethernets:
        enp1s0f0:
            accept-ra: false
            addresses:
            - 2001:41da:800:142e::/56
            dhcp4: true
            gateway6: 2001:41da:800:14ff:ff:ff:ff:ff
            match:
                macaddress: l0:50:99:dt:ae:c3
            nameservers:
                addresses:
                - 2001:41d0:3:163::1
            routes:
            { . . . unimportant }
            set-name: enp1s0f0

^^ Some of the addresses above are faked for security reasons.

Could it be because it's using dhcp4 to retrieve the IPv4 address or something...?

gamemann commented 2 years ago

Hey and no worries! I doubt the Netplan config is the issue here, since the XDP Firewall doesn't try to retrieve anything like IP addresses, gateway IPs, and so on.

I tried installing the XDP Firewall on a vanilla Ubuntu VM with the config you have. However, I did not get a seg fault.

I see it actually loads the XDP program itself using the DRV hook. How long does it take for the program to seg fault afterwards?

gamemann commented 2 years ago

Hey, if you can, could you also change the line here:

https://github.com/gamemann/XDP-Firewall/blob/master/Makefile#L33

To:

$(CC) -O0 -g $(LDFLAGS) $(INCS) -o $(BUILDDIR)/$(XDPFWOUT) $(LIBBPFOBJS) $(OBJS) $(SRCDIR)/$(XDPFWSRC)

The -g flag adds debugging symbols and -O0 disables optimizations, so the backtrace should map to exact source lines. Afterwards, do make && sudo make install and then run gdb again along with retrieving the backtrace via bt.

Thank you!

gamemann commented 2 years ago

Hey @OpenSource03,

I just wanted to follow up to see if you've read my previous replies regarding the segmentation fault issue.

Thank you!

OpenSource03 commented 2 years ago

Hello there.

I'm incredibly sorry for the delay, completely forgot. Take a look at the logs below:

root@ns3169717:/home/ubuntu/XDP-Firewall# gdb xdpfw
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from xdpfw...
(gdb) run
Starting program: /usr/bin/xdpfw 
Loaded XDP program in DRV/native mode.

Program received signal SIGSEGV, Segmentation fault.
0x000000000041f93f in main (argc=1, argv=0x7fffffffe628) at src/xdpfw.c:505
505                     allowed += stats[i].allowed;
(gdb) bt
#0  0x000000000041f93f in main (argc=1, argv=0x7fffffffe628)
    at src/xdpfw.c:505
(gdb) 

Regards

OpenSource03 commented 2 years ago

If you want and have time, I can send you details of the machine so you can test it there.

Should you need that, just let me know about your email/discord or any other communication method you may prefer.

gamemann commented 2 years ago

Hey @OpenSource03! No worries on the delay!

This is very strange behavior. I wasn't able to replicate the seg fault itself in my test environment, but I did push a commit (see it referenced above) that has the stats array allocated with the max CPU limit (256) and then the for loop only loops through the CPU count itself. I also check if the index in the stats array is NULL and if it is, skip that CPU ID and log to stderr.
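To illustrate the change described above, here is a minimal sketch rather than the actual commit. It assumes a stats struct with allowed and dropped counters (the field names and the sum_stats helper are hypothetical), sizes the buffer for MAX_CPUS (256), and only loops over the CPU count it is given. One plausible reason this helps, not confirmed in this thread, is that a per-CPU BPF map lookup fills one entry per possible CPU, which can be more than the number of online CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Upper bound used to size the buffer; 256 is the limit mentioned above. */
#define MAX_CPUS 256

/* Hypothetical per-CPU counters; the real struct may differ. */
struct stats {
    uint64_t allowed;
    uint64_t dropped;
};

/*
 * Adds up the counters of the first cpu_cnt entries.
 * per_cpu must point to at least MAX_CPUS entries.
 * Returns 0 on success and -1 if any pointer is NULL.
 */
static int sum_stats(const struct stats *per_cpu, int cpu_cnt,
                     uint64_t *allowed, uint64_t *dropped)
{
    if (per_cpu == NULL || allowed == NULL || dropped == NULL) {
        fprintf(stderr, "Stats buffer is NULL, skipping.\n");
        return -1;
    }

    /* Never read past the buffer, whatever CPU count is reported. */
    if (cpu_cnt < 0)
        cpu_cnt = 0;
    if (cpu_cnt > MAX_CPUS)
        cpu_cnt = MAX_CPUS;

    *allowed = 0;
    *dropped = 0;

    for (int i = 0; i < cpu_cnt; i++) {
        *allowed += per_cpu[i].allowed;
        *dropped += per_cpu[i].dropped;
    }

    return 0;
}

/* Example: two active CPUs in a buffer sized for MAX_CPUS. */
static void sum_stats_example(void)
{
    static struct stats per_cpu[MAX_CPUS]; /* static storage is zeroed */
    uint64_t allowed = 0, dropped = 0;

    per_cpu[0].allowed = 5;
    per_cpu[1].allowed = 7;
    per_cpu[1].dropped = 2;

    int rc = sum_stats(per_cpu, 2, &allowed, &dropped);
    assert(rc == 0 && allowed == 12 && dropped == 2);
    (void)rc;
}
```

Calling sum_stats_example() from a main function exercises the helper. The point of the design is that the buffer size no longer depends on the CPU count detected at runtime.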

Would you be able to try this commit? You can do git pull to retrieve the latest commit. You shouldn't need to reset anything since I didn't make any changes to the Makefile, but if it complains, you can just do git reset --hard before git pull. If possible, please keep the flags I mentioned above (-g -O0) in the Makefile before compiling and use gdb to launch the program again.

Thank you!

OpenSource03 commented 2 years ago

Hi, take a look at this. Yes, I can run the app one time where I get the error above. Now, if I try to do so again, I'll get the error below. The only way to get it back "working" (or at least to the previous error state) is to reboot the machine.

libbpf: Kernel error message: XDP program already attached
Could not attach with DRV/native mode (Device or resource busy)(-16).
libbpf: Kernel error message: native and generic XDP can't be active at the same time
Could not attach with SKB/generic mode (File exists)(-17).
Error attaching XDP program :: File exists (17)

This also translates to the following in gdb:

(gdb) start
Temporary breakpoint 1 at 0x41eb79: file src/xdpfw.c, line 297.
Starting program: /usr/bin/xdpfw 

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe628) at src/xdpfw.c:297
297         struct cmdline cmd = 
(gdb) bt
#0  main (argc=1, argv=0x7fffffffe628) at src/xdpfw.c:297

The solution from #9 doesn't work in my case. Guess that it's because it's accessing XDP in native mode.

Also, is simply git pull (which btw works) and make && make install enough to process the update? Or something else may be needed as well?

OpenSource03 commented 2 years ago

Also, I'm surprised that it's saying libbpf: Kernel error message: XDP program already attached, as I only ran it in gdb, which said that it "killed" the process when I exited.

gamemann commented 2 years ago

Hey @OpenSource03,

The program itself attaches the XDP/BPF program to the interface. It needs to be cleaned up manually via this line:

https://github.com/gamemann/XDP-Firewall/blob/master/src/xdpfw.c#L535

However, since the program is seg faulting for you before this line, it never gets to this point.
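As a rough illustration of why the cleanup gets skipped, here is a minimal sketch of the usual pattern, assuming the program loops until it receives SIGINT or SIGTERM and detaches afterwards. The names detach_xdp, run_until_stopped, and request_stop are hypothetical stand-ins, not the functions in xdpfw.c.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Cleared by the signal handler to end the main loop. */
static volatile sig_atomic_t keep_running = 1;

/* Records whether the cleanup step ran. */
static int detached = 0;

static void on_stop_signal(int signo)
{
    (void)signo;
    keep_running = 0;
}

/* Hypothetical stand-in for detaching the XDP program from the interface. */
static void detach_xdp(void)
{
    detached = 1;
    puts("XDP program detached.");
}

/*
 * Runs tick() until SIGINT or SIGTERM arrives, then cleans up.
 * If tick() crashes (for example with SIGSEGV), the process dies
 * inside the loop and detach_xdp() is never reached.
 */
static int run_until_stopped(void (*tick)(void))
{
    signal(SIGINT, on_stop_signal);
    signal(SIGTERM, on_stop_signal);

    while (keep_running)
        tick();

    detach_xdp();
    return detached;
}

/* Demo tick that simply asks for a shutdown. */
static void request_stop(void)
{
    raise(SIGTERM);
}

/* Example: the loop exits on SIGTERM and the cleanup runs. */
static void cleanup_example(void)
{
    int ok = run_until_stopped(request_stop);
    assert(ok == 1);
    (void)ok;
}
```

With a normal shutdown signal the loop ends and the detach step runs. A seg fault inside the loop kills the process before that step, which would leave the program attached to the interface.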

Thankfully, you do not need to restart the machine every time to detach the XDP program. Instead, you can use the following iproute2 command(s) as root to detach the XDP program from the interface.

# For unloading the XDP program in DRV mode (what it appears you'll want to run).
ip link set <interface> xdpdrv off

# For unloading the XDP program in SKB mode.
ip link set <interface> xdpgeneric off

Would you be able to use the run command in gdb instead of start? The start command sets a temporary breakpoint at main and stops there, so the program never gets far enough to seg fault.

Also, can you provide the output of the following commands?

# Kernel version.
uname -r  

# OS details.
cat /etc/*-release

# Clang version.
clang --version

# LLC/LLVM version.
llc --version

As for testing on your server, we can do that if you'd like! My Discord tag is christian_#5073. I do want to see if I can replicate it on a VM first once I gather the details from above! Feel free to DM them to me on Discord after adding me if you aren't comfortable sharing them here.

Thank you!

OpenSource03 commented 2 years ago

Hello there.

The latest commit fixed the issue. I've looked at the changes and they seem pretty minor, but I'd guess that the use of MAX_CPUS was the solution...

Below you can see the log of the commands should you still need that:

ubuntu@ns3169717:~$ uname -r 
5.4.0-90-generic
ubuntu@ns3169717:~$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.3 LTS"
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
ubuntu@ns3169717:~$ clang --version
clang version 10.0.0-4ubuntu1 
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
ubuntu@ns3169717:~$ llc --version
LLVM (http://llvm.org/):
  LLVM version 10.0.0

  Optimized build.
  Default target: x86_64-pc-linux-gnu
  Host CPU: znver2

  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_32 - AArch64 (little endian ILP32)
    aarch64_be - AArch64 (big endian)
    amdgcn     - AMD GCN GPUs
    arm        - ARM
    arm64      - ARM64 (little endian)
    arm64_32   - ARM64 (little endian ILP32)
    armeb      - ARM (big endian)
    avr        - Atmel AVR Microcontroller
    bpf        - BPF (host endian)
    bpfeb      - BPF (big endian)
    bpfel      - BPF (little endian)
    hexagon    - Hexagon
    lanai      - Lanai
    mips       - MIPS (32-bit big endian)
    mips64     - MIPS (64-bit big endian)
    mips64el   - MIPS (64-bit little endian)
    mipsel     - MIPS (32-bit little endian)
    msp430     - MSP430 [experimental]
    nvptx      - NVIDIA PTX 32-bit
    nvptx64    - NVIDIA PTX 64-bit
    ppc32      - PowerPC 32
    ppc64      - PowerPC 64
    ppc64le    - PowerPC 64 LE
    r600       - AMD GPUs HD2XXX-HD6XXX
    riscv32    - 32-bit RISC-V
    riscv64    - 64-bit RISC-V
    sparc      - Sparc
    sparcel    - Sparc LE
    sparcv9    - Sparc V9
    systemz    - SystemZ
    thumb      - Thumb
    thumbeb    - Thumb (big endian)
    wasm32     - WebAssembly 32-bit
    wasm64     - WebAssembly 64-bit
    x86        - 32-bit X86: Pentium-Pro and above
    x86-64     - 64-bit X86: EM64T and AMD64
    xcore      - XCore

Thanks!