inverse-inc / packetfence

PacketFence is a fully supported, trusted, Free and Open Source network access control (NAC) solution. Boasting an impressive feature set including a captive portal for registration and remediation, centralized wired and wireless management, powerful BYOD management options, 802.1X support and layer-2 isolation of problematic devices, PacketFence can be used to effectively secure networks, from small to very large heterogeneous ones.
https://packetfence.org
GNU General Public License v2.0

PF 13.2 - Relayed DHCP does not complete. Final ACK is sent to source IP of REQ #8281

bbs2web closed this issue 1 month ago

bbs2web commented 2 months ago

Hi,

We're implementing a routed PacketFence setup where clients are dropped into various VLANs via RADIUS messaging. The user networks (e.g. corporate, BYOD, guest) have their DHCP answers provided directly by the routers at each site, while the registration/isolation networks have DHCP requests relayed to PacketFence's registration or isolation interfaces.

The routers use VRFs, so by default they generate the DHCP relay requests from their upstream management source IP, but a mangle firewall rule catches these packets and forces them into the client's VRF routing table. The packets subsequently arrive on the correct PF interfaces, and the DHCP service on PacketFence correctly sends its offer to the relay address carried in the packet content (giaddr) rather than to the packet's source address. The behaviour changes, however, when the client then requests the offered IP: the ACK is sent to the source IP of the packet instead of the relay address.
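
To make the expected behaviour concrete, here is a minimal Python sketch of how a DHCP server could choose where to send a reply to a relayed packet. It is not pfdhcp's code (pfdhcp is written in Go); the function `reply_destination`, its parameters and the management address 192.0.2.1 are hypothetical and exist only for illustration. It follows the general rule from RFC 2131 that a reply to a packet with a non-zero giaddr goes to the relay agent at giaddr, and the point is that the same rule should apply to the OFFER and to the ACK.

```python
from ipaddress import IPv4Address

# A giaddr of 0.0.0.0 means the packet did not pass through a relay agent.
UNSET = IPv4Address("0.0.0.0")


def reply_destination(giaddr: str, packet_src: str, prefer_giaddr: bool = True) -> str:
    """Return the IP a reply to a DHCP packet should be sent to.

    giaddr        -- relay agent address carried inside the DHCP payload
    packet_src    -- source IP of the UDP packet as seen by the server
    prefer_giaddr -- hypothetical switch standing in for a setting such as
                     dhcp_reply_ip=giaddr
    """
    relay = IPv4Address(giaddr)
    if prefer_giaddr and relay != UNSET:
        return str(relay)
    return str(IPv4Address(packet_src))


# The relay's packets leave from a management IP (hypothetical 192.0.2.1),
# but both message types should be answered to the giaddr.
for message in ("DHCPOFFER", "DHCPACK"):
    print(message, "->", reply_destination("192.168.140.1", "192.0.2.1"))
```

In the situation described above, the OFFER behaves like this sketch while the ACK behaves as if `prefer_giaddr` were false.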

I hope the following simple ASCII diagram explains the topology:

Client
   |
   |
Remote Router
   |-------------------------------------|
   |                                     |
Management                     Corporate Network (VRF)
                                         |
                                         |
                         Core Router ----|
                             |
                             | 
                 PacketFence's registration interface

The last change I could find in the code base, relating to the giaddr or src IP of the packet being used in DHCP responses, was the following patch in April of this year: https://github.com/inverse-inc/packetfence/pull/8005/commits/468b2fd964e567a415e946f17c469f14b2a26c41

It doesn't appear to be directly related, but perhaps something in it is?

fdurand commented 2 months ago

It should be addressed by this PR: https://github.com/inverse-inc/packetfence/pull/8005/files. You should be able to select which IP the DHCP server replies to.

[image]

bbs2web commented 2 months ago

This is working with some older PF deployments, but 13.2 with all updates installed exhibits problems. Checking for example '/usr/local/pf/lib/pf/constants/dhcp.pm' shows it including the changes from the referenced patch.

Please note that the offer is correctly sent to the giaddr, just not the ack in response to the subsequent request.

Herewith a snippet from the UI, showing 'dhcp_reply_ip=giaddr' having been set:

[image]

bbs2web commented 1 month ago

Herewith the sections of the networks.conf file:

The registration interface on the PacketFence instance:

[192.168.142.0]
dhcp_default_lease_time=30
type=vlan-registration
nat_dns=disabled
dhcp_end=192.168.142.118
named=enabled
pool_backend=memory
netmask=255.255.255.128
domain-name=vlan-registration.redacted.com
fake_mac_enabled=disabled
netflow_accounting_enabled=disabled
dhcp_max_lease_time=30
dns=192.168.142.2
dhcp_start=192.168.142.10
nat_enabled=disabled
gateway=192.168.142.2
dhcpd=enabled
coa=disabled
split_network=disabled

The routed network definition, containing dhcp_reply_ip=giaddr:

[192.168.140.0]
next_hop=192.168.142.1
network=192.168.140.0
netmask=255.255.255.128
gateway=192.168.140.1
domain-name=vlan-registration.redacted.com
dns=192.168.142.2
dhcpd=enabled
dhcp_start=192.168.140.4
dhcp_end=192.168.140.126
dhcp_default_lease_time=30
dhcp_max_lease_time=30
type=vlan-registration
fake_mac_enabled=0
named=enabled
nat_enabled=0
netflow_accounting_enabled=disabled
nat_dns=1
pool_backend=memory
dhcp_reply_ip=giaddr
coa=disabled
algorithm=1
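
For anyone wanting to check the same setting on their own system, the following is a small Python sketch that reads networks.conf-style text and reports the dhcp_reply_ip value of each routed network. It is only an illustration: the helper `routed_networks` is hypothetical, the sample is a trimmed copy of the two sections above, and treating "has a next_hop key" as the marker of a routed network is an assumption based on those two sections, not on PacketFence's own logic.

```python
import configparser

# Trimmed copy of the two sections shown above.
SAMPLE = """
[192.168.142.0]
type=vlan-registration
gateway=192.168.142.2
dhcpd=enabled

[192.168.140.0]
type=vlan-registration
next_hop=192.168.142.1
gateway=192.168.140.1
dhcpd=enabled
dhcp_reply_ip=giaddr
"""


def routed_networks(text):
    """Map each section that has a next_hop key to its dhcp_reply_ip value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    result = {}
    for name in parser.sections():
        section = parser[name]
        # Assumption: only routed networks carry a next_hop entry.
        if "next_hop" in section:
            result[name] = section.get("dhcp_reply_ip", "(not set)")
    return result


print(routed_networks(SAMPLE))  # {'192.168.140.0': 'giaddr'}
```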

fdurand commented 1 month ago

OK, I think I see the issue. Let me work on that and provide a patch.

fdurand commented 1 month ago

Would it be possible to try this on your setup?

cd /usr/local/pf
curl https://github.com/inverse-inc/packetfence/compare/fix/pfdhcp_giaddr.diff | patch -p1 --dry-run
If there are no conflicts:
curl https://github.com/inverse-inc/packetfence/compare/fix/pfdhcp_giaddr.diff | patch -p1
cd /usr/local/pf
make go-env
source ~/.bashrc
make pfdhcp
systemctl stop packetfence-pfdhcp
./pfdhcp

Let me know if it works. By the way, I added logging, and you can follow it with journalctl -f | grep pfdhcp

bbs2web commented 1 month ago

That works perfectly, although I had to append '/go' to the second cd /usr/local/pf command. Herewith the debug logs when I stop the service and run the custom compiled pfdhcp binary:

Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=dbug msg="de:ad:be:ef:de:ad Discover xID bf:1d:6a:63" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=info msg="DHCPDISCOVER from de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=dbug msg="Search in the cache if an IP has already been assigned" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=dbug msg="Not Found in the cache that a IP has already been assigned" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=dbug msg="Search if there is still available IP in the pool" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=dbug msg="Still available IP in the pool" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:57 packetfence pfdhcp[414171]: t=2024-09-11T22:00:57+0200 lvl=dbug msg="Grabbing next available IP" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="de:ad:be:ef:de:ad Discover xID bf:1d:6a:63" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPDISCOVER from de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Search in the cache if an IP has already been assigned" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Found in the cache that a IP has already been assigned" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPOFFER on 192.168.140.108 to de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="L3 - reply to giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPOFFER on 192.168.140.108 to de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="L3 - reply to giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="de:ad:be:ef:de:ad Request xID bf:1d:6a:63" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPREQUEST for 192.168.140.108 from de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="sql: no rows in result set" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="sql: no rows in result set" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPACK on 192.168.140.108 to de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="L3 - reply to giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:07 packetfence pfdhcp[414171]: t=2024-09-11T22:01:07+0200 lvl=dbug msg="de:ad:be:ef:de:ad Release xID bf:1d:6a:63" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:07 packetfence pfdhcp[414171]: t=2024-09-11T22:01:07+0200 lvl=info msg="DHCPRELEASE for 192.168.140.108 from de:ad:be:ef:de:ad ()" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:07 packetfence pfdhcp[414171]: t=2024-09-11T22:01:07+0200 lvl=dbug msg="DHCPRELEASE Found the ip 192.168.140.108 in the cache" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:07 packetfence pfdhcp[414171]: t=2024-09-11T22:01:07+0200 lvl=info msg="Temporarily declaring 192.168.140.108 as unusable" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:07 packetfence pfdhcp[414171]: t=2024-09-11T22:01:07+0200 lvl=info msg="DHCPRELEASE of 192.168.140.108 from de:ad:be:ef:de:ad" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:07 packetfence pfdhcp[414171]: t=2024-09-11T22:01:07+0200 lvl=info msg="de:ad:be:ef:de:ad 192.168.140.108 Added back in the pool none on index 104" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:01:37 packetfence pfdhcp[414171]: t=2024-09-11T22:01:37+0200 lvl=info msg="Releasing previously released IP 192.168.140.108 back into the pool" pid=414171 mac=de:ad:be:ef:de:ad

I've scrubbed the above (replacing the client hostname and MAC) and also removed numerous log entries such as these:

pfdhcp[414171]: t=2024-09-11T22:01:38+0200 lvl=dbug msg="Pinged DB" pid=414171 mac=de:ad:be:ef:de:ad
pfdhcp[414171]: t=2024-09-11T22:01:46+0200 lvl=dbug msg="Resource is not valid anymore. Was loaded at 0001-01-01 00:00:00 +0000 UTC" pid=414171 mac=de:ad:be:ef:de:ad PfconfigObject=hash_element|resource::clusters_hostname_map();packetfence
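
A log like the one above can also be checked mechanically. The Python sketch below pairs each DHCPOFFER and DHCPACK entry with the next "L3 - reply to giaddr" entry, so it is easy to see whether the ACK is answered to the relay as well. The helper `reply_targets` is hypothetical, and the message strings it matches are simply those visible in the log above; other pfdhcp builds may word them differently.

```python
import re

# Extract the quoted msg="..." field from a pfdhcp log line.
MSG = re.compile(r'msg="([^"]*)"')

# A few entries copied from the log above.
LINES = """\
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPOFFER on 192.168.140.108 to de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="L3 - reply to giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPREQUEST for 192.168.140.108 from de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=info msg="DHCPACK on 192.168.140.108 to de:ad:be:ef:de:ad (Redacted)" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="Giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
Sep 11 22:00:58 packetfence pfdhcp[414171]: t=2024-09-11T22:00:58+0200 lvl=dbug msg="L3 - reply to giaddr 192.168.140.1" pid=414171 mac=de:ad:be:ef:de:ad
""".splitlines()


def reply_targets(lines):
    """Pair each DHCPOFFER/DHCPACK with the following 'reply to giaddr' IP."""
    pairs = []
    pending = None
    for line in lines:
        match = MSG.search(line)
        if not match:
            continue
        msg = match.group(1)
        if msg.startswith(("DHCPOFFER", "DHCPACK")):
            pending = msg.split()[0]
        elif msg.startswith("L3 - reply to giaddr") and pending:
            pairs.append((pending, msg.rsplit(" ", 1)[1]))
            pending = None
    return pairs


print(reply_targets(LINES))
# [('DHCPOFFER', '192.168.140.1'), ('DHCPACK', '192.168.140.1')]
```

With the patched binary both message types are paired with the giaddr, which matches the working behaviour reported here.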

Stopping the custom binary and restarting the service brings back the broken behaviour. Can I simply swap out the pfdhcp binary, or will this interfere with a future patch?

[root@packetfence pf]# cd /usr/local/pf
[root@packetfence pf]# find . -name pfdhcp -type f -print0 | xargs -r0 ls -l
-rwxr-xr-x 1 root root 10553105 Sep 11 21:40 ./go/pfdhcp
-rwxr-xr-x 1 pf   pf    7153912 Aug 23 13:54 ./sbin/pfdhcp

PS: Many thanks for your assistance!

fdurand commented 1 month ago

Yes, you can just swap the pfdhcp binary and restart packetfence-pfdhcp. I will open the PR and backport it to the maintenance branch.