opnsense / src

OPNsense operating system on top of FreeBSD
https://opnsense.org/

dhclient does not decode 802.1q-encapsulated replies #114

Closed ipartola closed 3 years ago

ipartola commented 3 years ago

My ISP's DHCP server sends DHCPOFFER datagrams inside 802.1q-encapsulated frames with VLAN 0. It seems that OPNsense's dhclient ignores these. I found a similar issue in pfSense (https://redmine.pfsense.org/issues/8526) and a corresponding pull request (https://github.com/pfsense/FreeBSD-src/pull/9). The actual code diff is here: https://github.com/pfsense/FreeBSD-src/pull/9/commits/15051bfd014b2c1e6972a55f58e25cd6907cac8e

I don't see 802.1q handling in https://github.com/opnsense/src/blob/master/sbin/dhclient/bpf.c or https://github.com/opnsense/src/blob/master/sbin/dhclient/packet.c. It seems like this code should be added to OPNsense's dhclient code.

fichtner commented 3 years ago

Hi @ipartola,

VLAN 0 is a Cisco addition for which there is no support in FreeBSD. While there is a patch, it heavily complicates our BPF filter, which already knows how to skip VLAN.

Does your ISP require you to run Cisco gear or can they disable the priority feature?

https://content.cisco.com/chapter.sjs?uri=/searchable/chapter/content/en/us/td/docs/ios-xml/ios/atm/configuration/15-mt/atm-15-mt-book/atm-15-mt-book_chapter_011000.html.xml

TBH, this also breaks QinQ so the question is why is this forced on customers in the first place.

Cheers, Franco

fichtner commented 3 years ago

PS: The VLAN priority reference is here and not to be confused with VLAN 0 hijacking https://github.com/opnsense/src/commit/5e4e4f842b7

ipartola commented 3 years ago

@fichtner I have no Cisco gear, though I suspect they use it on their end. My whole setup consists of an XGS-PON ONT (model FOX222) and an amd64 box with an Intel dual NIC running OPNsense.

The behavior I see is that dhclient sends out a DHCPDISCOVER and the ISP's DHCP server responds with a DHCPOFFER, which dhclient ignores entirely:

01:09:47.918351 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from a0:ce:c8:01:05:8b (oui Unknown), length 300

01:09:47.935354 IP 32.217.174.1.bootps > 32.219.252.142.bootpc: BOOTP/DHCP, Reply, length 300

Looking at tcpdump further I get

11:56:43.560290 a4:7b:2c:29:53:74 > 80:61:5f:08:2d:7a, ethertype 802.1Q (0x8100), length 346: vlan 0, p 7, ethertype IPv4, 32.219.248.1.67 > 32.219.250.238.68: BOOTP/DHCP, Reply, length 300

Which is consistent with the pfSense bug. The only workaround I've found searching the web is to put a managed switch between my ONT and the OPNsense WAN port that would strip the 802.1q tag, but that seems like a hardware solution to a software problem. My old OpenWRT router has no problem with the ISP's setup and gets an IPv4 address immediately.
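To make the capture above concrete: an 802.1Q tag is four extra bytes (the 0x8100 tag protocol identifier plus a 16-bit tag control field) inserted between the source MAC and the real EtherType, so code that assumes the IPv4 header starts at byte 14 misreads a tagged frame. The following is a minimal Python sketch of tag-aware decoding. It is not dhclient's code; `decode_ethernet`, `TAGGED`, and `UNTAGGED` are hypothetical names, and the sample frames reuse only the MAC addresses and tag values from the tcpdump line.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_IPV4 = 0x0800


def decode_ethernet(frame: bytes):
    """Return (vlan_id, priority, ethertype, payload).

    vlan_id and priority are None when the frame carries no 802.1Q tag.
    """
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_VLAN:
        return None, None, ethertype, frame[14:]
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    # Tag control information (priority, DEI, VLAN ID), then the real EtherType.
    tci, inner_ethertype = struct.unpack_from("!HH", frame, 14)
    return tci & 0x0FFF, tci >> 13, inner_ethertype, frame[18:]


# Illustrative frames: MACs taken from the capture, payload is a placeholder.
DST = bytes.fromhex("80615f082d7a")
SRC = bytes.fromhex("a47b2c295374")
TAGGED = DST + SRC + struct.pack("!HHH", ETHERTYPE_VLAN, 0xE000, ETHERTYPE_IPV4) + b"ip-packet"
UNTAGGED = DST + SRC + struct.pack("!H", ETHERTYPE_IPV4) + b"ip-packet"
```

Calling `decode_ethernet(TAGGED)` yields VLAN ID 0 and priority 7, matching the `vlan 0, p 7` that tcpdump prints, while the untagged frame reports no tag at all.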

fichtner commented 3 years ago

IMO just talk to your ISP as they shouldn't force a maybe-VLAN0 on you. You obviously don't need it to communicate with them so they can avoid it too.

VLAN 0 needs support in the kernel in general, not just dhclient. The ISP could encapsulate any traffic it deems priority and you will never see it.

There may be a way with a bridge and VLAN 0 (if 0 is a supported setting), but I would not recommend either way.

Cheers, Franco

fichtner commented 3 years ago

And for further reference a FreeBSD ticket from 2018: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224961

fichtner commented 3 years ago

Ok so this pertains to the read filter, not write filter. I suppose we could patch it in but I don't trust a BPF filter that I haven't written...
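As an illustration of what a VLAN-aware read filter has to decide, here is a Python sketch of the checks such a filter might perform on an incoming frame. It is not the BPF program from the patch and not dhclient's filter: the function names, the choice to accept only VLAN ID 0, and the hand-built sample frames are all assumptions made for the example.

```python
import struct

ETHERTYPE_VLAN = 0x8100
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17


def accepts_dhcp_reply(frame: bytes, client_port: int = 68) -> bool:
    """Accept unfragmented IPv4/UDP frames to client_port, tagged with VLAN 0 or untagged."""
    if len(frame) < 14:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    ip_start = 14
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            return False
        tci, ethertype = struct.unpack_from("!HH", frame, 14)
        if tci & 0x0FFF != 0:  # assumption: only the priority tag (VLAN 0) is skipped
            return False
        ip_start = 18  # everything after the tag shifts by four bytes
    if ethertype != ETHERTYPE_IPV4 or len(frame) < ip_start + 20:
        return False
    header_len = (frame[ip_start] & 0x0F) * 4
    (flags_frag,) = struct.unpack_from("!H", frame, ip_start + 6)
    if header_len < 20 or flags_frag & 0x1FFF:  # reject non-first fragments
        return False
    if frame[ip_start + 9] != IPPROTO_UDP:
        return False
    udp_start = ip_start + header_len
    if len(frame) < udp_start + 8:
        return False
    (dst_port,) = struct.unpack_from("!H", frame, udp_start + 2)
    return dst_port == client_port


def sample_frame(tagged: bool, vlan_id: int = 0, priority: int = 7, dst_port: int = 68) -> bytes:
    """Build a tiny, hypothetical Ethernet/IPv4/UDP frame for trying the filter."""
    eth = bytes.fromhex("80615f082d7a") + bytes.fromhex("a47b2c295374")
    if tagged:
        eth += struct.pack("!HH", ETHERTYPE_VLAN, (priority << 13) | vlan_id)
    eth += struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, IPPROTO_UDP, 0,
                     bytes([32, 219, 248, 1]), bytes([32, 219, 250, 238]))
    udp = struct.pack("!HHHH", 67, dst_port, 8, 0)
    return eth + ip + udp
```

The point of the sketch is that every offset after the Ethernet header moves by four bytes when a tag is present, which is why a filter written only for untagged frames silently drops the reply.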

ipartola commented 3 years ago

@fichtner thank you I appreciate the consideration! I am happy to help test if that would be useful. I will reach out to my ISP but I don't have high hopes for their ability to sort it out based on my experiences with their level 1 & 2 support thus far.

ipartola commented 3 years ago

@fichtner Two things I have learned:

  1. Unofficial reply regarding what my ISP is doing was: "VLAN0 is so 802.1p can be passed". I haven't gotten any other info from them regarding this behavior and honestly don't really understand what they are doing with 802.1p here.

  2. I tried the following patch on 21.1.5 and it worked like a charm: https://github.com/opnsense/src/commit/2d00172b5de7e1e9359cb37a913dde927f2507e5. If this looks good to you, I'm happy to make a pull request.
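On the first point, a short sketch may help explain "VLAN0 is so 802.1p can be passed". The 802.1p priority lives in the top three bits of the 802.1Q tag control field, so a frame can only carry a priority if it carries a tag, and VLAN ID 0 is the value that means "priority only, no VLAN membership". The Python below is a hedged illustration with hypothetical helper names, not code from any of the patches.

```python
def pack_tci(priority: int, dei: int, vlan_id: int) -> int:
    """Combine 3-bit priority (802.1p), 1-bit DEI and 12-bit VLAN ID into a 16-bit TCI."""
    if not (0 <= priority <= 7 and dei in (0, 1) and 0 <= vlan_id <= 0x0FFF):
        raise ValueError("field out of range")
    return (priority << 13) | (dei << 12) | vlan_id


def unpack_tci(tci: int):
    """Split a 16-bit TCI into (priority, dei, vlan_id)."""
    return tci >> 13, (tci >> 12) & 1, tci & 0x0FFF


def is_priority_tagged(tci: int) -> bool:
    """True when the tag carries only a priority, i.e. the VLAN ID is 0."""
    return tci & 0x0FFF == 0
```

For the capture earlier in this thread (`vlan 0, p 7`), `pack_tci(7, 0, 0)` gives `0xE000`, and `is_priority_tagged(0xE000)` is true.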

fichtner commented 3 years ago

@ipartola alright, I tried to remove some fluff from the original patch... can you try this one instead? bf0e9caf2a47f

I could provide a build too. It looks like it doesn't break the non-VLAN case, but I cannot easily test the VLAN-0 case.

Cheers, Franco

ipartola commented 3 years ago

@fichtner yes that works! Thank you!

fichtner commented 3 years ago

@ipartola hooray :) can you also try cbba3c3222fc26 on top... it's trying to clean up a few artefacts from when OpenBSD imported this in 2004 and nobody got rid of it apparently.

Cheers, Franco

ipartola commented 3 years ago

@fichtner Yes, that works as well!

fichtner commented 3 years ago

Ok, I'm expecting these to land in 21.7-RC1. This is a bit too dangerous for 21.1.x and we are only one month away from the RC1 anyway. Close then? :)

ipartola commented 3 years ago

That works for me. I am less affected since I can use the patched version so I can wait until whenever :).

Thanks again for fixing this!

fichtner commented 3 years ago

Thanks for bringing this to our attention in the first place. ❤️

lattera commented 3 years ago

Is this something that could be upstreamed?

fichtner commented 3 years ago

If someone dares to review and accept it in phabricator, possibly yes.

michaellacroix commented 3 years ago

> @fichtner yes that works! Thank you!

How can I add this into my installation of pfsense? Thanks (Sorry I'm a noob in the github world)

fichtner commented 3 years ago

I’m not sure which patches pfSense uses but you can try sbin/dhclient from our Snapshot build (FreeBSD 12) https://pkg.opnsense.org/FreeBSD:12:amd64/snapshots/sets/base-21.7.b-amd64.txz

michaellacroix commented 3 years ago

Thanks so much,

Is this for a FreeBSD 12.2 base? When I try to install the package I get a “base-21.7.b-amd64.txz is not a valid package: no manifest found”.

I’m sure I’m doing something wrong.

Thanks, Mike

michaellacroix commented 3 years ago

I'm also having difficulty installing this on my opnsense install. Any help would be great. Thanks again.

michaellacroix commented 3 years ago

Have you seen this: https://github.com/pfsense/FreeBSD-src/tree/15051bfd014b2c1e6972a55f58e25cd6907cac8e

ipartola commented 3 years ago

@michaellacroix

> How can I add this into my installation of pfsense? Thanks (Sorry I'm a noob in the github world)

This is probably not ideal to discuss on the OPNSense GitHub issues when it has to do with a completely separate project, and as I used OPNSense I can't tell you whether this will even work, but what I did that worked for me was:

  1. Clone the OPNsense repository to my OPNsense box: git clone [GIT URL]
  2. Pull in the patched files into sbin/dhclient either with a text editor or just downloading them directly
  3. make && make install
  4. Release and renew your IP in the web UI

Note that this will only work to try the patched code since it'll be overwritten when you run updates.

fichtner commented 3 years ago

On OPNsense it's easy to install:

# opnsense-update -zbkr 21.7.b
# opnsense-shell reboot

Done :)

michaellacroix commented 3 years ago

Thanks so much fichtner!! And thank you also ipartola!

Works great for opnsense. Unfortunately I could not get this to work on pfsense as well. Pfsense does not have a compiler to use on it. Unless I can somehow use the opnsense package on pfsense?

fichtner commented 3 years ago

@michaellacroix Something like this probably:

# fetch https://pkg.opnsense.org/FreeBSD:12:amd64/snapshots/sets/base-21.7.b-amd64.txz
# tar vxf base-21.7.b-amd64.txz ./sbin/dhclient
# mv /sbin/dhclient /sbin/dhclient.orig
# cp ./sbin/dhclient /sbin/dhclient

Thanks for testing ❤️

michaellacroix commented 3 years ago

I'm very happy to test. Unfortunately I hit a snag: [image]

I can use winscp to copy file, will that work?

fichtner commented 3 years ago

I did not expect you would run this from the / root directory. In that case just run the tar command again and it will extract to /sbin/dhclient directly. Please note that /sbin/dhclient.orig is not the original due to this.

Cheers, Franco

michaellacroix commented 3 years ago

Thanks so much fichtner. Unfortunately it did not work. If you have any other suggestions I would be happy to test.

fichtner commented 3 years ago

If you have an error I am happy to help, otherwise this is too vague to take a random guess.

Cheers, Franco

michaellacroix commented 3 years ago

Of course, I'll provide some cap files when I can. There are no error messages; it is just unable to obtain an IP address from DHCP. Thanks, Mike

fichtner commented 3 years ago

There's no use in running pcaps since we know the dhclient binary works, which points to an issue with shared libraries (no idea if 2.6.x uses FreeBSD 12.2 or not). You can test with:

# ldd /sbin/dhclient

Otherwise it's a configuration error, should pfSense use other syntax that is not included in FreeBSD 12.1.

That's all it could be really. :)

Cheers, Franco

heyhewmike commented 2 years ago

Hello, I am presently on OPNsense 22.1.7 and unable to get an IP address from my ISP when connected to an XGS-PON ONT (model FOX222). My ISP appears to be sending 802.1Q VLAN packets with an ID of 0 (zero). [image]

Has this fix been implemented in Release 22.1.7?

fichtner commented 2 years ago

Yes. It’s also included in FreeBSD 13.1 now.

heyhewmike commented 2 years ago

> Yes. It’s also included in FreeBSD 13.1 now.

Thank you.

janstadt commented 2 months ago

Does anyone using Frontier Fiber know how to get this all to work? I have an ONT and a Frontier-provided router, and I want to get rid of the router and go directly from the ONT to OPNsense. Does anyone know if this works? I tried and wasn't successful, but I don't know if it's due to this specific issue or not. How would one go about determining this?

fichtner commented 2 months ago

You need to capture the packets in front of the frontier provided router to see what it's doing. Not sure if related to this or not. It differs from provider to provider.

janstadt commented 2 months ago

Thanks @fichtner. So Wireshark on some machine connected directly to the ONT would tell me what I need? I can't believe how these providers are doing this all. There shouldn't be any reason to have 2 boxes (ONT and their provided router) in front of my stuff just to make things work. This is me venting about the industry and nothing directed at you or OPNsense. I called their tech support and they were absolutely useless (basically what I expected). There's got to be someone else out there with this setup who can hopefully provide me with a decent writeup or some link that explains what I need to do.

ipartola commented 2 months ago

@janstadt This setup currently works for me with stock opnsense 24.1. I guess the patch was included in FreeBSD sometime in 2022. In theory you shouldn't need anything besides just connecting your box to the ONT and firing up the DHCP client. In practice your issue could be anything ranging from plugging in the wrong port or using a bad cable to funky firewall rules.

It's been a few years since I had to deal with this, thank Zeus, but basically what I did was SSH to my OPNsense box and run tcpdump listening on port 67. I would stop the dhclient service and then run it from a separate shell so I would only get my specific request traffic. What I saw was that the response from Frontier's DHCP server was tagged with VLAN 0 and OPNsense's dhclient was not responding to it, which led me down the rabbit hole with the similar issue in pfSense, etc.

I would direct you to https://forum.opnsense.org/index.php?board=1.0 for general support if this isn't your issue, folks there are very helpful.

janstadt commented 2 months ago

Thanks @ipartola. I just commented on a Frontier post in the OPNsense forums and will see if that gets me anywhere. I am certain the ports and cables are fine. Firewall might be another thing. I have a few rules, but most are vanilla as well. I've also enabled/disabled crowdsec without any success. Do you have Frontier as a provider as well? I'll keep digging. I'm sure there's an answer out there.

ipartola commented 2 months ago

@janstadt I do have Frontier in CT, same setup as when I originally opened the ticket and everything does work for me with no issues. My guess is that this isn't your issue but try to capture some traffic to confirm what's going on. The command I used:

tcpdump -len -i igb0 | grep BOOTP

where igb0 is my WAN interface name.
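To answer the "how would one determine this" question from such a capture, the thing to look for is whether the DHCP reply lines carry an `ethertype 802.1Q (0x8100)` header with `vlan 0`. Below is a small Python sketch that picks this out of `tcpdump -e` text output. The function names are hypothetical, and the regular expression is only matched against the line format shown earlier in this thread, so treat it as an assumption rather than a general tcpdump parser.

```python
import re

# Matches the link-level part tcpdump -e prints for tagged frames, e.g.
# "ethertype 802.1Q (0x8100), length 346: vlan 0, p 7, ..."
VLAN_RE = re.compile(r"ethertype 802\.1Q \(0x8100\).*?vlan (\d+), p (\d+)")


def vlan_tag(line: str):
    """Return (vlan_id, priority) for a tagged line, or None if no tag is shown."""
    match = VLAN_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def dhcp_frames(lines):
    """Yield (tag, line) for every BOOTP/DHCP line in an iterable of tcpdump output."""
    for line in lines:
        if "BOOTP/DHCP" in line:
            yield vlan_tag(line), line.rstrip("\n")
```

One way to use it would be to feed it `sys.stdin` while piping tcpdump into the script: a reply reported with tag `(0, 7)` is the VLAN 0 case from this issue, while `None` means tcpdump showed no 802.1Q header for that line.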

heyhewmike commented 2 months ago

@janstadt

I had this same issue Spring 2022 when I had a pcengines board as my SOC for OPNSense.

I never had it working correctly with Frontier in CT and had no support from Frontier. I saw it was because of how Frontier had been dealing with their DHCP server requests and responses and VLAN tagging.

I personally gave up on Frontier and returned to Comcast without any issues.

I am now running a Dell XPS for OPNSense and not sure I will try going back to Frontier.