dfskoll / rp-pppoe

Public repository for RP-PPPoE PPPoE client and server software
https://dianne.skoll.ca/projects/rp-pppoe/

PPPoE server IPv6 readiness? #37

Closed marekm72 closed 1 week ago

marekm72 commented 1 week ago

More of a question/discussion than an issue - I'm looking for alternatives to accel-ppp as a PPPoE server for a small local ISP (a one-person business with just a few hundred customers and not growing much, so I can't really afford any commercial solutions).

While accel-ppp mostly works for me, it's quite complex and tries to do everything: not just a PPPoE server but also L2TP/PPTP/SSTP/IPoE.

What I need:

MikroTik handled IPv4 fine but has no proper IPv6 PD, which was the main reason for moving to VyOS + accel-ppp. But accel-ppp has trouble with some routers that don't support IPv6 properly but try to negotiate it anyway and then fail. I had to work around it by blocking IPv6CP at layer 2 for certain MAC OUI prefixes of the broken routers.

Long ago there was a commercial PPPoE server advertised in the rp-pppoe documentation. I'm not sure what its current status is: is it still proprietary, or has it been open sourced?
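As an illustration of the layer 2 workaround described above, here is a minimal Python sketch of the match such a filter would perform. It assumes untagged Ethernet frames (no VLAN header); `BLOCKED_OUIS` and `should_drop` are hypothetical names, and the OUI shown is a placeholder, not that of any real router.

```python
import struct

ETHERTYPE_PPPOE_SESSION = 0x8864  # PPPoE session stage
PPP_PROTO_IPV6CP = 0x8057         # PPP protocol number of IPv6CP

# Hypothetical blocklist: first three bytes of the source MAC address.
BLOCKED_OUIS = {bytes.fromhex("001122")}


def should_drop(frame: bytes) -> bool:
    """Return True for IPv6CP frames sent by a blocklisted OUI."""
    # 14 bytes Ethernet + 6 bytes PPPoE header + 2 bytes PPP protocol.
    if len(frame) < 22:
        return False
    src_mac = frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_PPPOE_SESSION:
        return False
    (ppp_proto,) = struct.unpack("!H", frame[20:22])
    return ppp_proto == PPP_PROTO_IPV6CP and src_mac[:3] in BLOCKED_OUIS
```

IPv4 negotiation (IPCP, protocol 0x8021) from the same routers is left alone, which matches the goal of keeping them working as IPv4-only clients.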

jkroonza commented 1 week ago

There is nothing required on the pppoe side to deal with IPv6. It doesn't even deal with IPv4. That's all handled by the pppd component that gets fired up by pppoe-server.

You can certainly run ospfv3 on top of that with some trickery. We use frr, and whilst we don't use ospfv3 here (since generally on the other side of that link is an "untrusted" customer), we do use frr to add and remove ipv6 static routes based on information from Radius.

I personally could not find an ipv6 dhcp server, and kea doesn't work in this use-case, so I started (but never got to a production point) on a dhcpv6 proxy. The intent was (is) to be able to dynamically add/remove devices, then proxy to some other DHCPv6 server, forward the response and add the relevant routing (https://github.com/jkroonza/dhcpv6-relay). I don't think it's well documented, and most (if not all) of the design is in my head too, given that I was on a path of "figuring it out as I go".

pppoe-server has dropped user-mode support completely, so without kernel mode you're dead in the water. For larger-scale deployments, I suggest you ensure your kernel has the CONFIG_PPPOE_HASH_BITS option, which lets you increase the hash table from the pre-patch fixed 4 bits (16 buckets) to 8 bits (256 buckets). That massively reduces CPU usage even in kernel mode, at very minor cost. For us at least, that hash probably needs to be redone at some point to allow for an arbitrary number of hash bits, but for now 8 (the maximum the underlying hash supports, and a power of two) is sufficient. Simply expanding the hash and increasing it to 16 bits would be a waste of memory.

You're going to struggle to get away completely from all the various bugginess. Basically pppd just needs to negotiate LL for IPv6, then other mechanisms take over. So if those routers are failing on IPv6CP they are VERY BADLY BROKEN, since the only thing that really gets negotiated is the Link-Local addresses. From there on it's all higher-layer protocols like DHCPv6.

I'm not aware of a mechanism that will load the IPv6 PD from radius into DHCPv6, but since this is exactly what accel-ppp does, it has to be possible. I'm betting (but never confirmed) they're using radvd in some way, combined with reconfiguration of the dhcpv6 server on the fly.
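To make the effect of the hash width concrete, here is a rough Python sketch of the arithmetic. It assumes a table of 2^bits buckets and a hash that spreads sessions evenly; the session count of 4096 is an arbitrary illustration, and `average_chain_length` is a hypothetical helper, not kernel code.

```python
def average_chain_length(sessions: int, hash_bits: int) -> float:
    """Average sessions per bucket, assuming an evenly spreading hash."""
    buckets = 1 << hash_bits  # a hash of n bits addresses 2**n buckets
    return sessions / buckets


# Each session lookup walks one bucket's chain, so shorter chains
# mean less CPU time spent per packet.
for bits in (4, 8):
    print(bits, 1 << bits, average_chain_length(4096, bits))
```

With 4096 sessions this gives chains of 256 entries at 4 bits and 16 entries at 8 bits, which is where the CPU saving comes from.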
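For reference, the link-local address follows directly from the 64-bit interface identifier that IPv6CP negotiates. A minimal Python sketch, where `link_local_from_identifier` is a hypothetical helper name and the identifier values are made up:

```python
import ipaddress

LINK_LOCAL_BASE = int(ipaddress.IPv6Address("fe80::"))


def link_local_from_identifier(identifier: int) -> ipaddress.IPv6Interface:
    """Combine fe80::/64 with a 64-bit interface identifier."""
    # Zero is rejected here because it does not identify an interface.
    if not 0 < identifier < (1 << 64):
        raise ValueError("identifier must be a non-zero 64-bit value")
    address = ipaddress.IPv6Address(LINK_LOCAL_BASE | identifier)
    return ipaddress.IPv6Interface(f"{address}/64")


print(link_local_from_identifier(0x1))
print(link_local_from_identifier(0x021122FFFE334455))
```

Nothing else about IPv6 addressing is decided at this stage. Global addresses and delegated prefixes come later, from protocols running over the link.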

Sorry, that's the full scope of my knowledge on the subject for you.

dfskoll commented 1 week ago

Just to add to what Jaco wrote...

While pppoe-server does have some options to set an IPv4 address pool (with addresses from the pool being passed along to the underlying pppd process with the local_ip:remote_ip option), it doesn't have support for a similar IPv6 address pool. That would not be hard to add, since pppd does support an ipv6 local,remote option. For now, you have to assign IPv6 addresses some other way, probably via a RADIUS server and the pppd RADIUS plugin.

Regards,

Dianne.

jkroonza commented 1 week ago

@dfskoll I think it would be completely wrong to do so.

ipv6 <local_interface_identifier>,<remote_interface_identifier>

Those are 64-bit identifiers, and in the usual case they consume the interface number from the kernel (if I recall correctly) to create a LL address of the form fe80::${identifier}/64. The value of these does not matter; they are used for exchanging link-specific IPv6 traffic, usually:

For normal IPv6 operation on ppp you don't even need global addresses on the link itself. It's possible to assign them by way of RADVD or FRR (probably amongst others) on the "server" side, and this is workable, but generally not particularly useful. As such, DHCPv6 (which is based on multicast) is generally used to delegate a prefix, typically a /56 or /48. Since any LIR gets a /32, even a /48 (64k networks on the recipient side) is perfectly workable: that allows you as an ISP to provision 64k clients. That said, for the time being we're aiming to push /56s, which still allows 256 networks internally to the client. If we have a client with greater need than that, chances are they also really need those ranges to be perfectly static, in which case we'll happily delegate a /48 to them via Radius. Somehow we will need to integrate that with some form of DHCPv6, more so than a raw proxy, or alternatively handle it by way of static route addition using the mechanism we're using currently.
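The prefix arithmetic above can be checked with Python's standard `ipaddress` module. This is only a sketch: 2001:db8::/32 is the documentation prefix standing in for a real LIR allocation, and `delegation_count` is a hypothetical helper name.

```python
import ipaddress


def delegation_count(parent: str, delegated_len: int) -> int:
    """Number of prefixes of length delegated_len inside parent."""
    network = ipaddress.IPv6Network(parent)
    return 1 << (delegated_len - network.prefixlen)


allocation = ipaddress.IPv6Network("2001:db8::/32")
first_48 = next(allocation.subnets(new_prefix=48))
first_56 = next(allocation.subnets(new_prefix=56))

print(delegation_count(str(allocation), 48))  # /48 delegations per /32
print(delegation_count(str(allocation), 56))  # /56 delegations per /32
print(delegation_count(str(first_48), 64))    # /64 networks in a /48
print(delegation_count(str(first_56), 64))    # /64 networks in a /56
```

A /32 holds 65536 delegations of /48 or 16777216 of /56, and each /56 still leaves the client 256 networks of /64.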

dfskoll commented 1 week ago

> @dfskoll I think it would be completely wrong to do so.

Oh yeah, I never said it was a good idea. 🙂 Just that it was possible.

jkroonza commented 1 week ago

> > @dfskoll I think it would be completely wrong to do so.

> Oh yeah, I never said it was a good idea. 🙂 Just that it was possible.

Fair enough, my bad for misinterpreting.

marekm72 commented 1 week ago

Thanks for the responses. So the prefix delegation part is difficult: accel-ppp comes with its own built-in dhcpv6 to handle that, and there seems to be no easy replacement for it. Yes, the offending routers (Phicomm KE2M) are badly broken, in such an interesting way that they don't even advertise any IPv6 support anywhere on the box or in the web GUI. They work fine as IPv4-only PPPoE clients, but break when optional IPv6 is enabled on the PPPoE server. It came as a surprise, as it's actually a combination of two bugs where each one separately would be harmless: routers with broken IPv6 negotiation, and accel-ppp not following the Implementation Note in RFC 1661 section 3.7. So the best options I have are to understand the accel-ppp code (fairly complex, but I'm not giving up yet) well enough to fix the bug, or to send the broken routers to e-waste and replace them with working ones.

jkroonza commented 1 week ago

@marekm72 I can't really help you beyond describing how I would approach it:

tcpdump, tcpdump, tcpdump.

I promise you they had something working somewhere, or else they would not have released it. If you're not familiar with IPv6 address provisioning, things get difficult, so I'd figure out where it first breaks. Checking whether IPv6CP completes (i.e., LL gets assigned) is step one.

If that works, you want to know if it's soliciting RAs, in which case you need to feed it RAs which indicate that it should use managed configuration, but I've never needed this.

If you're seeing DHCPv6 packets, you need to decode them (tcpdump) and figure out what the router is asking for, and arrange for that to be delivered.
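As a rough aid for that decoding step, here is a Python sketch that lists which options a DHCPv6 message carries, given the raw UDP payload taken from a capture. The function name `summarize_dhcpv6` is hypothetical, only a handful of message types and option codes are named, and it is a reading aid rather than a complete parser.

```python
import struct
from typing import List, Tuple

MESSAGE_TYPES = {1: "SOLICIT", 2: "ADVERTISE", 3: "REQUEST", 7: "REPLY"}
OPTION_NAMES = {1: "CLIENTID", 2: "SERVERID", 3: "IA_NA", 6: "ORO", 25: "IA_PD"}


def summarize_dhcpv6(payload: bytes) -> Tuple[str, List[str]]:
    """Return the message type and the names of the top-level options."""
    if len(payload) < 4:
        raise ValueError("DHCPv6 header is 4 bytes")
    message = MESSAGE_TYPES.get(payload[0], f"type-{payload[0]}")
    options = []
    offset = 4  # 1 byte message type + 3 bytes transaction ID
    while offset + 4 <= len(payload):
        code, length = struct.unpack_from("!HH", payload, offset)
        end = offset + 4 + length
        if end > len(payload):
            break  # truncated option, stop rather than guess
        options.append(OPTION_NAMES.get(code, f"option-{code}"))
        offset = end
    return message, options
```

A Solicit carrying IA_PD means the router is asking for a delegated prefix, while IA_NA means it wants an address for itself. Either way, the result tells you what has to be delivered.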

tcpdump is useful for all of that, but the pppd debug option may be even simpler (it logs the initial NCP configuration packets). You could perhaps patch pppd to sort this out at that layer already, but I'd wager that it might already be handled. Without more information, this is as much as I (or anyone else, I suspect) can help.