LibreQoE / LibreQoS

A Quality of Experience and Smart Queue Management system for ISPs. Leverage CAKE to improve network responsiveness, enforce bandwidth plans, and reduce bufferbloat.
https://libreqos.io/
GNU General Public License v2.0

Tracking issue: UISP integration and complex setups, relative to BracketQoS #140

Closed thebracket closed 4 months ago

thebracket commented 2 years ago

Setting this up as a tracking issue while I poke at UISP integration a bit. My intent is to gradually fix these issues and offer them up to LibreQoS (rather than just handing out my Rust-based tool and keeping it updated separately).

My "playpen" for working on this is here: https://github.com/thebracket/LibreQoS/tree/uisp-integration . My intent is to tackle issues there, and then turn them into merge-able PRs for Libre. Oh, and I grabbed a copy of this book from my publisher (I get e-books super-cheap since I work for them) and re-learned Python. :-)

One of the areas that took a lot of work with BracketQoS was getting our UISP setup to work with it. We run a mixed vendor network, with UISP handling billing/CRM even for the parts that are running Cambium, Mimosa, Mikrotik and a few others. When I run the 1.3 UISP integration script against our network:

Once those were out of the way, I still ran into some issues:

Niceties I'd like to try and arrange:

rchac commented 2 years ago

My intent is to gradually fix these issues and offer them up to LibreQoS (rather than just handing out my Rust-based tool and keeping it updated separately).

Thank you!

Some choice of topology. Bracket lets you pick "flat" (every customer parented off the root), "AP only" (APs are a top layer), "Site only" (sites are top level entries and every customer feeds off of the site) and "full" (which builds a complete topology graph between sites and maps the entire network).

That makes sense. I think adding a "flat" option would be great.
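The four topology choices quoted above can be read as four ways of picking a customer's shaping ancestors. The following is a minimal sketch of that idea only; the `Topology`, `Customer`, and `parent_chain` names and the `site_parents` mapping are hypothetical and are not taken from BracketQoS or LibreQoS.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Topology(Enum):
    FLAT = "flat"            # every customer parented off the root
    AP_ONLY = "ap_only"      # APs are the only layer above customers
    SITE_ONLY = "site_only"  # sites are the only layer above customers
    FULL = "full"            # AP, its site, and every upstream site


@dataclass
class Customer:
    name: str
    ap: Optional[str]    # access point serving the customer (assumed field)
    site: Optional[str]  # site hosting that access point (assumed field)


ROOT = "root"


def parent_chain(customer: Customer, mode: Topology,
                 site_parents: dict[str, str]) -> list[str]:
    """Return the customer's shaping ancestors, nearest parent first."""
    if mode is Topology.FLAT:
        return [ROOT]
    if mode is Topology.AP_ONLY:
        return [customer.ap, ROOT] if customer.ap else [ROOT]
    if mode is Topology.SITE_ONLY:
        return [customer.site, ROOT] if customer.site else [ROOT]
    # FULL: start at the AP, then walk site-to-site links up to the root.
    chain = [customer.ap] if customer.ap else []
    site = customer.site
    seen: set[str] = set()
    while site and site not in seen:  # the seen set guards against loops
        seen.add(site)
        chain.append(site)
        site = site_parents.get(site)
    chain.append(ROOT)
    return chain
```

For example, a customer on `AP-1` at `Tower-A`, where `Tower-A` feeds from `Core`, gets `["root"]` in flat mode and `["AP-1", "Tower-A", "Core", "root"]` in full mode.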

  • The bane of my existence, relays always break topology. (A "relay" being a customer fed via another customer). BracketQoS occasionally fails on these. I swear my colleagues come up with new and interesting topologies to install every time I take a day off.

My solution has been to create a UISP site for each repeater PoP and have the host household as a client of that site. It's flexible and allows operators to have complex relays with multiple APs and such. Is this a reasonable workaround? If not we can try to have it better accommodate these relay site cases.


Suspended customers. One thing we found useful with Preseem - and ported over to our version of BracketQoS - was the ability to set a "suspended customers get this much Internet" option. We'd pick a low number, so their service sucked rather than being off altogether (helpful if they have VoIP and you don't want to cut off 911, and if your "pay your bill!" page is offsite)

Hm, I just assumed suspension would be handled separately (we do redirect to payment portal via MikroTik) so I excluded suspended subscribers from even being shaped. This makes sense and wouldn't be that hard to implement. I think this is a good idea.
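One way the "suspended customers get this much Internet" option could work is to keep suspended subscribers in the shaper and cap them at a configured rate, rather than excluding them. This is a sketch under assumptions: the `Subscriber` fields, the `shaped_rates` function, and the 1 Mbps default are all hypothetical, not the integration's actual code or settings.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    name: str
    download_mbps: float
    upload_mbps: float
    suspended: bool = False  # assumed to come from the CRM's service status


def shaped_rates(sub: Subscriber, suspended_mbps: float = 1.0):
    """Return (download, upload) in Mbps to hand to the shaper."""
    if not sub.suspended:
        return sub.download_mbps, sub.upload_mbps
    # Never raise a plan above what the customer was paying for.
    return (min(sub.download_mbps, suspended_mbps),
            min(sub.upload_mbps, suspended_mbps))
```

Using `min` means a suspended customer on a plan already slower than the cap keeps the plan's rate instead of being bumped up to the cap.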

dtaht commented 2 years ago

Hilariously, I run out of bandwidth on cellular all the time; they actually rate limit it to about 2 Mbit/s with sane buffering, and with cake in the way on my USB tether, I hardly notice. Videoconferences still "just work"; web pages get slow, but I don't use the web much.

thebracket commented 2 years ago

Suspension is an odd one. We work with a third party who provides VoIP to some of our customers, and they were pretty insistent on allowing 911 calls even if the Internet service is suspended. So we do the redirect also, but only for web traffic. (@dtaht would be able to do most things that weren't the web, and is smart enough to open a VPN... we don't block that, right now, so he'd have free service until our installer shows up for the gear... it's not perfect, but it's working)

The "site" model for relays is how you should do it, and we used to do it that way. We have something like 75 site-to-site relays now, and it became really unwieldy. So we have a bunch of client sites linked to other client sites. It's pretty ridiculous, but if I don't support it I get grumbles from down the hall...

A funny one. So a non-profit gets a big circuit from us. Easy - client site off of a tower. They realize that they really should be two non-profits and put up a building on the same site - which just happens to be inaccessible due to terrain. So now there's a relay from charity 1 to charity 2. Initially in the same client site because Charity 1 wanted to pay for it all. Of course, time passes and Charity 1 is complaining that Charity 2 are using all their bandwidth so they've agreed to pay for their own. No biggie, now Charity 2 is a client site - with its own bandwidth tracking. Another charity (they tend to cluster) sets up shop next to Charity 2, and want a relay too. So now Charity 1 has a site with 3 client sites coming off of it. And it just keeps going. There's something like 5 charities, 2 of the manager's houses, a church and a barn all linked up - sometimes daisy chained. Ugh.

Edit: forgot to mention that they are all in a bowl-shaped valley with conservation department rules prohibiting tower construction.
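Daisy-chained client sites like these mean a site's parent may itself be a client site, so whatever builds the hierarchy has to place parents before children and notice if two sites ever point at each other. A minimal sketch of that ordering, assuming a hypothetical `parents` mapping of site name to parent site name (`None` for a top-level site); this is not how BracketQoS or the UISP integration actually does it:

```python
from __future__ import annotations

from typing import Optional


def order_sites(parents: dict[str, Optional[str]]) -> list[str]:
    """Return sites ordered so that every parent precedes its children."""
    ordered: list[str] = []
    placed: set[str] = set()

    def place(site: str, trail: list[str]) -> None:
        if site in placed:
            return
        if site in trail:
            # A site reached itself through its own relays.
            raise ValueError("relay loop: " + " -> ".join(trail + [site]))
        parent = parents.get(site)
        if parent is not None:
            place(parent, trail + [site])
        placed.add(site)
        ordered.append(site)

    for site in parents:
        place(site, [])
    return ordered
```

A parent that is referenced but has no entry of its own is treated as top-level, which keeps one missing record from dropping a whole chain of relays.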

rchac commented 2 years ago

I feel you there, building towers is pretty much a no-go where we are thanks to zoning, though we are considering OTARD hub towers to skirt around that. Tower construction limitations make these complex repeater setups inevitable. Given how many existing sites are already set up in UISP like that for your network, let's accommodate them going forward. =)

dtaht commented 2 years ago

I think cake so saves your bacon on each hop here... but I imagine it is all NAT hell?

thebracket commented 2 years ago

Not really NAT hell. There's a router at each site with links to other sites, with a "customer" port that provides connectivity to the customer. The routers relay DHCP requests from each router (adding option 82 data on the way) to ensure that whatever gets plugged into the customer's port receives the correct public IP.

I really should open source our "make option 82 work with UISP" setup, one day. In any client site, we set up an "other" device with a MAC address (equal to the port providing service's MAC), the name "Service IP" and the intended IP address as the device's address. A program periodically reads UISP and builds a DHCP configuration (ye olde isc-dhcpd) and hot-reloads it when it changes. Combine that with Bracket assigning queues to the customer and it's really seamless. Whatever the customer plugs in gets the right IP, and is shaped appropriately. There's even a small pool of IPs for each area into which "we've no record of you existing" devices get dumped (with short lease times) and redirect to a page reminding our installer to finish the process.
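The "build a configuration and reload only when it changes" loop described above can be sketched as follows. Everything here is an assumption for illustration: the `ServiceIp` record, the `render` and `sync` functions, and the `reload` callback are hypothetical, and matching the port MAC against the `agent.remote-id` sub-option is a guess at one possible layout, not the author's actual setup. The pools that hand out each address would live inside the relevant subnet declaration and are omitted.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable


@dataclass(frozen=True)
class ServiceIp:
    client_site: str
    port_mac: str  # MAC of the port providing service, e.g. "aa:bb:cc:dd:ee:ff"
    ip: str        # intended address for whatever is plugged into that port


def render(entries: Iterable[ServiceIp]) -> str:
    """Render one dhcpd class per entry, in a stable order."""
    blocks = []
    for e in sorted(entries, key=lambda e: (e.client_site, e.port_mac)):
        name = "svc-" + e.port_mac.replace(":", "")
        blocks.append(
            f"# {e.client_site} -> {e.ip}\n"
            f'class "{name}" {{\n'
            # Assumption: the relay puts the port MAC in the remote-id.
            f"  match if option agent.remote-id = {e.port_mac};\n"
            f"}}\n"
        )
    return "\n".join(blocks)


def sync(entries: Iterable[ServiceIp], path: Path,
         reload: Callable[[], None]) -> bool:
    """Write the config and call reload() only if the text changed."""
    new_text = render(entries)
    old_text = path.read_text() if path.exists() else None
    if new_text == old_text:
        return False
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(new_text)
    tmp.replace(path)  # swap the file in one step so readers never see half of it
    reload()
    return True
```

Sorting the entries before rendering matters: without it, the same data returned in a different order would look like a change and trigger a needless reload.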

dtaht commented 2 years ago

What y'all do is so different than my second generation attempt in 2008. I wish I'd published it. I had had great pain with PPPoE in my first generation network, and said screw it, used static IPv6 /48 as my underlying transport, allowed service or not based on the underlying radio MAC address, tunneled IPv4 under that, and split bandwidth up evenly (or so I thought) via SFQ. It was a minimum amount of service (5 Mbit) up to whatever was available, flat rate (well, I soaked the gringos and intended to subsidize the schools).

It was all you can eat, no complicated shaping needed. The CPE did their own DHCP for IPv4. Of course, no billing systems or decent shaping systems existed at the time either!

thebracket commented 2 years ago

In my testbed, commit https://github.com/thebracket/LibreQoS/commit/5b57b9a8017b111377fee88a42df6ffa091d227d contains a bit more work on this:

dtaht commented 1 year ago

@rchac @thebracket it looks like you have covered most of this. What haven't you covered?

thebracket commented 1 year ago

BracketQoS obsessively puts every single infrastructure device into the device list, shared as per-site "infrastructure" entries. It might be worth porting that. Otherwise, the current implementation is better than the original BracketQoS setup.

thebracket commented 1 year ago

"Infrastructure" items (which may or may not be a good idea) and good support-oriented long-term stats retention are the only remaining items on this. I don't think either is a 1.4 issue, so I'm changing the milestone.

bile0026 commented 1 year ago

+1 for "suspension" feature. Must have for my network.