Open awlx opened 3 years ago
Freifunk Braunschweig has been working on such a net for some time now. The codename is Parker (not BATMAN, but still something with nets): https://freifunk-bs.de/parker.html Our small test network is online and works like a charm: https://freifunk-bs.de/parker.html
We are currently working on making the migration path from classic Gluon to Parker smooth. Afterwards we want to migrate our domain to that new technology.
We have talked a bit about it back in 2018: https://stratum0.org/blog/posts/2018/11/22/freifunk-parker/
If you are interested, we would like to share our results, architecture and code with you. One option would be our weekly meetup (Wednesdays, 19:00, see freifunk-bs.de), or let's schedule a conference :)
Cheers Chris
Hi @SmithChart that sounds great :). I already read about your project in the past.
Nice and interesting post! (I hope it is okay, that I am posting here)
IP address conflicts while roaming
I have been wondering for some time how relevant this is....
@NeoRaider always emphasized that roaming from any node in the network to any other node in the network must be possible. I don't remember the reasons and his explanations in detail.
I for one wonder if this is even a relevant case. For now, I would guess that devices are only roaming within the local mesh cloud in most cases. But this is only a gut feeling, since I haven't seen any evaluation of the probability of such "long distance" roaming behavior occurring yet. Perhaps @T-X knows of such evaluations? I think we discussed some time ago that such evaluations would be nice (in context of the batman translation table crc bug and batadv-scapy or something?).
However, I would love to see this and am looking forward to hearing your reports.
Cheers lemoer
At least for our (legacy) network we usually do not see meshes that do not see each other but allow clients to roam between the SSIDs. Our net just is not dense enough.
As long as both meshes have at least one mesh link, they are a single BATMAN domain and thus traffic will be routed to the correct router. Especially for IPv6, the client will see advertisements from the next closest router (since we use radvfilterd to align the BATMAN IPv4 filter with the IPv6 filters) and traffic will gradually shift to the closer router.
The last piece of our puzzle is really short lease times and frequent RAs.
That's also what we see here: if there is no mesh, roaming basically does not happen anyway, because by the time the client arrives at the other AP it needs to request a new lease (the lease time is 5 min in our case) anyway. So this is not really a concern for us.
Also, some people have raised the argument of cross-client traffic, which is basically never observed in our meshes or even domains. Most people just want to go to the internet or connect to a Spotify device right next to them, which will still be possible as one mesh shares the broadcast domain.
IP address conflicts while roaming
Regarding this point and adding to @lemoer's comment. IEEE 802.11-2016 defines an ESSID as:
[...] The key concept is that the ESS appears the same to an LLC layer as an IBSS. STAs within an ESS can communicate and mobile STAs might move from one BSS to another (within the same ESS) transparently to LLC. [...]
Or on Wikipedia:
Extended service set: [...] It is a set of one or more infrastructure basic service sets on a common logical network segment (i.e. same IP subnet and VLAN).
https://en.wikipedia.org/wiki/Service_set_(802.11_network)#Extended_service_set
So in practice, a wpa-supplicant for instance will roam from one AP to another with the same ESSID as it thinks fits (e.g. by signal strength). But wpa-supplicant will not re-negotiate DHCP upon roaming, as by definition of an ESSID it assumes it is still the same broadcast domain.
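To make the conflict concrete, here is a minimal sketch of what happens when a client keeps its old address after roaming to an AP with the same ESSID but a different IP range. The function name and the two example subnets are assumptions for illustration only; they are not taken from any Freifunk configuration.

```python
import ipaddress


def lease_valid_after_roam(client_ip: str, new_ap_subnet: str) -> bool:
    """Return True if the address the client kept is on-link at the new AP."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(new_ap_subnet)


# Hypothetical ranges: node A hands out 10.80.1.0/24, node B hands out 10.80.2.0/24.
print(lease_valid_after_roam("10.80.1.23", "10.80.1.0/24"))  # True: same node or same local mesh
print(lease_valid_after_roam("10.80.1.23", "10.80.2.0/24"))  # False: stale address after roaming
```

In the second case the client would keep using an address that the new node does not route until its lease runs out, which is why short lease times matter in a routed setup.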
@awlx
That's also what we see here, if there is no mesh roaming basically does not happen anyway.
One counter example is an intersection of two streets:
(a)
======= cl =======
||
||
||
|| (b)
You might have a node in one street and another one in a crossing street (a+b). These two do not see and therefore do not mesh with each other as there are big buildings in between. However a user (cl) with a mobile device can see both APs when it is at the intersection.
And the client device will be forced to roam when moving from one street into the other and by that moving out of sight of the AP it was initially connected to.
The question is how often this is really the case. But that is probably very different depending on the community.
Is it possible (and how complex would it be) to signal the premature end of the lifetime to a client if it tries to do something with a wrong address?
Is it possible (and how complex would it be) to signal the premature end of the lifetime to a client if it tries to do something with a wrong address?
For IPv6 sending unsolicited Router Advertisements with a reduced (zero?) lifetime or adjusted default router preference (RFC4191) should be possible. For DHCP there seems to be a "DHCP reconfigure extension" (RFC3203) but not sure how widely it is implemented on the client side. Also that in turn seems to require "Authentication for DHCP Messages" (RFC3118) which might make this a bit more complicated in a distributed, multi-party network.
In theory there is/was also 802.11f (Inter-Access Point Protocol). But it seems it was withdrawn in 2006? (unless I'm reading the Wikipedia and IEEE timeline wrong)
https://tools.ietf.org/html/rfc4191 https://tools.ietf.org/html/rfc3203 https://tools.ietf.org/html/rfc3118 https://en.wikipedia.org/wiki/Inter-Access_Point_Protocol https://grouper.ieee.org/groups/802/11/Reports/802.11_Timelines.htm
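As a rough illustration of the IPv6 option, the following sketch builds only the fixed 16-byte Router Advertisement header with a chosen router lifetime and RFC 4191 preference. It sends nothing; the function name is hypothetical, the checksum is left at zero on the assumption that the sending socket fills it in, and a real RA would normally also carry options such as prefix information.

```python
import struct

ICMPV6_ROUTER_ADVERT = 134

# Default router preference values from RFC 4191 (2-bit field in the flags byte).
PRF_HIGH = 0b01
PRF_MEDIUM = 0b00
PRF_LOW = 0b11


def build_ra_header(router_lifetime: int,
                    preference: int = PRF_MEDIUM,
                    hop_limit: int = 64) -> bytes:
    """Build the fixed part of an ICMPv6 Router Advertisement (no options)."""
    if not 0 <= router_lifetime <= 0xFFFF:
        raise ValueError("router lifetime must fit in 16 bits")
    if router_lifetime == 0:
        # RFC 4191: with a zero lifetime the preference must be sent as 00.
        preference = PRF_MEDIUM
    flags = (preference & 0b11) << 3  # M, O and H flags left unset
    return struct.pack(
        "!BBHBBHII",
        ICMPV6_ROUTER_ADVERT,  # type
        0,                     # code
        0,                     # checksum (assumed to be filled in by the sender)
        hop_limit,             # current hop limit
        flags,
        router_lifetime,       # seconds; 0 means "do not use me as default router"
        0,                     # reachable time (unspecified)
        0,                     # retransmission timer (unspecified)
    )


withdraw = build_ra_header(router_lifetime=0)
deprioritise = build_ra_header(router_lifetime=1800, preference=PRF_LOW)
```

`withdraw` corresponds to the "zero lifetime" variant and `deprioritise` to the "adjusted preference" variant mentioned above.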
Shouldn't it also be possible for an AP to terminate the WiFi connection, to force the client to reconnect and send a new DHCP request? As far as I understand, there is a Disassociation frame as well as a Deauthentication frame available in 802.11 to achieve this.
https://mrncciew.com/2014/10/11/802-11-mgmt-deauth-disassociation-frames/
You need to analyse the traffic of the client then and watch out for wrong leases and stuff.
For v4, why do you want to make NAT and DHCP on the node? Why not keep it on the gateway?
To sum up some discussion on chat.ffmuc.net on this topic:
Both NAT66 and rotating prefixes are a bad idea if you want to run your own services on the internet: they then require dyndns, and things could break during rollover. I wouldn't recommend using them, as it's a step backwards and not forward.
I am also not sure what the implication is of saying "a node with key x and address space y was connected" as it's clearly used for Freifunk and nobody knows who was connected. But that's something a lawyer has to decide.
Technically, the more NAT, the more broken the network. We should treat IPv6 as a first-class citizen and not break it as badly as IPv4.
I don't get the "run own services" argument - we are talking about public hotspots to provide internet access, I would not expect and not want anyone to offer services over my Freifunk node.
For v4, why do you want to make NAT and DHCP on the node? Why not keep it on the gateway?
If we NAT on the gateway we need routable prefixes to the Client network ... which means we need to hand out big enough networks and stuff.
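A back-of-the-envelope sketch of why routed client prefixes get large quickly: given a number of nodes and a per-node subnet size, it computes the smallest pool that holds one subnet per node. The node count and the /24 per node are hypothetical inputs, not figures from this thread.

```python
def required_pool_prefix(node_count: int, per_node_prefix: int) -> int:
    """Smallest IPv4 pool (as a prefix length) with one per-node subnet for every node."""
    if node_count < 1:
        raise ValueError("need at least one node")
    if not 0 <= per_node_prefix <= 32:
        raise ValueError("per-node prefix must be between 0 and 32")
    # Bits needed to number the nodes, rounded up to the next power of two.
    node_bits = (node_count - 1).bit_length()
    pool_prefix = per_node_prefix - node_bits
    if pool_prefix < 0:
        raise ValueError("does not fit into the IPv4 address space")
    return pool_prefix


# Hypothetical numbers: 2000 nodes, each with its own routed /24 for clients.
print(required_pool_prefix(2000, 24))  # 13, i.e. a /13 pool
```

With NAT on the node, the same per-node range can be reused everywhere, so no such pool has to be routed.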
I don't get the "run own services" argument - we are talking about public hotspots to provide internet access, I would not expect and not want anyone to offer services over my Freifunk node.
Freifunk was at one time meant to enable such stuff: you can just provide things and everyone can connect. That's part of the original idea.
I don't get the "run own services" argument - we are talking about public hotspots to provide internet access, I would not expect and not want anyone to offer services over my Freifunk node.
Freifunk was at one time meant to enable such stuff: you can just provide things and everyone can connect. That's part of the original idea.
I think we should differentiate between people actively taking part in the Freifunk network (node owners) who should be able to do so, and random anonymous WiFi users, where I don't see any benefit, but a lot of risk, if they get stable IP addresses and can host stuff on the network.
If someone wants to avoid all risk they shouldn't choose to become part of the Freifunk network. We are not here to enable illegal activity, for that other services exist.
What we want to achieve is to protect people who provide access to the network (node owners) from being sued for copyright infringement and other stuff the users did. And this is still the case, it doesn't mean we protect everyone connecting to the network from this.
What we want to achieve is to protect people who provide access to the network (node owners) from being sued for copyright infringement, which they didn't do. And this is still the case.
I doubt that this is still the case, if a copyright infringement can easily be traced back to the originating node.
But it doesn't say that the person who runs the node did it. As they just provide an open access network. But as said, that's for a lawyer to decide.
Why not leave the decision to the person running the node, whether he wants a stable /64 or not?
Because that will introduce much overhead for us, on which basis should the /64 be chosen? How long does the node keep it? Our server still has the logs who got which network.
It's also possible to just log the B.A.T.M.A.N. claim table over time and ask for MAC addresses. Those next hops are known to us at any time, because that's how it all works. Then we can also hand out the IP address of the node owner ... because it's clear which one it is.
It's also possible to just log the B.A.T.M.A.N. claim table over time and ask for MAC addresses. Those next hops are known to us at any time, because that's how it all works. Then we can also hand out the IP address of the node owner ... because it's clear which one it is.
But we don't do this kind of logging - for a good reason.
We don't, but do we know who does? No, we don't. The layer 2 is riskier than anything else: traffic can also just be redirected through other nodes without us even noticing.
I fully agree that we should switch from the giant L2 to a routed approach and use BATMAN only where it belongs, in the local WiFi meshes. All I am asking for is that nodes should be able to decide how often they request a new /64 from the gateway, instead of getting a fixed one.
But which pool? For how long do we mark /64 as stale and not usable for others?
I don't see any benefit here only operational overhead, looking at Freifunk Franken who also do fixed /64 and it works. @fblaese
Also we then have to do the same for IPv4, as any WebRTC call will leak all addresses ... the NAT address of the Node as well as external NAT, as well as internal Pool.
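To make the "which pool, how long stale" questions concrete, here is a minimal in-memory sketch of a rotating /64 pool in which a released prefix is quarantined for a fixed time before it can be handed out again. The class name, the documentation prefix 2001:db8::/62 and the one-hour quarantine are assumptions for illustration; this is not how any gateway currently assigns prefixes.

```python
import ipaddress
from collections import deque


class RotatingPrefixPool:
    """Hands out /64 prefixes; released ones stay stale for a while before reuse."""

    def __init__(self, pool: str, stale_seconds: float):
        # Lazily enumerate all /64 subnets of the pool.
        self._unused = ipaddress.ip_network(pool).subnets(new_prefix=64)
        # (release_time, prefix), oldest first; release times are assumed to be non-decreasing.
        self._stale = deque()
        self._stale_seconds = stale_seconds

    def allocate(self, now: float) -> ipaddress.IPv6Network:
        # Reuse the oldest released prefix once its quarantine has elapsed.
        if self._stale and now - self._stale[0][0] >= self._stale_seconds:
            return self._stale.popleft()[1]
        try:
            return next(self._unused)
        except StopIteration:
            raise RuntimeError("pool exhausted, all prefixes in use or still stale") from None

    def release(self, prefix: ipaddress.IPv6Network, now: float) -> None:
        self._stale.append((now, prefix))


pool = RotatingPrefixPool("2001:db8::/62", stale_seconds=3600)
first = pool.allocate(now=0)
pool.release(first, now=10)
second = pool.allocate(now=20)    # `first` is still stale, so a fresh /64 is used
reused = pool.allocate(now=3700)  # quarantine elapsed, `first` may be reused
```

The sketch also shows the overhead being discussed: the shorter the rotation interval and the longer the stale time, the more /64s the pool needs per node.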
Oh, I see a clear benefit, which is protection of node owners from liability for things third parties are doing using their node for internet access.
Should make 0 difference as said ... it's not trackable either.
But before ... we de-rail this thread even more from technical standpoints to only meta discussions, we should first try if this approach would even work ... maybe it makes no sense at all from a technical stand-point and this whole discussion was unnecessary.
So the best thing would be if someone tests the technical aspects and proves that the idea is possible.
Maybe a good thing for @ce-4, @lqb and @goligo to play with ... as it's a good chance to learn and some want to get rid of B.A.T.M.A.N. traffic on the Unifi controller. Also this will lead to a deep understanding of Gluon, Gateways and B.A.T.M.A.N.
I don't see any benefit here only operational overhead, looking at Freifunk Franken who also do fixed /64 and it works. @fblaese
Up until now we haven't had any issues with static address assignments. For convenience, we assign IPv6 prefixes anonymously (see https://sub.f3netze.de), which are then announced in our babel network by router operators. There is no guarantee that a prefix that is currently announced by a router has always been located there. We also do not log announcements (anybody could, though), so from a liability standpoint this should mostly be equivalent to B.A.T.M.A.N. advanced networks.
@awlx let's meet on meet.ffmuc.net. I want to be sure we are talking about the same thing.
We can discuss this in https://chat.ffmuc.net/freifunk/channels/noc. And should work async on this.
Our super-repo is now up to date: https://gitli.stratum0.org/ffbs/ffbs-gluon Here is our site: https://gitli.stratum0.org/ffbs/ffbs-site The interim state of the packages is here: https://github.com/SmithChart/community-packages/tree/topic/parker And the current gluon-parker base state: https://github.com/ffbs/gluon-parker/
The in-person meeting will take place on October 19/20. FFMUC Parker firmware: https://github.com/freifunkMUC/site-ffm/tree/parker FFBA on Mumble: http://telmir.stratum0.org/ Meeting notes: https://pad.stratum0.org/p/freifunk_20240717_parker
Very nice to see the progress and collaboration on this! Some sort of clustering is definitely a great way to increase scalability. Four things I would be very interested in (does not need to be answered / discussed here, but I would love to read more about them in some meeting notes, FAQs or test results in the future)
1) Any plans to integrate DDHCP maybe? Would that allow using smaller IPv4 prefixes for ffmuc? (The pad says ffmuc would need a /10.)
2) What happens if the two nodes with an uplink only have a (temporarily?) bad WiFi mesh connection? Or even if they had, for instance, a stable 1 MBit/s throughput over WiFi mesh, would the WiFi then still always be preferred, even if there were in theory a 1 GBit/s fiber available over a mesh-vpn? (I'm wondering if it could make sense to have batman-adv not between all nodes of a domain, but at least between these uplink nodes that share the same WiFi mesh?)
3) Has anything changed with the roaming situation on modern cellphone operating systems? (Maybe Android and iPhones got more clever regarding when to get a new IP address? Does anyone have any current experience with the roaming behaviour between APs with the same ESSID but differing IP ranges?)
4) This (for now) seems to be a bit incompatible/divergent with the multicast related progress? Though maybe it wouldn't be that difficult to integrate/add later. Maybe just adding pim6sd and enabling it on uplink nodes (or just one uplink node in the local mesh, to avoid redundant multicast streams due to RFC4541 to each uplink node?) would mostly be enough.
(I know, this might be a bit of an "opinionated" topic, and I agree that especially if a protocol uses multicast just like broadcast, it does not scale well and will need more field testing. But I think generally there was a lot of progress on this in the last 10 years: more vendors understanding/implementing RFC4541, Linux bridge multicast snooping finally works after ~4+ years of bugfixing, it seems Android has finally fixed their MLD firewall bug and there is a workaround for it in Gluon, there are now Forward-Error-Correction / RaptorQ RFCs for RTP with support in gstreamer, batman-adv now uses IGMP/MLD snooping and has a new multicast packet type, there is a low MLD "noise" implementation in Gluon, and routable multicast support in batman-adv (+Gluon, upstreaming WIP). And I still love this concept in general to avoid needing big, central content servers, to enable "the small people" too to reach many people, without needing to pay the big servers directly or via ads :-) )
This is a draft idea how we could switch FFMUC to a routed approach without losing functionality of B.A.T.M.A.N. for local meshes.
Problem statement
We want to switch Freifunk Munich to a routed approach towards the gateways, because large layer2 domains pose too many problems. Also we want to get rid of the overhead of VXLAN and B.A.T.M.A.N. towards the gateways.
Idea
Use wireguard to connect to the Freifunk Munich gateways
Inside wireguard use a calculated link-local address which is derived from the public key
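A minimal sketch of how such a key-derived address could be computed. The draft does not say which derivation is used, so everything here is an assumption: the function name is hypothetical, and the choice of SHA-256 with the first 8 digest bytes as the interface identifier under fe80::/64 is only one possible scheme.

```python
import base64
import hashlib
import ipaddress


def link_local_from_pubkey(pubkey_b64: str) -> ipaddress.IPv6Address:
    """Derive a stable fe80::/64 address from a base64-encoded WireGuard public key."""
    raw = base64.b64decode(pubkey_b64, validate=True)
    if len(raw) != 32:
        raise ValueError("WireGuard public keys are 32 bytes long")
    # Assumption: hash the key and use the first 64 bits as interface identifier.
    interface_id = hashlib.sha256(raw).digest()[:8]
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


# Dummy key for illustration only (32 bytes counting up from zero).
example_key = base64.b64encode(bytes(range(32))).decode()
print(link_local_from_pubkey(example_key))
```

Because node and gateway can both compute the address from the public key alone, no address has to be configured or negotiated before the tunnel is up.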
v6
v4
Meshing
Why not babel?
What needs to be done?
Possible issues
Known Issues
Glossary
Discussion
https://chat.ffmuc.net/freifunk/channels/firmware
Comments welcome! 🚀