elonen / lanscatter

Efficient large files distribution app for Local Area Networks
Apache License 2.0

Implementing switch group optimization #2

Open elonen opened 4 years ago

elonen commented 4 years ago

Idea for reducing traffic on network bottlenecks such as Ethernet switches (and likewise VPNs, routers and firewalls):

Add a user-configurable string called peer group, and make the planner favor peer nodes in the same group. The optimization might be significant if there are many peers and large switches.

For routers this could be autodetected (e.g. by traceroute), but for switches a manually configured group name seems to be the only way.

The group name could even be hierarchical, domain name style (e.g. room-b.office-west.building1) to reduce congestion on backbone switches by further minimizing hops when possible.
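
A minimal sketch of how hierarchical group names could be compared, assuming the outermost level is the last dot-separated component (as in domain names). The function names `shared_levels` and `rank_sources` are made up for illustration and are not part of Lanscatter:

```python
from __future__ import annotations


def shared_levels(group_a: str, group_b: str) -> int:
    """Count how many trailing (outermost) name components two groups share."""
    a = group_a.split(".")[::-1] if group_a else []
    b = group_b.split(".")[::-1] if group_b else []
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def rank_sources(downloader_group: str, candidates: dict[str, str]) -> list[str]:
    """Sort candidate peers so those sharing the most group levels come first.

    `candidates` maps a (hypothetical) peer id to its group name.
    Ties are broken by peer id to keep the result deterministic.
    """
    return sorted(
        candidates,
        key=lambda peer: (-shared_levels(downloader_group, candidates[peer]), peer),
    )
```

A planner could then try upload sources in that order, so a peer in `room-b.office-west.building1` would prefer its own room, then the rest of `office-west`, then the rest of `building1`.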

elonen commented 4 years ago

Turns out autodetecting switch topology would be possible, using LLDP over SNMP. Perhaps with https://github.com/alessandromaggio/quicksnmp or the lower level https://github.com/etingof/pysnmp. Notes:

FOR SWITCH TOPOLOGY DISCOVERY:
------------------------------

These can be used to build a tree (graph) of interconnected switches:

LLDP subtree:
1.0.8802.1.1.2.1.4.1.1
(e.g. snmpwalk -v 2c -c public 192.168.0.4  1.0.8802.1.1.2.1.4.1.1.7)

lldpLocalSystemData:
1.0.8802.1.1.2.1.3.1 (lldpLocChassisIdSubtype)
1.0.8802.1.1.2.1.3.2 (lldpLocChassisId)             = MAC
1.0.8802.1.1.2.1.3.3 (lldpLocSysName)               = user-supplied name
1.0.8802.1.1.2.1.3.4 (lldpLocSysDesc)               = vendor-supplied name (model etc)
1.0.8802.1.1.2.1.3.5 (lldpLocSysCapSupported)
1.0.8802.1.1.2.1.3.6 (lldpLocSysCapEnabled)
1.0.8802.1.1.2.1.3.7 (lldpLocPortTable)
1.0.8802.1.1.2.1.3.8 (lldpLocManAddrTable)

lldpRemEntry:
1.0.8802.1.1.2.1.4.1.1.1 (lldpRemTimeMark)
1.0.8802.1.1.2.1.4.1.1.2 (lldpRemLocalPortNum)
1.0.8802.1.1.2.1.4.1.1.3 (lldpRemIndex)
1.0.8802.1.1.2.1.4.1.1.4 (lldpRemChassisIdSubtype)
1.0.8802.1.1.2.1.4.1.1.5 (lldpRemChassisId)          = MAC
1.0.8802.1.1.2.1.4.1.1.6 (lldpRemPortIdSubtype)
1.0.8802.1.1.2.1.4.1.1.7 (lldpRemPortId)
1.0.8802.1.1.2.1.4.1.1.8 (lldpRemPortDesc)
1.0.8802.1.1.2.1.4.1.1.9 (lldpRemSysName)
1.0.8802.1.1.2.1.4.1.1.10 (lldpRemSysDesc)
1.0.8802.1.1.2.1.4.1.1.11 (lldpRemSysCapSupported)
1.0.8802.1.1.2.1.4.1.1.12 (lldpRemSysCapEnabled)
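
As a rough sketch of the "tree (graph) of interconnected switches" idea: assuming the lldpRemLocalPortNum and lldpRemChassisId values have already been fetched from each switch (the SNMP querying itself is left out), the rows can be folded into an adjacency map. The input layout and function names are assumptions for illustration only:

```python
def build_switch_graph(neighbors):
    """Fold LLDP neighbor rows into {chassis_id: {local_port: remote_chassis_id}}.

    `neighbors` is assumed to look like
    {local_chassis_id: [(lldpRemLocalPortNum, lldpRemChassisId), ...]}.
    Remote devices that were not queried themselves still get an (empty) entry.
    """
    graph = {}
    for local, rows in neighbors.items():
        ports = graph.setdefault(local, {})
        for port, remote in rows:
            ports[port] = remote
            graph.setdefault(remote, {})
    return graph


def inter_switch_links(graph):
    """Return the set of undirected links as frozensets of two chassis ids."""
    return {
        frozenset((a, b))
        for a, ports in graph.items()
        for b in ports.values()
        if a != b
    }
```

Note that an LLDP neighbor is not necessarily a switch (phones and some workstations also speak LLDP), so a real implementation would probably want to filter on lldpRemSysCapEnabled as well.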

FOR ENDPOINT / WORKSTATION DISCOVERY:
------------------------------------
(also see https://www.dei.isep.ipp.pt/~asc/tiny-papers/snmp-mib.pdf)

switch interface names:
1.3.6.1.2.1.31.1.1.1.1 

ARP
macs: 1.3.6.1.2.1.3.1.1.2 (last four numbers of key = IP, value = MAC)

map VLAN + MAC to switch port:
1.3.6.1.2.1.17.7.1.2.2.1.2 (key = vlan.mac, value = port number)

Physical port type:
1.3.6.1.2.1.26.2.1.1.3 (values are documented in dot3MauType specs, 1.3.6.1.2.1.26.4)
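
Since the IP, VLAN and MAC are encoded in the OID keys rather than in the values, they need to be decoded from the numeric suffix. A small sketch of that decoding, assuming the OIDs arrive as dotted-decimal strings (the helper names are hypothetical):

```python
ARP_OID = "1.3.6.1.2.1.3.1.1.2"
FDB_OID = "1.3.6.1.2.1.17.7.1.2.2.1.2"


def _suffix(oid, base):
    """Return the numeric components of `oid` that follow `base`."""
    oid = oid.lstrip(".")
    if not oid.startswith(base + "."):
        raise ValueError(f"{oid} is not under {base}")
    return [int(part) for part in oid[len(base) + 1:].split(".")]


def arp_key_to_ip(oid):
    """The last four numbers of an ARP table key are the IPv4 address."""
    parts = _suffix(oid, ARP_OID)
    if len(parts) < 4:
        raise ValueError(f"key too short: {oid}")
    return ".".join(str(p) for p in parts[-4:])


def fdb_key_to_vlan_mac(oid):
    """A forwarding table key is <vlan>.<six MAC octets in decimal>."""
    parts = _suffix(oid, FDB_OID)
    if len(parts) != 7:
        raise ValueError(f"unexpected key length: {oid}")
    vlan, mac = parts[0], parts[1:]
    return vlan, ":".join(f"{octet:02x}" for octet in mac)
```

For example, the key `1.3.6.1.2.1.17.7.1.2.2.1.2.1.0.17.34.51.68.85` would decode to VLAN 1 and MAC `00:11:22:33:44:55`.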

Algorithm to figure out which switch and port each workstation IP
is connected to would look something like this:

1) Recursively map out switches
2) Map IPs to MACs (either arp -a or use 1.3.6.1.2.1.3.1.1.2)
3) Lookup port numbers from VLAN + MAC on every switch (1.3.6.1.2.1.17.7.1.2.2.1.2)
4) Check the port against type and/or name on every switch (1.3.6.1.2.1.26.2.1.1.3 and 1.3.6.1.2.1.31.1.1.1.1)
5) If any of the switch MACs are found on that port, it's an inter-switch port. Otherwise, it's the endpoint for the workstation / peer.
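
Steps 3 and 5 above could be sketched roughly as follows, assuming the SNMP data has already been collected into plain dicts. Step 4 (checking port type and name) is omitted, and the sketch assumes the switch MACs that show up in forwarding tables are the same ones known from LLDP, which may not hold on all hardware. All names here are hypothetical:

```python
def locate_peers(ip_to_mac, fdb, switch_macs):
    """Map each workstation IP to the (switch, port) it is plugged into.

    ip_to_mac:   {ip: mac}                  (step 2)
    fdb:         {switch_id: {mac: port}}   (step 3, one table per switch)
    switch_macs: set of MACs that belong to switches (from step 1)
    """
    # Step 5, first half: a port on which any switch MAC has been learned
    # is treated as an inter-switch port.
    uplinks = {
        switch: {port for mac, port in table.items() if mac in switch_macs}
        for switch, table in fdb.items()
    }

    # Step 5, second half: the workstation's endpoint is the one port
    # where its MAC is seen and that is not an inter-switch port.
    located = {}
    for ip, mac in ip_to_mac.items():
        for switch, table in fdb.items():
            port = table.get(mac)
            if port is not None and port not in uplinks[switch]:
                located[ip] = (switch, port)
                break
    return located
```

IPs whose MAC is only ever seen on inter-switch ports are simply left out of the result.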

elonen commented 4 years ago

If switch hierarchy autodetection were implemented, it would probably be best to skip manual configuration by peer node users, and instead have the master node read a list of MAC->groupname pairs from a file. Something like:

a1:5a:35:8e:16:d5  sw203.sw210.swroot
12:d4:5f:eb:eb:01  sw203.sw210.swroot
37:03:0c:f8:9b:3a  sw204.sw210.swroot
df:2e:01:d9:81:86  sw103.sw110.sw210.swroot
d1:66:29:36:69:c9  sw102.sw110.sw210.swroot

...where switch names could be human readable like above, or maybe just sanitized switch ids straight from lldpRemChassisId.
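
Reading such a file would be straightforward. A minimal sketch, assuming whitespace-separated columns as in the example above; the `#` comment handling and the function name `parse_group_map` are additions for illustration, not an existing Lanscatter feature:

```python
import re

_MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def parse_group_map(text):
    """Parse 'MAC  groupname' lines into a {mac: groupname} dict."""
    groups = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.split("#", 1)[0].strip()  # assumed: '#' starts a comment
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 'MAC groupname'")
        mac, group = fields[0].lower(), fields[1]
        if not _MAC_RE.match(mac):
            raise ValueError(f"line {lineno}: bad MAC address {fields[0]!r}")
        groups[mac] = group
    return groups
```

The master node could then look up each peer's MAC in the resulting dict to find its group.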

elonen commented 4 years ago

Simple switch groups may not be enough in all cases. To properly avoid congestion, the master should in fact build an acyclic graph of the Ethernet switch topology, assign a maximum capacity to each link, and refrain from planning new downloads when the path between the uploading and downloading peers would be overloaded.
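
To illustrate the idea (this is not how the planner actually does it), here is a sketch that assumes the topology is a simple tree, with every node having one uplink to its parent. The class name `LinkBudget` and its methods are hypothetical:

```python
class LinkBudget:
    """Track bandwidth reserved on each uplink of a tree-shaped network."""

    def __init__(self, parent, capacity):
        self.parent = parent      # node -> parent node (root maps to None)
        self.capacity = capacity  # node -> capacity of the link to its parent
        self.used = {node: 0 for node in capacity}

    def _ancestors(self, node):
        chain = [node]
        while self.parent.get(chain[-1]) is not None:
            chain.append(self.parent[chain[-1]])
        return chain

    def path_links(self, a, b):
        """Links (named by their child end) on the path between a and b."""
        up_a, up_b = self._ancestors(a), self._ancestors(b)
        common = set(up_a) & set(up_b)
        if not common:
            raise ValueError("nodes are not in the same tree")
        links = []
        for chain in (up_a, up_b):
            for node in chain:
                if node in common:  # reached the lowest common ancestor
                    break
                links.append(node)
        return links

    def try_reserve(self, a, b, rate):
        """Reserve `rate` on every link between a and b, or refuse entirely."""
        links = self.path_links(a, b)
        if any(self.used[l] + rate > self.capacity[l] for l in links):
            return False
        for l in links:
            self.used[l] += rate
        return True

    def release(self, a, b, rate):
        """Give back bandwidth reserved earlier with try_reserve()."""
        for l in self.path_links(a, b):
            self.used[l] -= rate
```

A planner would call `try_reserve()` before scheduling a transfer, skip that peer pair if it returns `False`, and call `release()` when the transfer finishes.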

elonen commented 4 years ago

Commit e1651ac now makes the planner take network link bandwidth limits (if provided) into account, and also implements master-driven bandwidth limits in the peer's chunk download code.

To make Ethernet switch group optimization fully functional, Lanscatter still needs a way to feed the network topology to the master node (i.e. a user-configurable planner.LinkMapper class needs to be implemented in masternode.py in place of the current dummy one).

Repository https://github.com/elonen/switch-mapper contains a tool that automatically extracts necessary information from LLDP/SNMP-enabled switches, but the output needs to be processed/simplified before feeding to master node, and the topology data should be manually configurable as well.