lucktu closed this issue 2 years ago
A shot in the dark: are the supernodes' clocks somewhat in sync?
The reason for this is that edge1 exists only in sn2's management_port output and not in sn1's, while edge2/3/4 exist in sn1's management_port output and not in sn2's.
I think there’s some logic here that needs to be properly sorted out.
Discuss
When edges begin trying to connect, I think they should always use sn1 as their center unless sn1 has a problem; only then should they turn to the federation for help (for example, for greater bandwidth).
When B joins A's federation, A can see B (and its communities and edges, and can take advantage of its resources), but B can only see A as pending. Only when A joins B's federation as well does B get full authority. At present, if A leaks its federation_name (or uses the default), A may end up out of control (bandwidth allocation, too much useless information). Alternatively: hide parts of the federation_name, and do not show, much less contact, supernodes other than A and B that happen to share the same federation name.
Do not display the IPs and ports of other supernodes in the edge's management_port output.
The reason for this is that edge1 exists only in sn2's management_port output and not in sn1's, while edge2/3/4 exist in sn1's management_port output and not in sn2's.
With a view to the management port output, this is expected behavior. Edges shall only connect to one supernode at a time.
The supernodes know about each other's edges anyway and forward packets to the corresponding supernode (#640) in case p2p is not working.
Concerning the federation name, it actually is the private federation key. So, it needs to be handled confidentially, as key material always deserves.
We assume the management port, being available locally only, is accessible to legit users only. As the management port is not password-protected, one would need to disable the port in the code if you fear leaking any information through this channel. Alternatively, we could just remove the federation name from the output.
Do not display the IPs and ports of other supernodes in the edge's management_port output.
How else shall an administrator identify the other supernodes and check the federation for legit connections?
When edges begin trying to connect, I think they should always use sn1 as their center unless sn1 has a problem; only then should they turn to the federation for help (for example, for greater bandwidth)
The federation feature was developed aiming at load balancing. However, an alternative is already available.
The selection criterion can easily be adapted to your needs, and further schemes can be added as required. The supernode selection code is contained in just one place (src/sn_selection.c). Imagine you wanted one fixed supernode with the others acting in hot-standby mode: it would only require using the supernode's MAC address as the selection criterion, i.e. connecting to the supernode with the lowest MAC address. If that supernode were no longer available, edges would connect to the one with the next-higher MAC address. Any volunteers?
1. Alternatively, we just remove the federation name from output.
2. How else shall an administrator identify the other supernodes and check the federation for legit connections?
You are right, I thought we could just show parts.
The federation feature was developed aiming at load balancing.
I think it’s an important backup supernode too.
Consider this: after merely adding B to A's federation while using the default federation_name, I can now see "supernode.ntop.org:7777" in the sn's management_port output.
With a view to the management port output, this is expected behavior. Edges shall only connect to one supernode at a time.
So what’s the solution to my problem?
I think it’s an important backup supernode too.
Actually, it serves as a backup at the same time, because load balancing also means that if one supernode is no longer available, the edges look for another one.
So what’s the solution to my problem?
Not sure; I would expect your scenario to work easily. If you shared your logs with markers at the specific events, plus your exact command lines and IP / MAC addresses, maybe someone could volunteer to dig deeper into it.
Did you check the system clocks at edges and supernodes?
I think it’s better if you experiment, and also look at the output of edge1, sn1 and sn2.
edge3 -d xrg -a 172.16.0.5 -c xtdg -k fhfghfg -l n2n.lucktu.com:10090 -t 31570 -Efr -e auto #
I can offer you a sn1: n2n.lucktu.com:10090
Did you check the system clocks at edges and supernodes?
They’re identical on pc1 / pc2 / pc5. I hope it’s okay if they differ. I don't use -H.
Did you check the system clocks at edges and supernodes?
They’re identical on pc1 / pc2 / pc5. I hope it’s okay if they differ.
Supernodes communicate with header encryption enabled, so they should be somewhat in sync, +/- 16 seconds. If your edges also use header encryption (-H), they should also be in sync with the other edges and all supernodes within the same range.
I think it’s better if you experiment, ...
I can offer you a sn1: n2n.lucktu.com:10090
I am sorry that I currently am not able to offer this level of support.
... and also look at the output of edge1, sn1 and sn2.
Do you want to share those? Someone might jump into it.
... and also look at the output of edge1, sn1 and sn2
One more thought: if you inspect the management port output of edges and supernodes, please make sure that edges / supernodes do not connect using local addresses (which could happen due to name resolution configuration) but the public ones. You can ensure this by providing all -l ... options with public IP addresses, at the edges as well as at the supernodes, at least for testing.
I didn’t plan for you to jump in. ^_^
Yes, all -l ... use public IP addresses.
Might it have something to do with the fact that sn3 was already running on pc2, even though it had nothing to do with sn1 & sn2?
Let’s put this question aside; you’d better test it yourself. The main problem is that edge1 (the initiator) easily becomes someone else’s child and does not easily come back.
I’m more interested in #839 now!
I think that if DHT technology is introduced in the future, the "federation" is only a transitional feature, and it is not worth putting excessive effort into it.
the "federation" is only a transitional function
Indeed. Moving further towards p2p does not need the federation concept anymore... well, it will look different and will not be limited to the supernode side. We might need a master key signing every peer's key (consider it private, just like today's federation name) and a public key to verify, just like the current -P.
We obviously are on the same sheet of paper... :wink:
Given the extremely long list of (random) supernodes seen on your screenshot, I need to ask if all supernodes (and edges) are from the same and current dev code.
I need to ask if all supernodes (and edges) are from the same and current dev code.
I don’t recognize any of them except the first supernode (223.133.68.170:10090); this is from sn2's management_port output. Those supernodes must be different versions from all over the world.
those supernodes must be different versions from all over the world
With a view to their IP addresses like 0.0.0.0, 8.0.0.0, or 0.0.1.0, it is very unlikely that these are actual supernodes. I still assume that the supernodes are not built from the same source version and thus are incompatible, hence this erroneous behavior. Can you check the supernode versions?
Can you check the supernode versions?
v.2.9.0.r999.b735ad6.
Let’s put this question aside please.
Is this still an issue?
I do not have the conditions to test at present, so I will shut it down for now; if the issue still exists in the future, I’ll come back to it.
Currently, edge1 loses contact with edge2/3/4 unless edge1 restarts; if we kill sn2, edge1 can contact edge2/3/4, but when we restart sn2, contact is lost again.