I don't know much about the channel stuff, but assuming that channel discovery is done by sending `RemoteSelectPayload` messages in the `GigaChannelCommunity`, we may want to think about when we are sending these messages. Right now we only query peers after we initiate the contact, not if somebody contacts us. This may slow down the channel discovery process. So, maybe we should also call `send_remote_select_subscribed_channels` from `introduction_request_callback`.
Disclaimer: normally, you really do not want to hit maximum throughput. Filled UDP buffers lead to packet drops and broken overlay message flows. If we are bootstrapping our channels, though, we don't really care about this packet drop. However, do note that you really want this filling of your buffers to stop after a while, so normal control messaging and nicely load-balanced communication can resume.
If you really want the maximum throughput, you have to balance the latency of (1) adding new peers for varied information against (2) actually receiving channels from the community for information volume. I.e., once you have a decently varied group of overlay neighbors, don't spend your valuable time and bandwidth on NAT-puncturing and keeping alive a new peer; spend it on pulling in channels. It is best to get connected to a small group of other peers and have them send their knowledge to us at a high rate. This translates to a pull-based mechanism, using a "please send me everything you have for the next X seconds"-message.
Note that IPv8 already takes care of introducing you to a decently sized pool of peers to communicate with by default (disclaimer #2: what a "decent size" is depends on the application, but we know from years of gossip research papers that more than 60 neighbors gives you almost no returns and 15 neighbors is about the start of the range you want to be in).
As mentioned by @egbertbouman, the current implementation asks any newly responding peer, once, for a single channel this peer is subscribed to:
I agree with @egbertbouman and I suggest using the aforementioned "please send me everything you have for the next X seconds"-message upon connection to a new peer, instead of this very polite "please share one of your subscribed channels with me"-message.
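For concreteness, here is a minimal sketch of what such a pull message could look like as an IPv8 payload (the class name, message identifier and field are all hypothetical, not existing Tribler code):

```python
from ipv8.messaging.lazy_payload import VariablePayload, vp_compile


@vp_compile
class PullChannelsRequestPayload(VariablePayload):
    """Hypothetical "please send me everything you have for the next X seconds" message."""
    msg_id = 42              # assumed to be a free message identifier in the community
    format_list = ['I']      # duration of the pull window, in whole seconds
    names = ['duration']
```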
@qstokkink I assumed the `num_entries` thing is there to prevent empty channels from being gossiped, which seems like a good idea. Database results, however, seem to be sent in 1 message/result. I think that makes sense as it just makes things easier, and it shouldn't affect channel discovery :shrug:
@egbertbouman I stand corrected on that then and I also agree that one result per message should be fine, as long as you send more messages if you want more results (or, as I suggested above, a single message that leads to multiple answers).
I do want to stress that the current implementation is very good about its bandwidth and load balancing, and has very readable and maintainable code. It's just not... "mean" enough.
"please send me everything you have for the next X seconds"-message. If you really want the maximum throughput
Not maximum, just 25 incoming UDP messages per second would be ideal. I see too many security risks with, say, 4 MByte/second of incoming traffic (3000 packets per second!). Not only is congestion control not done yet (packet loss, buffer fill), we also have no trust function yet. Very helpful discussion.
> Database results, however, seem to be sent in 1 message/result.
Perfect! Each new peer can only give us 1 entry. No compromise on security/spam. If we can talk to 25 new people per second and ask them for a random subscribed channel, we're done! Can we reply to incoming new strangers as well (or do we do that already...)?
My preference is to re-use the regular gossip mechanism. I don't like to use the generic remote query mechanism (that API is great for debugging, obviously). Can we re-use that existing code?
> If we can talk to 25 new people per second and ask them for a random subscribed channel, we're done!
Well, I mean, we can. But we probably shouldn't: see https://ieeexplore.ieee.org/document/6934311. Based on both past experience and science from our lab, I would suggest depending more on information propagation than on peer churn.
I just noticed that the gossip cache logic mentioned by @synctext is triggered by `send_random_to`, which in turn gets called from `SyncChannels`. However, `SyncChannels` is not loaded by the `IPv8CommunityLauncher`. This strategy was disabled in https://github.com/Tribler/tribler/commit/b2bb9d636c3ad6925c8c6dc0f16ab939727a2994. @ichorid Is this expected?
> Database results, however, seem to be sent in 1 message/result.
Correction: it will send you all the results of the remote query, broken down into as many (lz4-compressed) UDP packets as necessary. There is a hard limit on the sender side to prevent abuse: never send more than 50 entries.
For example, if a remote query results in N entries (channels) to be sent, these N entries will be sent as 1-N packets, depending on how many of these fit into a single packet. The sender side will proceed to pack the response entries into a single packet until it fills up to capacity, then it will send the packet and proceed to pack the rest of the unsent entries into the next packet, etc.
So, it is pretty efficient as it is.
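To illustrate the packing scheme described above, here is a toy sketch (not Tribler's actual serializer; the byte budget is an assumption):

```python
UDP_PAYLOAD_BUDGET = 1300  # assumed safe UDP payload size in bytes


def pack_entries(serialized_entries, budget=UDP_PAYLOAD_BUDGET):
    """Greedily pack already-serialized entries into as few packets as possible."""
    packets, current, used = [], [], 0
    for entry in serialized_entries:
        if current and used + len(entry) > budget:
            packets.append(b"".join(current))  # current packet is full, ship it
            current, used = [], 0
        current.append(entry)
        used += len(entry)
    if current:
        packets.append(b"".join(current))
    return packets
```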
> This strategy was disabled in b2bb9d6. @ichorid Is this expected?
Oops...
Actually, the refactored GigaChannelCommunity is now based on `RemoteQueryCommunity` and should use the `RemovePeers` strategy to constantly walk and query new peers. It uses it, according to the modules catalog. Also, I checked it by hand: the strategy is indeed used.
> My preference is to re-use the regular gossip mechanism. I don't like to use the generic remote query mechanism (that API is great for debugging, obviously). Can we re-use that existing code?
Of course, but I would say we should make it pull-based instead, so the querying side can control the rate.
Also, the previous (push-based) logic packed preview content into the same packet as the gossiped channel. That's inefficient. RemoteQuery features the query-back subsystem which can query as much preview content as needed, up to any channel depth. If you're concerned with efficiency, the better way to solve it is to just cache the query results on the sender side.
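A minimal sketch of what such sender-side caching could look like (a hypothetical helper, not existing Tribler code):

```python
import time


class QueryResultCache:
    """Reuse the serialized response for identical remote queries within a short window."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._cache = {}  # query key -> (timestamp, serialized response)

    def get_or_compute(self, key, compute):
        now = time.time()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh, skip recomputing the response
        value = compute()
        self._cache[key] = (now, value)
        return value
```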
@ichorid in that case, it seems to me like all the technical infrastructure is in place and we just need to tweak some parameters to be more aggressive in order to fulfill @synctext's wishes. Is that correct?
> @ichorid in that case, it seems to me like all the technical infrastructure is in place and we just need to tweak some parameters to be more aggressive in order to fulfill @synctext's wishes. Is that correct?
Yes. However, I do not know IPv8 well enough to even imagine what kind of tweaks should be done to e.g. increase the walk speed 10 times.
That one is easy, divide this value by 10 (from 0.5 -> 0.05):
If you make the value too low, Tribler will detect main thread congestion and auto-scale back anyway.
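Assuming you drive IPv8 through py-ipv8's standard configuration dictionary (which exposes the global walk interval as `walker_interval`), the tweak would look roughly like this:

```python
from ipv8.configuration import get_default_configuration

config = get_default_configuration()
config['walker_interval'] = 0.05  # default is 0.5; walk ten times as often
```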
> That one is easy, divide this value by 10 (from 0.5 -> 0.05):
> If you make the value too low, Tribler will detect main thread congestion and auto-scale back anyway.
Is it possible to set `walk_interval` for a single Community?
IPv8 was not made to have varying walk intervals per Community (it smooths out network traffic using the fixed value - which is really cool, I do recommend checking out your network traffic in the system monitor 😃).
Perhaps it is indeed better to keep your own looping TaskManager task in your particular Community. This gives you the most freedom to control the intervals.
That said, you could also set the `walk_interval` to an ultra-low value. I'll leave this memo here in case you opt for that solution:
> Perhaps it is indeed better to keep your own looping TaskManager task in your particular Community. This gives you the most freedom to control the intervals.
So, I should just create a task to trigger the walks by myself, right?
Yes, something like this (if you want to get really fancy you could even modulate your own interval on the fly - I won't show that here):
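A minimal sketch of such a looping task (not the exact snippet that was linked here; the community, task name and interval are placeholders):

```python
from ipv8.community import Community


class MyCommunity(Community):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Run pull_channels() every 0.5 seconds for as long as this community lives.
        self.register_task("pull channels", self.pull_channels, interval=0.5)

    def pull_channels(self):
        pass  # Fire off the "pull" messages from here (see the next sentence).
```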
And then, in the task itself, fire off the "pull" messages to (random?) peers from `self.get_peers()`.
> And then, in the task itself, fire off the "pull" messages to (random?) peers from self.get_peers().
But that is not a "walk", right? I.e., this will not speed up the discovery of new peers.
Because now we query each introduced peer on introduction. So, there is no reason to query them again. Instead, we should get more introductions.
All the `IPv8` object does, essentially, is call `DiscoveryStrategy.take_step()` periodically. If you take this into your own control you can call `RandomWalk(my_community).take_step()` and `RemovePeers(my_community).take_step()` to your heart's content in your own periodic task - you don't even have to register these two `DiscoveryStrategy` instances with IPv8.
Disclaimer. I'd still prefer depending on information propagation instead. Forcing 50000 users to hammer the bootstrap nodes is one of the few ways you can actually take our entire network down. If we do switch to "open" bootstrap nodes, this is also a good way to get IP banned.
Connect using IPv8 to 200 unique peers and obtain from each of them 1 random subscribed channel (non-empty). Do this at a rate of roughly 20 UDP messages with signed votes per second. No invasive change. Would fill up the screen fast and be a safe incremental step forward for Tribler.
> Instead, we should get more introductions.
Yes :+1:
> Yes. However, I do not know IPv8 well enough to even imagine what kind of tweaks should be done to e.g. increase the walk speed 10 times.
This was very helpful to clarify what I believe users will like.
> Connect using IPv8 to 200 unique peers
Bump up this value from 30 to 200.
```python
@walk_strategy(random_walk, target_peers=30)
```
Can this be changed dynamically? So first, for 20 seconds, use `target_peers=200`, then throttle back to `target_peers=30`.
> Can this be changed dynamically? So first, for 20 seconds, use `target_peers=200`, then throttle back to `target_peers=30`.
Yes, the script becomes about 3 lines though. I'll help with that when the time comes.
Let's only speed up the discovery process in 1 community for 10 seconds. How would you advise us to bring the 500ms `take_step()` delay down to 50ms to reach `target_peers=200` faster (avoiding damage to bootstrap nodes)?
Here's the complete example (given https://github.com/Tribler/tribler/issues/5828#issuecomment-743213137 and excluding imports):
```python
class TheCommunityYouWishToAddThisTo(Community):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Initialize some stuff
        # Example starts here:
        self.start_speed_walk = time.time()
        self.walker = RandomWalk(self, timeout=10.0, window_size=200)
        self.register_task("speed walking", self.speed_walk, interval=0.05)

    def speed_walk(self):
        self.walker.take_step()
        if time.time() - self.start_speed_walk > 10.0:  # Stop after 10 seconds.
            self.cancel_pending_task("speed walking")
```
> Forcing 50000 users to hammer the bootstrap nodes is one of the few ways you can actually take our entire network down.
I always thought a community only gets the first introduction from the bootstrap nodes, and the rest of the addresses/punctures are introduced directly by other peers?
> I always thought a community only gets the first introduction from the bootstrap nodes, and the rest of the addresses/punctures are introduced directly by other peers?
No, that causes severe network partitioning/clustering. For example, in practice we experienced situations where 4 local nodes would form 2 groups of 2 (causing silly stuff). You need to periodically bootstrap to remain connected to a single cluster (aka a robust network topology).
Regarding contacting bootstrap nodes, there are two extremes you don't want to be in: never contacting them again after the first introduction (which leads to the partitioning described above) and contacting them constantly (which hammers them, as mentioned earlier).
Well, we have a special case for this "fast walk" scenario: we don't care about connectivity too much. E.g. can we disable bootstrap-node querying when calling `take_step` manually?
Yes, in that case you can do something like this in the `speed_walk()` method:
```python
if len(self.get_peers()) > 20:  # Stop walking to bootstrap nodes after you have 20 peers.
    self.walker.reset_chance = 0
```
I tried this; it kinda works... Next, we must fine-tune the parameters of the walking process. Also, we need to decide if we want to send all the channels, not just the subscribed ones. Now, this seems to look suspiciously like the PopularityCommunity problem... @xoriole, @kozlovsky, @drew2a, may I kindly ask you guys to take over this task? You already have an infrastructure for working with this kind of experiment. Also, this one is easier than the PopCom, as the dataset is so much smaller and we're in full control of the DB.
(P.S. There is still a bug with the GUI not showing updates sometimes, and a problem with dynamically sorting stuff; I will solve these, of course.)
Quick update:
I ran an experiment with the suggested tweaks; these are the raw results.
The experiment lasted one minute. Two runs are presented below.
In both runs two values were measured:
- `len(self.get_peers())`
- the number of `introduction_response_callback` calls

Tweaks implemented:
```python
class ChannelDiscoveryBoosterMixin(RemoteQueryCommunity):
    TIMEOUT_IN_SEC = 60.0
    MAX_PEERS = 200

    # Stop walking to bootstrap nodes after you have this number of peers.
    RETURN_TO_BOOTSTRAP_THRESHOLD = 20

    TAKE_STEP_INTERVAL_IN_SEC = 0.05

    def __init__(self, *args, **kwargs):
        super(ChannelDiscoveryBoosterMixin, self).__init__(*args, **kwargs)
        self.logger.info(f'Init. Timeout: {ChannelDiscoveryBoosterMixin.TIMEOUT_IN_SEC}s, '
                         f'Max peers: {ChannelDiscoveryBoosterMixin.MAX_PEERS}, '
                         f'Take step interval: {ChannelDiscoveryBoosterMixin.TAKE_STEP_INTERVAL_IN_SEC}s')

        self.walker = RandomWalk(self, timeout=ChannelDiscoveryBoosterMixin.TIMEOUT_IN_SEC,
                                 window_size=ChannelDiscoveryBoosterMixin.MAX_PEERS)

        self.register_task('take step', self.speed_walk,
                           interval=ChannelDiscoveryBoosterMixin.TAKE_STEP_INTERVAL_IN_SEC)

    def speed_walk(self):
        self.logger.info('Take a step')
        self.walker.take_step()

        # Stop walking to bootstrap nodes after you have N peers.
        if len(self.get_peers()) > ChannelDiscoveryBoosterMixin.RETURN_TO_BOOTSTRAP_THRESHOLD:
            self.logger.info('Stop returns to bootstrap')
            self.walker.reset_chance = 0
```
Quick update.
The data from the experiment with almost all of the requested changes implemented.
Experiment duration: 1 min. Boosted period: 30 sec.
The remote query used:
```python
self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
                        attribute_ranges=(("num_entries", 1, None),), last=1)
```
The count query used:
```python
channels_count = count(ts for ts in self.mds.ChannelNode if ts.metadata_type == CHANNEL_TORRENT)
```
> self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
> attribute_ranges=(("num_entries", 1, None),), last=1)
Well, you're missing the opportunity to put more than one channel entry into a single UDP packet. Typically, a single channel entry takes about 250 bytes. So, if you change it to `last=3`, it is practically guaranteed that the response packet will contain at least 3 channel entries. In general, if all the queried items won't fit into a single UDP packet, they will be split into as many packets as necessary (packing as many entries into each packet as possible).
Also, keep in mind that more queries = more load on the network. So, I don't see any reason to include `last=1` at all. If you omit the `last` argument, the query will try to send every subscribed channel, up to a safety limit on the responder side (that's about 50 entries IIRC).
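In other words, starting from the query posted above, the suggestion boils down to simply dropping the `last` argument (a sketch based on the same call):

```python
self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
                        attribute_ranges=(("num_entries", 1, None),))
```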
@ichorid thank you for the comment, I'll keep it in mind.
The reason why I publish charts and the PR is to get feedback as early as possible.
Why did I use `last=1`? Because it gives us a simple picture to discuss, not because it is the most efficient way to obtain more channels.
@ichorid by the way, setting `last` to `last=3` doesn't really affect the overall count of received channels.
My guess is that the ordinary user hasn't subscribed to any channels at all.
@drew2a could you include `subscribed=False` with different values of `last` in your benchmark?
@qstokkink yes, sure.
I have a question here. From run to run I see that the "peers count" gets stuck in the range 30-60. Do you know why? My expectation: this count should be higher.
It's because the RandomWalk strategy picks a random peer from your known peers to get an introduction from. Basically, the more peers you get, the higher the chance you get a peer you already knew. Essentially, you need to do an exponential amount of work for every new peer. You could experiment with the EdgeWalk strategy to overcome this.
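A toy model of that effect (an assumption for illustration only, not how IPv8 actually introduces peers: here every introduction is a uniformly random peer out of a 1000-node network):

```python
import random

NETWORK_SIZE = 1000
known = set()
for step in range(1, 5001):
    known.add(random.randrange(NETWORK_SIZE))  # the introduced peer may be a duplicate
    if step in (50, 500, 5000):
        print(f"after {step} introductions: {len(known)} known peers")
```

The number of known peers grows quickly at first and then flattens out, which is the diminishing-returns behaviour described above.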
@qstokkink thank you, that was my main assumption... what arguments do you suggest using for EdgeWalk?
`edge_length`, `neighborhood_size`, `edge_timeout`
Benchmark update. Experiment duration: 60s. Boost period: 30s. Last: 50.
I've never actually tried to prioritize getting a substantial amount of peers as quickly as possible. Intuitively, I'd start with a neighborhood size of 30 and an edge length of 10, edge timeout can probably stay at the default. You'd have to experiment a bit to find the optimal values.
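Under those suggestions, the booster from the earlier comment could swap its `RandomWalk` for an `EdgeWalk` roughly like this (assuming `EdgeWalk` lives in `ipv8.peerdiscovery.discovery` and accepts these keyword arguments; `edge_timeout` is left at its default):

```python
from ipv8.peerdiscovery.discovery import EdgeWalk


class EdgeWalkBoosterMixin(RemoteQueryCommunity):
    """Variant of the ChannelDiscoveryBoosterMixin above, using EdgeWalk instead of RandomWalk."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.walker = EdgeWalk(self, neighborhood_size=30, edge_length=10)
        self.register_task('take step', self.speed_walk, interval=0.05)

    def speed_walk(self):
        self.walker.take_step()
```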
Also, regarding the recent result, omitting `subscribed` leading to many more results is quite surprising to me. What in the world is going on there? Was that a lucky run? Is the remote select broken? I would expect the omission to be roughly equal to `subscribed=False` + `subscribed=True`.
> Also, regarding the recent result, omitting subscribed leading to many more results is quite surprising to me.
Yes, it is strange.
Of course, I could have made a mistake in the query. To minimize the chance of this happening, I published the query in this issue and opened a PR (in hopes that it would be seen).
@ichorid could you verify the query?
Solid progress! Great measure & exploration work. Assuming these 60-second measurements are with the live network? For bootstrapping we want to fill the screen and amaze the user. Feel free to leave out subscribe-filtering for bootstrap.
> Assuming these 60-second measurements are with the live network?
Yes
> @ichorid could you verify the query?
Congratulations, sir, you found a bug! :tada: :bug: :tada: The bug is: https://github.com/Tribler/tribler/blob/b6774caf569ef71dfa078b0355e63aea5f69ea32/src/tribler-core/tribler_core/modules/metadata_store/orm_bindings/metadata_node.py#L109
You guessed it: it should be `lambda g: g.subscribed == subscribed` instead.
The result of the buggy current version is:
- if `subscribed` is `None`, it will query for everything
- if `subscribed` is `True`, it will query for subscribed channels
- if `subscribed` is `False`, it will still query for subscribed channels

Also, be warned that if you're going to get all the channels (and not just the subscribed ones), you lose the advantage of filtering junk. Consequently, you lose the information necessary to build the beautiful VSIDS-based channels rating.
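A sketch of the corrected condition (the surrounding helper is a hypothetical stand-in for the query-building code in metadata_node.py; only the lambda is the point):

```python
def apply_subscribed_filter(pony_query, subscribed):
    """Apply the subscribed filter only when it is explicitly requested."""
    if subscribed is not None:
        # Buggy version: lambda g: g.subscribed  (always selects subscribed channels).
        # Fixed version: compare against the requested value, so False works too.
        pony_query = pony_query.where(lambda g: g.subscribed == subscribed)
    return pony_query
```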
@ichorid I understood everything, except:
> you lose the information necessary to build the beautiful VSIDS-based channels rating
> you lose the information necessary to build the beautiful VSIDS-based channels rating
When you query other peers for subscribed channels, there is a callback on receiving new entries that bumps (adds votes to) the received channels:
In a sense, the voting system uses pull-based gossip as a "cue" about what is popular. There is no explicit "peer X voted for channel Y" message.
If you are going to query remote peers for non-subscribed channels, you have to disable this callback. Otherwise, all of the received channel entries will be bumped (i.e. marked as voted for by the peer who sent you these). This will break the distributed voting algorithm:
If there is no difference between "subscribed" and "not subscribed" channels, there is no difference between "voted" and "non-voted" channels. In that case, every channel's rating will eventually become (almost) the same.
To circumvent this problem without changing the packet format, introducing a different community, or doing Bloom filters or whatever, the simplest thing is to simultaneously query the same host both for all channels (without the vote-bumping callback) and for subscribed channels only (with the vote-bumping callback).
This will still result in the propagation of junk, but at least it will keep the voting system working, which is the only thing that saves the user from being completely overwhelmed by junk.
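A sketch of that dual query, reusing the `send_remote_select()` call from earlier in this thread (the `processing_callback` keyword used here to switch off vote bumping is an assumption about the RemoteQueryCommunity API, not a confirmed parameter):

```python
def query_channels_from(self, peer):
    # Subscribed channels only: keep the default handling, so votes get bumped.
    self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
                            attribute_ranges=(("num_entries", 1, None),))
    # All channels: process the results without bumping any votes.
    self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT,
                            attribute_ranges=(("num_entries", 1, None),),
                            processing_callback=lambda request, results: None)
```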
@ichorid thank you for the explanation. It is clear now ✊.
@qstokkink regarding EdgeWalk...
I ran an experiment today in which I tried to estimate the `neighborhood_size` and `edge_length` that maximize the peer count over a 30s interval.
Pseudocode of the experiment is the following:
```python
for neighborhood_size in [1, 25, 50, 75, 100]:
    for edge_length in [1, 12, 25, 37, 50]:
        for repeat in range(5):
            calculate_peers_count(neighborhood_size, edge_length, timeout=30)
```
Surprisingly, the best (average) values are:
The raw data in CSV format: edge_walk.csv.zip
And this is a visualisation (but, unfortunately, it's not really obvious)
@drew2a Nice graph 👍
From your results it seems 25/25 is the most balanced choice. We want the neighborhood size as small as possible to make sure we don't depend too much on the bootstrapping mechanism (to hook in external bootstrap nodes we need to cure our performant bootstrap node addiction 😃).
Goal: accelerate the genesis discovery of channels.
The experience we give to our user is that when Tribler starts for the first time, your screen is just bombarded with newly discovered channels. It goes so fast that you are barely able to read all the information Tribler is pulling off The Internet. After some seconds the most popular channels start to float to the top. Re-sorting based on popularity is done every second or so. Within 30 seconds the most popular channels on Tribler are stable and fill the page. The user is able to scroll through all these new channels.
Out of scope: fast pre-preview when clicking on a new channel. Also out of scope: fast start of the channel download after subscribing.
By walking multiple times within IPv8 we can attempt to discover 25 new channels per second for the first 30 seconds of Tribler startup. The introduction of the first peer usually takes 600ms; after that we aim to discover channel entries at full speed. https://jenkins-ci.tribler.org/job/ipv8/job/performance_peer_discovery/plot/