I don't know much about the channel stuff, but assuming that channel discovery is done by sending `RemoteSelectPayload` messages in the `GigaChannelCommunity`, we may want to think about when we are sending these messages. Right now we only query peers after we initiate the contact, not if somebody contacts us. This may slow down the channel discovery process. So, maybe we should also call `send_remote_select_subscribed_channels` from `introduction_request_callback`.
Disclaimer: normally, you really do not want to hit maximum throughput. Filled UDP buffers lead to packet drops and broken overlay message flows. If we are bootstrapping our channels, though, we don't really care about this packet drop. However, do note that you really want this filling of your buffers to stop after a while, so normal control messaging and nicely load-balanced communication can resume.
If you really want the maximum throughput, you have to balance the latency of (1) adding new peers for varied information against (2) actually receiving channels from the community for information volume. I.e., once you have a decently varied group of overlay neighbors, don't spend your valuable time and bandwidth on NAT-puncturing and keeping alive a new peer; spend it on pulling in channels. It is best to get connected to a small group of other peers and have them send their knowledge to us at a high rate. This translates to a pull-based mechanism, using a "please send me everything you have for the next X seconds"-message.
Note that IPv8 already takes care of introducing you to a decently sized pool of peers to communicate with by default (disclaimer #2: what a "decent size" is depends on the application, but we know from years of gossip research papers that more than 60 neighbors gives you almost no returns and 15 neighbors is about the start of the range you want to be in).
As mentioned by @egbertbouman, the current implementation asks any newly responding peer, once, for a single channel this peer is subscribed to:
I agree with @egbertbouman and I suggest using the aforementioned "please send me everything you have for the next X seconds"-message upon connection to a new peer, instead of this very polite "please share one of your subscribed channels with me"-message.
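For concreteness, here is a minimal sketch of what such a pull message could look like as an IPv8 payload (the class name, message identifier and field are all hypothetical, not existing Tribler code):

```python
from ipv8.messaging.lazy_payload import VariablePayload, vp_compile


@vp_compile
class PullChannelsRequestPayload(VariablePayload):
    """Hypothetical "please send me everything you have for the next X seconds" message."""
    msg_id = 42              # assumed to be a free message identifier in the community
    format_list = ['I']      # duration of the pull window, in whole seconds
    names = ['duration']
```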
@qstokkink I assumed the `num_entries` thing is there to prevent empty channels from being gossiped, which seems like a good idea. Database results, however, seem to be sent in 1 message/result. I think that makes sense as it just makes things easier, and it shouldn't affect channel discovery :shrug:
@egbertbouman I stand corrected on that then and I also agree that one result per message should be fine, as long as you send more messages if you want more results (or, as I suggested above, a single message that leads to multiple answers).
I do want to stress that the current implementation is very good about its bandwidth and load balancing, and has very readable and maintainable code. It's just not... "mean" enough.
"please send me everything you have for the next X seconds"-message. If you really want the maximum throughput
Not maximum, just 25 incoming UDP messages per second would be ideal. I see too many security risks with, say, 4 MByte/second of incoming traffic (3000 packets per second!). Not only is congestion control not done yet (packet loss, buffer fill), we also have no trust function yet. Very helpful discussion.
> Database results, however, seem to be sent in 1 message/result.
Perfect! Each new peer can only give us 1 entry. No compromise on security/spam. If we can talk to 25 new people per second and ask them for a random subscribed channel, we're done! Can we reply to incoming new strangers as well (or do we do that already...)?
My preference is to re-use the regular gossip mechanism. I don't like to use the generic remote query mechanism (that API is great for debugging, obviously). Can we re-use that existing code?
> If we can talk to 25 new people per second and ask them for a random subscribed channel, we're done!
Well, I mean, we can. But we probably shouldn't: see https://ieeexplore.ieee.org/document/6934311. Based on both past experience and science from our lab, I would suggest depending more on information propagation than on peer churn.
I just noticed that the gossip cache logic mentioned by @synctext is triggered by `send_random_to`, which in turn gets called from `SyncChannels`. However, `SyncChannels` is not loaded by the `IPv8CommunityLauncher`. This strategy was disabled in https://github.com/Tribler/tribler/commit/b2bb9d636c3ad6925c8c6dc0f16ab939727a2994. @ichorid Is this expected?
> Database results, however, seem to be sent in 1 message/result.
Correction: it will send you all the results of the remote query, broken down into as many (lz4-compressed) UDP packets as necessary. There is a hard limit on the sender side to prevent abuse: never send more than 50 entries.
For example, if a remote query results in N entries (channels) to be sent, these N entries will be sent as 1-N packets, depending on how many of these fit into a single packet. The sender side will proceed to pack the response entries into a single packet until it fills up to capacity, then it will send the packet and proceed to pack the rest of the unsent entries into the next packet, etc.
So, it is pretty efficient as it is.
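To illustrate the packing scheme described above, here is a toy sketch (not Tribler's actual serializer; the byte budget is an assumption):

```python
UDP_PAYLOAD_BUDGET = 1300  # assumed safe UDP payload size in bytes


def pack_entries(serialized_entries, budget=UDP_PAYLOAD_BUDGET):
    """Greedily pack already-serialized entries into as few packets as possible."""
    packets, current, used = [], [], 0
    for entry in serialized_entries:
        if current and used + len(entry) > budget:
            packets.append(b"".join(current))  # current packet is full, ship it
            current, used = [], 0
        current.append(entry)
        used += len(entry)
    if current:
        packets.append(b"".join(current))
    return packets
```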
> This strategy was disabled in b2bb9d6. @ichorid Is this expected?
Oops...
Actually, the refactored GigaChannelCommunity is now based on `RemoteQueryCommunity` and should use the `RemovePeers` strategy to constantly walk and query new peers. It uses it, according to the modules catalog. Also, I checked it by hand: the strategy is indeed used.
> My preference is to re-use the regular gossip mechanism. I don't like to use the generic remote query mechanism (that API is great for debugging, obviously). Can we re-use that existing code?
Of course, but I would say we should make it pull-based instead, so the querying side can control the rate.
Also, the previous (push-based) logic packed preview content into the same packet as the gossiped channel. That's inefficient. RemoteQuery features the query-back subsystem which can query as much preview content as needed, up to any channel depth. If you're concerned with efficiency, the better way to solve it is to just cache the query results on the sender side.
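A minimal sketch of what such sender-side caching could look like (a hypothetical helper, not existing Tribler code):

```python
import time


class QueryResultCache:
    """Reuse the serialized response for identical remote queries within a short window."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._cache = {}  # query key -> (timestamp, serialized response)

    def get_or_compute(self, key, compute):
        now = time.time()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh, skip recomputing the response
        value = compute()
        self._cache[key] = (now, value)
        return value
```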
@ichorid in that case, it seems to me like all the technical infrastructure is in place and we just need to tweak some parameters to be more aggressive in order to fulfill @synctext's wishes. Is that correct?
> @ichorid in that case, it seems to me like all the technical infrastructure is in place and we just need to tweak some parameters to be more aggressive in order to fulfill @synctext's wishes. Is that correct?
Yes. However, I do not know IPv8 well enough to even imagine what kind of tweaks should be done to e.g. increase the walk speed 10 times.
That one is easy, divide this value by 10 (from 0.5 -> 0.05):
If you make the value too low, Tribler will detect main thread congestion and auto-scale back anyway.
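Assuming you drive IPv8 through py-ipv8's standard configuration dictionary (which exposes the global walk interval as `walker_interval`), the tweak would look roughly like this:

```python
from ipv8.configuration import get_default_configuration

config = get_default_configuration()
config['walker_interval'] = 0.05  # default is 0.5; walk ten times as often
```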
> That one is easy, divide this value by 10 (from 0.5 -> 0.05):
> If you make the value too low, Tribler will detect main thread congestion and auto-scale back anyway.
Is it possible to set `walk_interval` for a single Community?
IPv8 was not made to have varying walk intervals per Community (it smooths out network traffic using the fixed value - which is really cool, I do recommend checking out your network traffic in the system monitor 😃).
Perhaps it is indeed better to keep your own looping TaskManager task in your particular Community. This gives you the most freedom to control the intervals.
That said, you could also set the `walk_interval` to an ultra-low value. I'll leave this memo here in case you opt for that solution:
> Perhaps it is indeed better to keep your own looping TaskManager task in your particular Community. This gives you the most freedom to control the intervals.
So, I should just create a task to trigger the walks by myself, right?
Yes, something like this (if you want to get really fancy you could even modulate your own interval on the fly - I won't show that here):
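A minimal sketch of such a looping task (not the exact snippet that was linked here; the community, task name and interval are placeholders):

```python
from ipv8.community import Community


class MyCommunity(Community):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Run pull_channels() every 0.5 seconds for as long as this community lives.
        self.register_task("pull channels", self.pull_channels, interval=0.5)

    def pull_channels(self):
        pass  # Fire off the "pull" messages from here (see the next sentence).
```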
And then, in the task itself, fire off the "pull" messages to (random?) peers from `self.get_peers()`.
> And then, in the task itself, fire off the "pull" messages to (random?) peers from self.get_peers().
But that is not a "walk", right? I.e., this will not speed up the discovery of new peers.
Because now we query each introduced peer on introduction. So, there is no reason to query them again. Instead, we should get more introductions.
All the `IPv8` object does, essentially, is call `DiscoveryStrategy.take_step()` periodically. If you take this into your own control you can call `RandomWalk(my_community).take_step()` and `RemovePeers(my_community).take_step()` to your heart's content in your own periodic task - you don't even have to register these two `DiscoveryStrategy` instances with IPv8.
Disclaimer. I'd still prefer depending on information propagation instead. Forcing 50000 users to hammer the bootstrap nodes is one of the few ways you can actually take our entire network down. If we do switch to "open" bootstrap nodes, this is also a good way to get IP banned.
Connect using IPv8 to 200 unique peers and obtain from each of them 1 random subscribed channel (non-empty). Do this at a rate of roughly 20 UDP messages with signed votes per second. No invasive change. Would fill up the screen fast and be a safe incremental step forward for Tribler.
> Instead, we should get more introductions.
Yes :+1:
> Yes. However, I do not know IPv8 well enough to even imagine what kind of tweaks should be done to e.g. increase the walk speed 10 times.
This was very helpful to clarify what I believe users will like.
> Connect using IPv8 to 200 unique peers
Bump up this value from 30 to 200.
```python
@walk_strategy(random_walk, target_peers=30)
```
Can this be changed dynamically? So first, for 20 seconds, use `target_peers=200`, then throttle back to `target_peers=30`.
> Can this be changed dynamically? So first, for 20 seconds, use `target_peers=200`, then throttle back to `target_peers=30`.
Yes, the script becomes about 3 lines though. I'll help with that when the time comes.
Let's only speed up the discovery process in 1 community for 10 seconds. How would you advise us to bring the 500ms `take_step()` delay down to 50ms to reach `target_peers=200` faster (avoiding damage to bootstrap nodes)?
Here's the complete example (given https://github.com/Tribler/tribler/issues/5828#issuecomment-743213137 and excluding imports):
```python
class TheCommunityYouWishToAddThisTo(Community):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Initialize some stuff
        # Example starts here:
        self.start_speed_walk = time.time()
        self.walker = RandomWalk(self, timeout=10.0, window_size=200)
        self.register_task("speed walking", self.speed_walk, interval=0.05)

    def speed_walk(self):
        self.walker.take_step()
        if time.time() - self.start_speed_walk > 10.0:  # Stop after 10 seconds.
            self.cancel_pending_task("speed walking")
```
> Forcing 50000 users to hammer the bootstrap nodes is one of the few ways you can actually take our entire network down.
I always thought a community only gets the first introduction from the bootstrap nodes, and the rest of the addresses/punctures are introduced directly by other peers?
> I always thought a community only gets the first introduction from the bootstrap nodes, and the rest of the addresses/punctures are introduced directly by other peers?
No, that causes severe network partitioning/clustering. For example, in practice we experienced situations where 4 local nodes would form 2 groups of 2 (causing silly stuff). You need to periodically bootstrap to remain connected to a single cluster (aka a robust network topology).
Regarding contacting bootstrap nodes, there are two extremes you don't want to be in: never contacting them again after the first introduction (which leads to the partitioning described above) and contacting them constantly (which hammers them, as mentioned earlier).
Well, we have a special case for this "fast walk" scenario: we don't care about connectivity too much. E.g. can we disable bootstrap-node querying when calling `take_step` manually?
Yes, in that case you can do something like this in the `speed_walk()` method:
```python
if len(self.get_peers()) > 20:  # Stop walking to bootstrap nodes after you have 20 peers.
    self.walker.reset_chance = 0
```
I tried this; it kinda works... Next, we must fine-tune the parameters of the walking process. Also, we need to decide if we want to send all the channels, not just the subscribed ones. Now, this seems to look suspiciously like the PopularityCommunity problem... @xoriole, @kozlovsky, @drew2a, may I kindly ask you guys to take over this task? You already have an infrastructure for working with this kind of experiment. Also, this one is easier than the PopCom, as the dataset is so much smaller and we're in full control of the DB.
(P.S. There is still a bug with the GUI not showing updates sometimes, and a problem with dynamically sorting stuff; I will solve these, of course.)
Quick update:
I ran an experiment with the suggested tweaks; these are the raw results.
The experiment lasted one minute. Two runs are presented below.
In both runs two values were measured:
- `len(self.get_peers())`
- the number of `introduction_response_callback` calls

Tweaks implemented:
```python
class ChannelDiscoveryBoosterMixin(RemoteQueryCommunity):
    TIMEOUT_IN_SEC = 60.0
    MAX_PEERS = 200

    # Stop walking to bootstrap nodes after you have this number of peers.
    RETURN_TO_BOOTSTRAP_THRESHOLD = 20

    TAKE_STEP_INTERVAL_IN_SEC = 0.05

    def __init__(self, *args, **kwargs):
        super(ChannelDiscoveryBoosterMixin, self).__init__(*args, **kwargs)
        self.logger.info(f'Init. Timeout: {ChannelDiscoveryBoosterMixin.TIMEOUT_IN_SEC}s, '
                         f'Max peers: {ChannelDiscoveryBoosterMixin.MAX_PEERS}, '
                         f'Take step interval: {ChannelDiscoveryBoosterMixin.TAKE_STEP_INTERVAL_IN_SEC}s')

        self.walker = RandomWalk(self, timeout=ChannelDiscoveryBoosterMixin.TIMEOUT_IN_SEC,
                                 window_size=ChannelDiscoveryBoosterMixin.MAX_PEERS)

        self.register_task('take step', self.speed_walk,
                           interval=ChannelDiscoveryBoosterMixin.TAKE_STEP_INTERVAL_IN_SEC)

    def speed_walk(self):
        self.logger.info('Take a step')
        self.walker.take_step()

        # Stop walking to bootstrap nodes after you have N peers.
        if len(self.get_peers()) > ChannelDiscoveryBoosterMixin.RETURN_TO_BOOTSTRAP_THRESHOLD:
            self.logger.info('Stop returns to bootstrap')
            self.walker.reset_chance = 0
```
Quick update.
The data from the experiment with almost all of the requested changes implemented.
Experiment duration: 1 min. Boosted period: 30 sec.
The remote query used:
```python
self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
                        attribute_ranges=(("num_entries", 1, None),), last=1)
```
The count query used:
```python
channels_count = count(ts for ts in self.mds.ChannelNode if ts.metadata_type == CHANNEL_TORRENT)
```
> self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
> attribute_ranges=(("num_entries", 1, None),), last=1)
Well, you're missing the opportunity to put more than one channel entry into a single UDP packet. Typically, a single channel entry takes about 250 bytes. So, if you change it to `last=3`, it is practically guaranteed that the response packet will contain at least 3 channel entries. In general, if all the queried items won't fit into a single UDP packet, they will be split into as many packets as necessary (packing as many entries into each packet as possible).
Also, keep in mind that more queries = more load on the network. So, I don't see any reason to include `last=1` at all. If you omit the `last` argument, the query will try to send every subscribed channel, up to a safety limit on the responder side (that's about 50 entries IIRC).
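In other words, starting from the query posted above, the suggestion boils down to simply dropping the `last` argument (a sketch based on the same call):

```python
self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
                        attribute_ranges=(("num_entries", 1, None),))
```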
@ichorid thank you for the comment, I'll keep it in mind.
The reason why I publish charts and the PR is to get feedback as early as possible.
Why did I use `last=1`? Because it gives us a simple picture to discuss, not because it is the most efficient way to obtain more channels.
@ichorid by the way, setting `last` to `last=3` doesn't really affect the overall count of received channels.
My guess is that the ordinary user hasn't subscribed to any channels at all.
@drew2a could you include `subscribed=False` with different values of `last` in your benchmark?
@qstokkink yes, sure.
I have a question here. From run to run I see that the "peers count" gets stuck in the range 30-60. Do you know why? My expectation: this count should be higher.
It's because the RandomWalk strategy picks a random peer from your known peers to get an introduction from. Basically, the more peers you get, the higher the chance you get a peer you already knew. Essentially, you need to do an exponential amount of work for every new peer. You could experiment with the EdgeWalk strategy to overcome this.
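A toy model of that effect (an assumption for illustration only, not how IPv8 actually introduces peers: here every introduction is a uniformly random peer out of a 1000-node network):

```python
import random

NETWORK_SIZE = 1000
known = set()
for step in range(1, 5001):
    known.add(random.randrange(NETWORK_SIZE))  # the introduced peer may be a duplicate
    if step in (50, 500, 5000):
        print(f"after {step} introductions: {len(known)} known peers")
```

The number of known peers grows quickly at first and then flattens out, which is the diminishing-returns behaviour described above.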
@qstokkink thank you, that was my main assumption... what arguments do you suggest using for EdgeWalk?
`edge_length`, `neighborhood_size`, `edge_timeout`
Benchmark update. Experiment duration: 60s. Boost period: 30s. Last: 50.
I've never actually tried to prioritize getting a substantial amount of peers as quickly as possible. Intuitively, I'd start with a neighborhood size of 30 and an edge length of 10, edge timeout can probably stay at the default. You'd have to experiment a bit to find the optimal values.
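Under those suggestions, the booster from the earlier comment could swap its `RandomWalk` for an `EdgeWalk` roughly like this (assuming `EdgeWalk` lives in `ipv8.peerdiscovery.discovery` and accepts these keyword arguments; `edge_timeout` is left at its default):

```python
from ipv8.peerdiscovery.discovery import EdgeWalk


class EdgeWalkBoosterMixin(RemoteQueryCommunity):
    """Variant of the ChannelDiscoveryBoosterMixin above, using EdgeWalk instead of RandomWalk."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.walker = EdgeWalk(self, neighborhood_size=30, edge_length=10)
        self.register_task('take step', self.speed_walk, interval=0.05)

    def speed_walk(self):
        self.walker.take_step()
```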
Also, regarding the recent result, omitting `subscribed` leading to many more results is quite surprising to me. What in the world is going on there? Was that a lucky run? Is the remote select broken? I would expect the omission to be roughly equal to `subscribed=False` + `subscribed=True`.
> Also, regarding the recent result, omitting subscribed leading to many more results is quite surprising to me.
Yes, it is strange.
Of course, I could have made a mistake in the query. To minimize the chance of this happening, I published the query in this issue and opened a PR (in hopes that it would be seen).
@ichorid could you verify the query?
Solid progress! Great measure & exploration work. Assuming these 60-second measurements are with the live network? For bootstrapping we want to fill the screen and amaze the user. Feel free to leave out subscribe-filtering for bootstrap.
> Assuming these 60-second measurements are with the live network?
Yes
> @ichorid could you verify the query?
Congratulations, sir, you found a bug! :tada: :bug: :tada: The bug is: https://github.com/Tribler/tribler/blob/b6774caf569ef71dfa078b0355e63aea5f69ea32/src/tribler-core/tribler_core/modules/metadata_store/orm_bindings/metadata_node.py#L109
You guessed it: it should be `lambda g: g.subscribed == subscribed` instead.
The result of the buggy current version is:
- if `subscribed` is `None`, it will query for everything
- if `subscribed` is `True`, it will query for subscribed channels
- if `subscribed` is `False`, it will still query for subscribed channels

Also, be warned that if you're going to get all the channels (and not just the subscribed ones), you lose the advantage of filtering junk. Consequently, you lose the information necessary to build the beautiful VSIDS-based channels rating.
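A sketch of the corrected condition (the surrounding helper is a hypothetical stand-in for the query-building code in metadata_node.py; only the lambda is the point):

```python
def apply_subscribed_filter(pony_query, subscribed):
    """Apply the subscribed filter only when it is explicitly requested."""
    if subscribed is not None:
        # Buggy version: lambda g: g.subscribed  (always selects subscribed channels).
        # Fixed version: compare against the requested value, so False works too.
        pony_query = pony_query.where(lambda g: g.subscribed == subscribed)
    return pony_query
```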
@ichorid I understood everything, except:
> you lose the information necessary to build the beautiful VSIDS-based channels rating
> you lose the information necessary to build the beautiful VSIDS-based channels rating
When you query other peers for subscribed channels, there is a callback on receiving new entries that bumps (adds votes to) the received channels:
In a sense, the voting system uses pull-based gossip as a "cue" about what is popular. There is no explicit "peer X voted for channel Y" message.
If you are going to query remote peers for non-subscribed channels, you have to disable this callback. Otherwise, all of the received channel entries will be bumped (i.e. marked as voted for by the peer who sent you these). This will break the distributed voting algorithm:
If there is no difference between "subscribed" and "not subscribed" channels, there is no difference between "voted" and "non-voted" channels. In that case, every channel's rating will eventually become (almost) the same.
To circumvent this problem without changing the packet format, introducing a different community, or doing Bloom filters or whatever, the simplest thing is to simultaneously query the same host both for all channels (without the vote-bumping callback) and for subscribed channels only (with the vote-bumping callback).
This will still result in the propagation of junk, but at least it will keep the voting system working, which is the only thing that saves the user from being completely overwhelmed by junk.
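A sketch of that dual query, reusing the `send_remote_select()` call from earlier in this thread (the `processing_callback` keyword used here to switch off vote bumping is an assumption about the RemoteQueryCommunity API, not a confirmed parameter):

```python
def query_channels_from(self, peer):
    # Subscribed channels only: keep the default handling, so votes get bumped.
    self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT, subscribed=True,
                            attribute_ranges=(("num_entries", 1, None),))
    # All channels: process the results without bumping any votes.
    self.send_remote_select(peer=peer, metadata_type=CHANNEL_TORRENT,
                            attribute_ranges=(("num_entries", 1, None),),
                            processing_callback=lambda request, results: None)
```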
@ichorid thank you for the explanation. It is clear now ✊.
@qstokkink regarding EdgeWalk...
I ran an experiment today in which I tried to estimate the `neighborhood_size` and `edge_length` that maximize the peer count over a 30s interval.
Pseudocode of the experiment is the following:
```python
for neighborhood_size in [1, 25, 50, 75, 100]:
    for edge_length in [1, 12, 25, 37, 50]:
        for repeat in range(5):
            calculate_peers_count(neighborhood_size, edge_length, timeout=30)
```
Surprisingly, the best (average) values are:
The raw data in CSV format: edge_walk.csv.zip
And this is a visualisation (but, unfortunately, it's not really obvious)
@drew2a Nice graph 👍
From your results it seems 25/25 is the most balanced choice. We want the neighborhood size as small as possible to make sure we don't depend too much on the bootstrapping mechanism (to hook in external bootstrap nodes we need to cure our performant bootstrap node addiction 😃).
Goal: accelerate the genesis discovery of channels.
The experience we give to our user is that when Tribler starts for the first time, your screen is just bombarded with newly discovered channels. It goes so fast that you are barely able to read all the information Tribler is pulling off The Internet. After some seconds the most popular channels start to float to the top. Re-sorting based on popularity is done every second or so. Within 30 seconds the most popular channels on Tribler are stable and fill the page. The user is able to scroll through all these new channels.
Out of scope: fast pre-preview when clicking on a new channel. Also out of scope: fast start of the channel download after subscribing.
By walking multiple times within IPv8 we can attempt to discover 25 new channels per second for the first 30 seconds of Tribler startup. The introduction of the first peer usually takes 600ms; after that we aim to discover channel entries at full speed. https://jenkins-ci.tribler.org/job/ipv8/job/performance_peer_discovery/plot/