Unreachable Providers (server): Provider-Record Prioritization

dennis-tra commented 1 year ago

Checklist

[X] My issue is specific & actionable.
[X] I am not suggesting a protocol enhancement.
[X] I have searched on the issue tracker for my issue.

Description

Context

Our recent measurements show that IPFS has trouble fulfilling its baseline practical use case of hosting static websites.

The reasons are at least twofold. On the one hand, large content providers still have trouble announcing all their CIDs to the DHT, and on the other hand, the existing provider records point to unreachable peers. This issue addresses the second reason.

As an example, the below graph showcases the unique providing peers as identified by distinct PeerIDs discovered throughout a specific day in the IPFS DHT for ipld.io (source).

website-trend-providers-ipldio

The graph shows that >70% of provider records point to peers that are not reachable. The remaining peers are mainly only reachable via a relaying peer. This either increases latency, if the traffic is relayed or increases time to connect, if the relay is used to facilitate a hole punch.

This trend is not unique to ipld.io but a general theme among all our measured websites: https://probelab.io/websites/.

Problem

When a peer tries to access a website over IPFS, it looks up the provider records in the DHT and tries to connect to the returned peers with a concurrency factor of 10. Due to the large number of provider records that point to peers that are long gone, the peer likely receives such records and therefore will time out in an attempt to establish a connection with them. This significantly lengthens the resolution process to the extent that the whole operation potentially times out (that's a hypothesis).

Proposal

We identified two ways forward to alleviate the above issue.

A prioritization logic of provider records on the server side. The peers that serve provider records sort them in such a way that, e.g., the first one in the list likely contains a peer that is actually reachable.
A delayed provider record publication. E.g. only announce blocks if a peer was online for some time. The assumption is that this will filter out rather short-lived peers.

This GH issue is for proposal 1).

The corresponding issue for 2) is #9984.

Provider Record Prioritization

There was a conversation on 2023-06-19 between ProbeLab and some of the Kubo maintainers where the following strategy was proposed:

We tally consecutive reprovides for each (PeerID, CID) tuple. When another peer looks up a certain CID, it gets served the peers with the highest number of consecutive reprovides. We use this number as a proxy for the uptime of a peer. Another factor we could account for is the type of Multiaddresses a peer has announced. We should prioritize un-relayed peers over relayed ones.

There are a few things to consider (non-exhaustive list):

We could overwhelm stable peers. By always serving the same stable peers as providers for a certain CID, we could put an undesirable amount of load on them. You could also argue that this leads to some kind of centralization. A potential mitigation could be to define an upper limit a counter can reach. In case of a tie, the servers could return the peers in random order.
Daily periodicity of peer presence could lead to false positive counter increases. It could happen that peers that exhibit daily periodic uptime, e.g., are booted only in the morning for a few hours and then shut down could end up as stable peers because the provider record TTL is 24h.
A nice side-effect could be that this approach decreases the risk of unintentional CID eclipsing. If a peer is constantly restarting and generating a new peer ID (but using the same provider store) it would eclipse the CID with bogus records. The proposed logic would deprioritize these provider records. Bad actors could still quite easily generate peer IDs and refresh records at the right times at a low cost.
How do network topology changes affect the counting of reprovides? If a peer is among the 20 closest to a CID at $t_0$ it is not necessarily among them at $t_0 + 22$ hours (the reprovide interval). This could decrease the effectiveness of this proposed solution.

Concrete Proposal

I think a concrete proposal fosters efficient discussion. Here's my take:

Count the number of consecutive reprovides. The counter can be 3 at max. Return the provider records in an order that prioritizes a high counter value. In case of a tie, prioritize un-relayed peers. In case there's still a tie, randomize the order with each response. A counter increase is only allowed every ReprovideInterval/2.

Measurements

TBD: How can we substantiate the proposal with numbers? some ideas

References

iand commented 1 year ago

I'm not sure that sorting/prioritizing the provider records will help that much. It would only help if the DHT server has multiple provider records for the requested CID. If it has only one then it will need to return it which is the current situation anyway. We could gather metrics on the number of provider records held by servers on average for each CID.

Even if servers respond with their "most viable" provider records the client still has to pick from the aggregated responses. The slight skew towards more viable records may not be that noticeable.

It would be better if the DHT protocol could return the age of the provider record which would let the client order by recency, the assumption being that peers that have recently reprovided the CID are more likely to be online.

guillaumemichel commented 1 year ago

IMO it would be more efficient to add a TTL field to the protobuf provide request message (non breaking protocol change).

See https://github.com/protocol/network-measurements/issues/49#issuecomment-1599099089

The 2 suggested solutions sound a bit like quick-fixes, that we may want to remove once we have the cleaner and more efficient protocol change. IDK if it's a good time investment.

yiannisbot commented 1 year ago

How do network topology changes affect the counting of reprovides? If a peer is among the 20 closest to a CID at $t_0$ it is not necessarily among them at $t_0 + 22$ hours (the reprovide interval). This could decrease the effectiveness of this proposed solution.

I think we've looked at it with the CID Hoarder study that @cortze did. Haven't searched to find it right now, but will have a look. IIRC, we found that they remain among the 20 closest.

cortze commented 1 year ago

@yiannisbot , the study from August 2022 showed a stable in-degree ratio over 80h -> link to the section in the study

dennis-tra commented 1 year ago

@cortze could you remind me again how "in-degree" is defined exactly?

cortze commented 1 year ago

could you remind me again how "in-degree" is defined exactly?

yep, it refers to the original PR Holders for a CID that are present in the set of 20 closest peers

dennis-tra commented 1 year ago

Paris discussion from 2023-07-19:

We think this should be part of the DHT codebase and plan to pick it up as part of the refactoring work. If someone wants to lend a hand with this let us know 👍

ipfs / kubo