hypercore-protocol / community

Public discussions for Hypercore Protocol

Proposal for Trackers #3

Open creationix opened 4 years ago

creationix commented 4 years ago

If I understand correctly (I've been out of the hypercore community for a bit), the new DHT is used for both NAT traversal via hole punching as well as advertising and discovering peers. Combining these two into the same protocol/service was a brilliant move!

But there are some trade-offs in the decentralized nature of a DHT that are not desirable for all use cases. In particular, I have use cases where I'd love to use the other parts of the ecosystem (P2P hosting, P2P sockets, content addressed content, etc), but need faster performance and am willing to centralize part of the system to achieve this goal.

My proposal is to add a discovery/connection protocol that's similar to the DHT, except the address of a public server is part of the address (tracker address + public key instead of just public key). The tracker will know the current location of peers with the advertised content and be able to connect me quickly.

Also, some trackers can optionally act as a cloud cache of the data itself to speed up the process even more.

This is the workflow I imagine for publishing data.

  1. Create a local hypercore
  2. Connect to a tracker with optional authentication and advertise the new hypercore.
     2a. If the tracker is also configured as a data cache, it starts syncing and storing data.
  3. Share a url that contains both the tracker's address as well as the normal public key.
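
To make step 3 concrete, here is a minimal sketch of a share link that carries both the tracker address and the public key. The `hyper+tracker` scheme name, the `TrackerLink` class, and the hex encoding of the key are all assumptions made for illustration; nothing in this thread defines a link format.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical scheme name, chosen only for this sketch.
SCHEME = "hyper+tracker"


@dataclass(frozen=True)
class TrackerLink:
    host: str          # tracker hostname
    port: int          # tracker port
    public_key: bytes  # 32-byte public key that still addresses the content

    def to_url(self) -> str:
        # The key is hex encoded in the path; the tracker is the authority part.
        return f"{SCHEME}://{self.host}:{self.port}/{self.public_key.hex()}"


def parse_link(url: str) -> TrackerLink:
    parts = urlsplit(url)
    if parts.scheme != SCHEME:
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if parts.hostname is None or parts.port is None:
        raise ValueError("link must contain tracker host and port")
    key = bytes.fromhex(parts.path.lstrip("/"))
    if len(key) != 32:
        raise ValueError("public key must be 32 bytes")
    return TrackerLink(parts.hostname, parts.port, key)
```

A reader that does not understand the tracker part could still take the key from the path and look it up on the DHT, which keeps the tracker a hint rather than a requirement.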

Then the flow for reading this data:

  1. Get the address + public key from a link or something.
  2. Connect to the tracker asking for peers
  3. The tracker connects you with the advertised peer(s).
     3a. If configured as a storage peer, it can also provide data directly.
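
The two flows above could be sketched with an in-memory stand-in for the tracker. This is only an illustration of the bookkeeping involved: `ToyTracker`, `advertise`, `lookup`, and `LookupResult` are hypothetical names, there is no networking or authentication, and the "cache" is just a list of blocks handed over at advertise time.

```python
from dataclasses import dataclass, field
from typing import Sequence

Address = tuple[str, int]  # (host, port) of an advertising peer


@dataclass
class LookupResult:
    peers: list[Address]        # peers the tracker knows for this key
    cached_blocks: list[bytes]  # data the tracker can serve itself, if any


@dataclass
class ToyTracker:
    cache_enabled: bool = False
    _peers: dict[bytes, set[Address]] = field(default_factory=dict)
    _cache: dict[bytes, list[bytes]] = field(default_factory=dict)

    def advertise(self, key: bytes, addr: Address,
                  blocks: Sequence[bytes] = ()) -> None:
        # Publishing step 2: remember where the hypercore can be found.
        self._peers.setdefault(key, set()).add(addr)
        # Step 2a: a tracker configured as a data cache also stores the data.
        if self.cache_enabled:
            self._cache.setdefault(key, []).extend(blocks)

    def lookup(self, key: bytes) -> LookupResult:
        # Reading steps 2-3: return known peers, plus cached data (3a).
        peers = sorted(self._peers.get(key, set()))
        return LookupResult(peers, list(self._cache.get(key, [])))
```

For example, after `tracker.advertise(key, ("203.0.113.7", 4001), [b"block0"])` on a cache-enabled tracker, `tracker.lookup(key)` returns both the peer address and the stored block.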

I'm not sure exactly what would need changing in the protocol to allow this, but I do think that making it a clearly defined protocol could encourage commercial offerings of such trackers. It would probably be good if the tracker protocol allowed extensions for other value providing services or blockchain integration or whatever the provider wanted to integrate.

creationix commented 4 years ago

Note that this new discovery method doesn't have to be mutually exclusive with LAN multicast discovery or DHT discovery. The content itself is still addressed using the public key. The tracker is merely a strong hint and a dedicated server to speed up the connection.

Another service the trackers could optionally provide is to relay the data when hole punching doesn't work.

The important point here is that these servers are not the source of truth; they are services to make the experience better and faster. Everything should continue working without them (if the data is also advertised on the DHT).
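
One way to picture "everything keeps working without them" is a discovery loop that asks the tracker first and falls through to the other sources. This is a rough sketch: `discover` and the idea of a source as a plain callable are assumptions for illustration, not an existing API.

```python
from typing import Callable, Iterable

Address = tuple[str, int]
# A discovery source takes a public key and yields peer addresses.
Source = Callable[[bytes], Iterable[Address]]


def discover(key: bytes, sources: list[Source]) -> list[Address]:
    """Query sources in order (e.g. tracker, LAN, DHT) and merge the results.

    A source that is unreachable is skipped, so a dead tracker only costs
    speed, not availability, as long as another source knows the key.
    """
    found: list[Address] = []
    for source in sources:
        try:
            peers = list(source(key))
        except OSError:
            continue  # tracker down or network error: try the next source
        for addr in peers:
            if addr not in found:
                found.append(addr)
    return found
```

Putting the tracker first in `sources` makes it the fast path, while the DHT remains the fallback.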

A fully configured tracker could be the equivalent of STUN + relay + caching proxy. Being a caching proxy is especially useful when the tracker is already acting as a relay, since the data is going through it anyway.

But a minimal one, or a free tier, could just replace what hyperdht does.
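
The difference between a minimal and a fully configured tracker could be expressed as a set of capability flags that a client checks before deciding how to connect. The flag names, the `MINIMAL`/`FULL` tiers, and `connection_plan` are all made up for this sketch.

```python
from enum import Flag, auto


class Capability(Flag):
    DISCOVERY = auto()   # knows which peers advertise a key
    HOLE_PUNCH = auto()  # helps two peers punch through NAT
    RELAY = auto()       # forwards traffic when hole punching fails
    CACHE = auto()       # keeps a copy of the data it has seen


# Assumed tiers: minimal mirrors what hyperdht does, full adds relay + cache.
MINIMAL = Capability.DISCOVERY | Capability.HOLE_PUNCH
FULL = MINIMAL | Capability.RELAY | Capability.CACHE


def connection_plan(caps: Capability, hole_punch_ok: bool) -> str:
    """Pick how a client would reach the data through a given tracker."""
    if hole_punch_ok and Capability.HOLE_PUNCH in caps:
        return "direct"
    if Capability.RELAY in caps:
        # A relay already sees the data, so caching it is nearly free.
        return "relay+cache" if Capability.CACHE in caps else "relay"
    return "try-other-discovery"
```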

mafintosh commented 4 years ago

I’m +1 on trying an experiment on this. If this ran over http it could be good for corporate networks also, assuming that scales for this use case

creationix commented 4 years ago

I could see value in an HTTP version for use cases where hole punching is just not going to work. It would need to relay the data anyway. The file data could act like cached HTTP data and the peersockets could also be proxied over websockets.

But if we do want to act more like the DHT and only facilitate P2P connections, then we would need something UDP based for the hole punching to work, right?
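
The reason UDP matters here is that the tracker has to see each peer's externally visible UDP address and hand it to the other side, after which both peers send to each other directly. Below is a loopback-only sketch of that introduction step. The message format and helper names are invented for illustration, and on localhost there is no NAT, so it shows only the address exchange, not real hole punching.

```python
import socket


def udp_socket() -> socket.socket:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # any free local port
    sock.settimeout(2.0)         # never hang if a datagram is lost
    return sock


def encode_addr(addr) -> bytes:
    return f"{addr[0]}:{addr[1]}".encode()


def decode_addr(data: bytes):
    host, port = data.decode().rsplit(":", 1)
    return host, int(port)


def introduce_pair(tracker: socket.socket) -> None:
    """Wait for two registrations, then tell each peer the other's address."""
    key_a, addr_a = tracker.recvfrom(64)  # addr_* is what the tracker observed
    key_b, addr_b = tracker.recvfrom(64)
    if key_a != key_b:
        raise ValueError("peers registered for different keys")
    tracker.sendto(encode_addr(addr_b), addr_a)
    tracker.sendto(encode_addr(addr_a), addr_b)


def demo() -> bytes:
    tracker, alice, bob = udp_socket(), udp_socket(), udp_socket()
    try:
        key = bytes(32)  # placeholder public key
        alice.sendto(key, tracker.getsockname())
        bob.sendto(key, tracker.getsockname())
        introduce_pair(tracker)
        bob_addr = decode_addr(alice.recvfrom(64)[0])
        alice_addr = decode_addr(bob.recvfrom(64)[0])
        # Both sides send; behind a real NAT this is what opens the mappings.
        alice.sendto(b"hello from alice", bob_addr)
        bob.sendto(b"hello from bob", alice_addr)
        return bob.recvfrom(64)[0]
    finally:
        for sock in (tracker, alice, bob):
            sock.close()
```

An HTTP tracker cannot do this part, because the address it observes belongs to a TCP connection rather than to the UDP socket the peers would use to reach each other.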

mafintosh commented 4 years ago

That is true. A udp api to a centralised tracker is prob good for corporate still

creationix commented 4 years ago

Is the protocol still μtp based or did it switch to something with higher throughput?

mafintosh commented 4 years ago

utp + tcp. I did some perf fixes to our bindings so the throughput on utp is quite good now.

If you know any good utp replacements I'm down to investigate changing that part also, as long as they are easy to multiplex with the dht traffic (utp allows that which is awesome)

creationix commented 4 years ago

I love μtp, my only concern was the protocol itself is designed to be a background protocol and is very "polite" and backs away if there is any congestion. I've seen cases in some other experiments where it can get very slow on busy networks.

If this hasn't actually been a concern, I'm fine with it.

In my experiments, I was looking at https://en.wikipedia.org/wiki/QUIC as the main alternative, but not sure it's worth the trouble. Did you look into it yet?

mafintosh commented 4 years ago

I did look at quic, but I’m a sucker for the simplicity of utp. I know a friend is working on native bindings for quic right now so we can hopefully test it soon.

Have you ever messed around with the backoff configurations with utp?

mafintosh commented 4 years ago

We did talk about the backoff quite a bit, and it’s def a concern

creationix commented 4 years ago

I've not tried working with the backoff configurations. That may be enough. I'm also a sucker for simplicity.