socketsupply / socket

A cross-platform runtime for Web developers to build desktop & mobile apps for any OS using any frontend library.
https://socketsupply.co/guides

Where is Socket's "Signaling Server"? #476

Closed mehdi-cit closed 1 year ago

mehdi-cit commented 1 year ago

Hello and thank you for this promising tech. My understanding and limited experience of working with P2P indicates there has to be a "Signaling Server" to ensure discovery. So for peer X to be able to connect with peer Y, they need to "meet" at the signaling server, exchange their current network addresses there, and from that point on be able to connect to each other without involving servers! I cannot see any mention of such a server, or how Socket avoids requiring one. Say peer X is in Europe and peer Y is in Vietnam. How are they going to discover each other? Hop "aimlessly" around the world wide internet until they somehow bump into each other? What's the trick here? Also, having a full mesh topology with all peers connecting to all other peers is not scalable. You can only have a maximum number of fully connected peers. How does Socket address that without involving some sort of server? Thank you again and best wishes with your endeavor!

PS: I have not worked with P2P (webrtc) in a long time, so both the vocabulary and semantics written above could be "obsolete", and I would happily be corrected.

RGBboy commented 1 year ago

Others probably have a better understanding but here is mine.

From my understanding (others, please correct me if I'm wrong):

mehdi-cit commented 1 year ago

Thank you @RGBboy, that's my "understanding" from reading the docs too. However, I cannot see this working at scale (take the example of a peer X in Northern Europe trying to connect to a peer Y in Vietnam: how long until they discover each other if they do not get each other's "address" somehow?). I read somewhere in the docs that such protocols/algorithms work surprisingly well. I can see that working if the peers are more or less close by (UDP broadcast), but I do not "feel" this would just work for any scenario. Again, networking/protocols is not my specialty and I could be missing something here!? Wouldn't introducing addressable and permanently online "signaling/relaying servers" add some robustness and "certainty" to the system?

getify commented 1 year ago

The response by @RGBboy is pretty accurate. To clarify a few things...

> they connect to a few peers

Specifically, 3 previously known/communicated-with peers... or fixed "introducers" if it's the first connection and no other peers have been found previously.

> provides the address of a few known servers for peers to connect to. I imagine these become less relevant as the network grows however at this stage they are needed to "seed" the network

Yes, we run a few dozen "Introducers", which are basically very simple servers running our relaying protocol. These IPs/ports are static/fixed, and are currently hard-coded into Socket apps. Those are who Socket apps first start talking to whenever they come online initially, and those introducers then introduce them to many other peers. Pretty soon, an app is connected to peers much "closer" (faster) to it than the introducers, so the app is not really talking to the introducers any more.
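The bootstrap choice described above (3 previously known peers, falling back to fixed introducers on a first run) can be sketched roughly like this. The addresses and names below are placeholders for illustration, not Socket's real introducers or API:

```javascript
// Placeholder introducer addresses -- in a real Socket app these are
// fixed IP/port pairs hard-coded into the runtime.
const INTRODUCERS = [
  { ip: '192.0.2.1', port: 9777 },
  { ip: '192.0.2.2', port: 9777 }
]

// On startup: talk to ~3 previously known/communicated-with peers if any
// exist, otherwise fall back to the fixed introducers to seed discovery.
function firstContacts (knownPeers, count = 3) {
  return knownPeers.length > 0
    ? knownPeers.slice(0, count)
    : INTRODUCERS
}
```

Once those first contacts respond, the app learns about closer/faster peers and the introducers drop out of the picture, as described above.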

NOTE: it's important to note that "connection" doesn't mean persistent connection as in TCP -- this is all UDP and that has no persistent connections, only listening ports. So connection here just means "two peers know about each other's IP/port, for as long as those origins are still able to send/receive packets".
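So a "connection" is really just a cache entry for a peer's address, kept fresh by recent packets. A minimal sketch of that idea (class name, TTL value, and method names are made up for illustration):

```javascript
// A "connection" over UDP: we know a peer's IP/port and have heard from
// it recently enough to believe packets can still get through.
class PeerCache {
  constructor (ttlMs = 30_000) {
    this.ttlMs = ttlMs
    this.peers = new Map() // 'ip:port' -> last-seen timestamp (ms)
  }

  // Record that a packet just arrived from this origin.
  seen (ip, port, now = Date.now()) {
    this.peers.set(`${ip}:${port}`, now)
  }

  // "Connected" just means the entry is still fresh.
  isConnected (ip, port, now = Date.now()) {
    const last = this.peers.get(`${ip}:${port}`)
    return last !== undefined && (now - last) < this.ttlMs
  }
}
```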

Our introducers are intended to be temporary while the P2P network scales up. Eventually, when the network has several tens of thousands of peers for example, it will be behaving as a global, fault-tolerant, always-on network. That is, there will always be enough peers live online and responding/relaying, that any new peer will be able to get back online by simply pinging one of its previous peers (who are still online).

At this point, the network will be resilient enough that we can decommission some or all of those introducers. That's our intention. It remains to be seen/proven how quickly the network will reach this level of stability, but we don't expect it to take too long.

> Each peer may participate in this propagation

All peers act as relays for all other peers. A "peer" is an install of a Socket-powered app on a device, that is currently running and connected to the internet.

All peers relay packets they receive from other peers, regardless of which app (clusterID) the packet is associated with or who the packet is encrypted for (the to public-key field). Once a packet arrives at the peer it's intended for (encrypted for), that peer decrypts it rather than relaying it (although relaying might still occur under some circumstances).
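That relay rule boils down to a single branch on the packet's addressee. A sketch, with field names (`to`, `clusterId`) taken loosely from the description above rather than from Socket's actual packet format:

```javascript
// Decide what to do with an incoming packet: decrypt if it's addressed
// to us, otherwise relay it onward -- regardless of which app (clusterId)
// it belongs to.
function handlePacket (packet, myPublicKey) {
  if (packet.to === myPublicKey) {
    return { action: 'decrypt' }
  }
  // Relayed even when packet.clusterId belongs to a different app.
  return { action: 'relay' }
}
```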


Moreover, once a peer receives a packet intended for it, it is also likely to have been "introduced" to the peer who originated it, or to be at most one hop away (in the case where the two peers cannot connect directly because of NAT type incompatibility).

That means that pretty quickly, two peers are either directly connected, or connected at most 1 indirect hop away.

If a connection between two peers is "strong" (communication goes quickly, etc.), then generally the selection of who to relay packets to will likely include that peer, meaning that a UDP packet from A -> B will likely go directly from A to B (as well as to a few other peers, for robustness' sake, in case B isn't actually online anymore).

That will mean that communication between A and B will, in general, be very fast/efficient.
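One way to picture that fan-out selection: send directly to the destination when it's a known peer, plus a couple of extra peers for redundancy. The scoring below (latency as the "strength" metric) is my own illustrative assumption; the thread doesn't specify Socket's actual metric:

```javascript
// Pick relay targets for a packet destined for `dest`: the destination
// itself if we know it, plus `extra` of our lowest-latency peers as
// redundant carriers in case `dest` has gone offline.
function pickRelayTargets (dest, peers, extra = 2) {
  const targets = []
  if (peers.some(p => p.id === dest)) targets.push(dest)
  const others = peers
    .filter(p => p.id !== dest)
    .sort((a, b) => a.latencyMs - b.latencyMs) // "strong" = low latency here
    .slice(0, extra)
    .map(p => p.id)
  return targets.concat(others)
}
```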


And to address some of @mehdi-cit's specific comments/questions:

> broadcast... Hop "aimlessly" in the world wide internet until they somehow bump on each other?

There are no true broadcasts happening here. We "multi-cast" UDP packets, meaning we send identical packets to multiple destinations.
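Concretely, "multi-cast" here is just N unicast sends of the same payload. A minimal sketch, where `socket` is anything with a dgram-style `send(payload, port, ip)` method (in a real app it would be a bound UDP socket):

```javascript
// Send one identical payload to several peers via plain unicast UDP.
// No broadcast address is involved -- just repeated sends.
function multicast (socket, payload, peers) {
  for (const { ip, port } of peers) {
    socket.send(payload, port, ip) // same packet, each destination
  }
  return peers.length // number of sends issued
}
```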

> there has to be a "Signaling Server" to ensure discovery

The beauty and power of Socket's P2P protocols is that no such servers are required. Our NAT reflection process uses two peers to ping-pong reflect back a peer's IP/port, and by comparing the responses, we determine NAT type almost immediately. From there, the app re-connects to introducers and/or previously-known peers, and starts filling up its relay cache and sending out its own packets.
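The reflection comparison can be sketched as follows. The classification labels and the port-comparison rule here are my own illustrative simplification of "comparing the responses", not Socket's internal logic:

```javascript
// Two peers each report the external IP/port they observed for us.
// If both saw the same external port, the NAT maps our socket to one
// stable external endpoint (easy to traverse); if the ports differ,
// the NAT allocates a new mapping per destination (harder, symmetric-like).
function classifyNat (reflectionA, reflectionB) {
  return reflectionA.port === reflectionB.port ? 'easy' : 'hard'
}
```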

This is all "zero server" by design. Even if, temporarily, some of those peers are actually servers in a cloud somewhere, conceptually these aren't privileged or special servers in any way; they're just equal peers running the protocol that happen to be on fixed IP/port assignments, which makes them very reliable/predictable. They're only intended to be used while we bootstrap the P2P network into stable existence, then they'll probably go away.

Of course, some apps may choose to run their own "introducer"-like relay peers at fixed cloud-server locations, either temporarily or permanently. But again, in spirit, these will look like just "always-on, reliable peers", not special servers with elevated capabilities.

> having a full mesh topology with all peers connecting to all other peers is not scalable

We appreciate this perspective and assumption. It's probably partly true and partly not true. But regardless, a full mesh is not really what we're building, so this concern doesn't apply to our protocol/design.

> I cannot see this working at scale (example of a peer X in Northern Europe trying to connect to a peer Y in Vietnam. How long till they discover each other if they do not get each other's "address" somehow ?).

Your skepticism is understandable, but all we can say is, we've done an extensive amount of modeling (simulation) and real-world testing, and I think your concerns here are not going to be relevant as the network grows.

One of the best parts of building with a P2P architecture is that it "scales" inversely compared to a cloud-centric architecture. What this means is: when you centralize on the cloud, the more users you get, the more you have to pay the cloud to scale up and handle all those connections. But with P2P, the more peers that come online, the more robust and scalable the network becomes, by default and by design, and the "costs" go down (to zero).

It's true that right now we're in the very early days of the P2P network, so maybe there are only dozens or hundreds of peers in circulation.

But because our peer-relay works in an app-agnostic way, the bigger the network gets, the more a brand new app immediately achieves/benefits from that same scale, because all peers (regardless of app type) will be relaying its packets, even if there's only a few users of that specific new app.