zeromq / zyre

Zyre - an open-source framework for proximity-based peer-to-peer applications
Mozilla Public License 2.0

Network discovery #701

Open ghenry opened 2 years ago

ghenry commented 2 years ago

Hi all,

I was just reading https://zguide.zeromq.org/docs/chapter8/ and saw the Harmony pattern, then got on to http://randalh.blogspot.com/2012/12/zeromq-true-peer-connectivity-harmony.html?m=1 which mentions Zyre. Hence ending up here :-)

I'm currently exploring the sharing of data p2p for https://github.com/ghenry/SentryPeer and wanted to use 0mq for handling this at the application layer, not via replicated storage like rqlite (https://github.com/rqlite/rqlite). Are there any example use cases for Zyre, but with some kind of tracker? I'm really, really trying to not have any centralised systems.

Thanks, Gavin.

brettviren commented 2 years ago

SentryPeer looks cool. I guess its p2p discovery mechanism primarily must work over WAN while Zyre emphasizes LAN (via UDP broadcast) and it works well there. Zyre can instead connect via TCP to some "well known" zgossip servers. This works on the WAN if the zgossip servers are accessible by the peer. Of course, it also adds some amount of unwanted centralization.

I've often tried to think of a solution to WAN discovery and have yet to come up with something truly decentralized. One can push the problem "up" but I always come back to needing some "well known" entity "out there". The best compromise I have thought of is to use a variety of "well known" entities speaking a variety of protocols and with a variety of ways to find them. E.g., DNS + IPFS + BitTorrent's DHT + hard-coded IPs + ....
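To make the "variety of well known entities" idea concrete, here is a minimal Python sketch of a bootstrap routine that tries several independent sources and merges whatever they return. Everything in it is an assumption for illustration: the function names, the stand-in sources, and the addresses (taken from the documentation-only IP ranges) are hypothetical, and real sources would wrap a DNS lookup, a DHT query, and so on.

```python
from typing import Callable, Iterable, List, Set, Tuple

Endpoint = Tuple[str, int]                 # (host, port)
Source = Callable[[], Iterable[Endpoint]]  # one discovery mechanism


def discover_bootstrap_peers(sources: List[Source], wanted: int = 3) -> List[Endpoint]:
    """Query each source in turn, tolerate failures, and merge the results."""
    seen: Set[Endpoint] = set()
    found: List[Endpoint] = []
    for source in sources:
        try:
            candidates = list(source())
        except Exception:
            # One unreachable mechanism must not stop us trying the others.
            continue
        for endpoint in candidates:
            if endpoint not in seen:
                seen.add(endpoint)
                found.append(endpoint)
        if len(found) >= wanted:
            break
    return found


# Hypothetical stand-ins for real mechanisms (DNS, DHT, hard-coded list).
def broken_dns() -> List[Endpoint]:
    raise OSError("resolver unreachable")


def hard_coded() -> List[Endpoint]:
    return [("192.0.2.10", 5670)]  # placeholder address and port


def fake_dht() -> List[Endpoint]:
    return [("192.0.2.10", 5670), ("198.51.100.7", 5670)]


peers = discover_bootstrap_peers([broken_dns, hard_coded, fake_dht])
print(peers)
```

The point of the design is only that no single mechanism is mandatory: a peer still bootstraps as long as any one of the sources answers.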

Even after discovery there is still the problem of how one peer may connect to another. With Zyre (or ZeroMQ in general) that means having a TCP bind() address visible to the peer. Zyre's chat-based protocol works fine on the LAN, but I think that even when using zgossip for discovery the chat socket must be directly peer-accessible. ZeroMQ's RADIO/DISH draft sockets using UDP may be something to look at if they can do UDP hole punching. I've not tried this yet but I guess in principle if both ends can know each other's UDP/IP address they can hole punch.
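For what it's worth, the "both ends know each other's UDP address" step can be sketched with plain Python sockets. This is only an illustration on the loopback interface, where there is no NAT at all, so it shows the send-first-then-listen pattern rather than proving that hole punching works through any particular router. It uses the standard `socket` module and not RADIO/DISH; the helper names are made up for the example.

```python
import socket


def open_udp() -> socket.socket:
    """Create a UDP socket on loopback with an OS-assigned port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    sock.settimeout(2.0)          # never block forever if nothing arrives
    return sock


def punch(sock: socket.socket, remote: tuple, attempts: int = 3) -> None:
    """Send a few datagrams towards the remote address.

    Behind a NAT, these outbound packets are what would create the mapping
    that lets the other side's packets back in.
    """
    for _ in range(attempts):
        sock.sendto(b"punch", remote)


a, b = open_udp(), open_udp()
addr_a, addr_b = a.getsockname(), b.getsockname()

# In a real deployment the two addresses would be learned out of band,
# for example via a discovery service. Both sides send first, then listen.
punch(a, addr_b)
punch(b, addr_a)

data_at_a, from_b = a.recvfrom(1024)
data_at_b, from_a = b.recvfrom(1024)
a.close()
b.close()
```

Whether this survives a real NAT depends on the NAT type on each side, which is exactly the part that still needs testing.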

I think solving this second, decentralized connection problem is akin to the decentralized discovery mechanism in that supporting multiple protocols is needed. E.g., simple Zyre "chat" on LAN + some TCP-based relay mechanism, maybe through the same servers providing zgossip + WAN UDP if hole punching works + ....

ghenry commented 2 years ago

Hi @brettviren

Yep, you hit the nail on the head. Over WAN, but also it could be configured for the LAN if a user runs a few nodes themselves. The more I think about the current internet/WAN, I think the same model needs to be used. For example, like the Root servers in the DNS world. "well known" IPs.

Yeah, something needs to be there to bootstrap.

Regarding connecting one peer to another, I think we can cover that in the INSTALL/ADMIN guide as these nodes will live on public IP addresses, so we can list known ports.

I think I just need to make the decision about hosting something "well known" and paying for that for this to work, at least for the bootstrap part. We could do something to "Register here to run a node" then that's somewhat trusted, but then we're moving to centralised. It is easier, but not my goal :-)

We can still do the p2p sharing of data and "I'll only give you my data if you give me some I have or haven't seen before". That's my main goal really, crowd sourcing of data, data ownership and sharing.
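As a rough illustration of that exchange rule, here is a small Python sketch. It assumes the "haven't seen before" reading: a trade only happens when each side can offer at least one record the other lacks. The `Peer` class, the `exchange` function, and the sample addresses (documentation-only ranges) are all hypothetical, and a real protocol would compare digests rather than expose each peer's full record set.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Peer:
    name: str
    records: Set[str] = field(default_factory=set)

    def offer(self, other_known: Set[str]) -> Set[str]:
        """Records this peer holds that the other side has not seen."""
        return self.records - other_known


def exchange(a: Peer, b: Peer) -> bool:
    """Swap novel records, but only if both sides bring something new."""
    a_gives = a.offer(b.records)
    b_gives = b.offer(a.records)
    if not a_gives or not b_gives:
        return False  # one side has nothing new to give, so no trade
    a.records |= b_gives
    b.records |= a_gives
    return True


alice = Peer("alice", {"203.0.113.5", "203.0.113.9"})
bob = Peer("bob", {"203.0.113.9", "198.51.100.23"})
freeloader = Peer("freeloader")

traded = exchange(alice, bob)          # both have something new
refused = exchange(alice, freeloader)  # freeloader offers nothing
```

After the first call both alice and bob hold all three records; the second call leaves the freeloader empty-handed.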

Thanks for taking the time to reply and spend some brain power on my questions.

Gavin.

brettviren commented 2 years ago

There are lots of very interesting problems in the system you describe.

When you talk about sharing the actual data, the "tit for tat" exchange you describe is very reminiscent of BitTorrent. Maybe directly using the BitTorrent protocol for data transfer is something to think about. (Independent of considering its DHT as part of discovery).

There's also an element of achieving agreement and honesty. I say IP a.b.c.d is a spammer but you say it is good, how do we reconcile? Maybe a third peer breaks the tie? Perhaps etcd or other Raft implementations are worth investigating.

But then, how to protect the data against malicious, perhaps coordinated, attempts to poison it? Simple majority rules can be defeated. This leads to an interesting problem of dealing with "meta spammers". Is a MetaSentryPeer required? And a Meta....MetaSentryPeer?!

When I think about these kinds of problems, I often fall into thoughts that resemble "web of trust" using public key crypto. If I judge a peer M to be malicious and peer T to be trusted, I can ignore any data M endorses unless T also endorses.
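The filtering rule in that last sentence can be sketched in a few lines of Python. This is only the set logic: it assumes endorsements have already been verified (the public key signature checking is left out entirely), and it assumes that claims endorsed only by peers I have no opinion about are kept, which is a policy choice the rule itself does not settle. The function and variable names are invented for the example.

```python
from typing import Dict, Set


def accepted_claims(
    endorsements: Dict[str, Set[str]],
    trusted: Set[str],
    malicious: Set[str],
) -> Set[str]:
    """Return the claims to keep, given which peers endorse each claim.

    A claim endorsed by a peer judged malicious is dropped unless a
    trusted peer endorses it as well.
    """
    kept = set()
    for claim, endorsers in endorsements.items():
        tainted = bool(endorsers & malicious)
        vouched = bool(endorsers & trusted)
        if vouched or not tainted:
            kept.add(claim)
    return kept


endorsements = {
    "a.b.c.d is a spammer": {"M"},       # only the malicious peer says so
    "e.f.g.h is a spammer": {"M", "T"},  # trusted peer agrees
    "i.j.k.l is a spammer": {"X"},       # unknown peer, kept by assumption
}
result = accepted_claims(endorsements, trusted={"T"}, malicious={"M"})
```

Here the first claim is ignored and the other two are kept.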

I think design decisions addressing these points are rather intermixed. Interesting problems, indeed!

ghenry commented 2 years ago

Made quite a bit of progress now, so will be moving on to the above problems in the next few weeks. I've opted to have a bootstrap.sentrypeer.org for the nodes with a DHT (OpenDHT maybe) to get started. I'll be back to bug the project soon I'm sure, but I'll contribute any docs that I come across etc.

Thanks.

ghenry commented 2 years ago

Or rather

"well known" zgossip servers

ghenry commented 2 years ago

And also whether the user operating requirements are to run in private address space or public addressing. Probably the latter. And also whether the titfortat/data replication part is in the same binary or not. This bit could run in private nodes, but makes it all harder.

sphaero commented 2 years ago

I'm very curious about any progress on this! This is one of the holy grails on low level internet. How are you coming along?

ghenry commented 2 years ago

I'm still at the feature stage, mainly the Web UI to present gathered data and the Web API. Then I'm moving on to the p2p part.

I'll post any findings here though on how/what I chose if that's OK?

ghenry commented 2 years ago

Oooo, awesome. Going to have a read of this https://ieeexplore.ieee.org/document/4604572

ghenry commented 2 years ago

Great discussion of the issue I'm trying to solve by an author of the paper above - https://grothoff.org/christian/dasp2p.pdf

ghenry commented 2 years ago

Have got my project an IPv6 multicast address as per recommendation of Christian (link above) who is now part of the GNUnet project!!! How weird is that!

https://www.iana.org/assignments/ipv6-multicast-addresses/ipv6-multicast-addresses.xhtml

I'll be using this on the Zyre LAN side, starting in the next few days. Still reading papers about WAN bootstrapping and will update here.

Feel free to tell me to go away now :-)

sphaero commented 2 years ago

No, please don't go away! We could really use WAN discovery besides broadcast/multicast and gossip discovery. Especially if it's a more widely adopted standard.

ghenry commented 2 years ago

Will keep you posted every now and then :-)

sphaero commented 2 years ago

interesting talk: https://fosdem.org/2022/schedule/event/peer_to_peer_hole_punching_without_centralized_infrastructure/

ghenry commented 2 years ago

Cool!

Will watch. I was just thinking about NAT detection too!

sphaero commented 2 years ago

I think he will continue his talk in a few minutes here: https://fosdem.org/2022/schedule/event/libp2p/

ghenry commented 2 years ago

Watched and read, but it still uses a centralised system, albeit one made up of different relay nodes that peers first have to find via the DHT (https://github.com/libp2p/specs/blob/master/relay/circuit-v2.md), so no bootstrapping magic yet. I think the title is a bit misleading :-)

ghenry commented 2 years ago

I've started this. Hopefully I'll come up with something as I'm reading every paper, RFC, book, blog post and project I can find.

https://github.com/SentryPeer/draft-henry-p2p-network-discovery-internet-XX

Interesting things in here https://www.rfc-editor.org/rfc/rfc8155.html

sphaero commented 2 years ago

I don't think there exists a WAN discovery method without some fixed points, e.g. DNS, IP addresses, etc. I would really like to know if there is one. NAT traversal is the biggest hurdle. I think if NAT is passed, gossip discovery would be sufficient for Zyre to run on the WAN.