eclipse-zenoh / zenoh

zenoh unifies data in motion, data in-use, data at rest and computations. It carefully blends traditional pub/sub with geo-distributed storages, queries and computations, while retaining a level of time and space efficiency that is well beyond any of the mainstream stacks.
https://zenoh.io

Question: is Zenoh intended to work over rapidly churning networks? #113

Open ckaran opened 3 years ago

ckaran commented 3 years ago

I ran across Zenoh recently, and think it may be useful for a project I'm working on. My issue is that data needs to be replicated across a network that is constantly experiencing severe churn (all nodes are highly mobile and only intermittently in contact with one another). Network partitions are the norm, and can last from seconds to hours, with little or no warning. Moreover, routes are changing moment by moment; the only thing that I think might work would be multicast UDP with a custom gossip protocol, but I'm not sure yet about that part. The real question is, can Zenoh as a whole handle this kind of constant churn? Or is it really intended for more stable networks?

OlivierHecart commented 3 years ago

Zenoh clearly targets dynamic networks where nodes are constantly joining, leaving, and/or moving, and where the network topology changes frequently.

In our experience, UDP multicast is often poorly supported, especially by radio-based networks (even WiFi) and by mobile devices. It is not suitable for reliably transporting data payloads. That’s why zenoh uses UDP multicast for dynamic discovery only, in combination with gossip discovery. Data goes through unicast communications (TCP, UDP unicast, etc.).

zenoh supports a variety of communication models, including peer-to-peer as well as routed communications. A LinkState protocol is used to discover and monitor the network topology and to compute data routes. zenoh constantly monitors the network topology and adapts to changes (joining/leaving nodes, joining/leaving sub-systems, broken/new links, etc.).

We ran some experiments with a 100-node network (low latency). It takes approximately 1 second for all nodes in the network to discover a topology change and recompute data routes. During this second, zenoh continues to operate, but some subscribers in the system may miss some data samples. End-to-end reliability can be set up to overcome this problem.
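
As a rough illustration of what "recomputing data routes" after a topology change involves, here is a minimal sketch of generic link-state routing: each node holds the whole graph and runs a shortest-path computation over it. This is not zenoh's actual implementation; the four-node graph, the link costs and the `next_hops` helper are all hypothetical.

```python
import heapq

def next_hops(graph, source):
    """Dijkstra over a link-state view: returns {destination: first hop from source}."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        if hop is not None:
            first_hop[node] = hop
        for peer, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(peer, float("inf")):
                dist[peer] = nd
                # Leaving the source, the first hop is the peer itself;
                # further out, the first hop is inherited.
                heapq.heappush(heap, (nd, peer, peer if hop is None else hop))
    return first_hop

# Hypothetical topology: node -> {neighbour: link cost}
before = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"C": 1},
}
# The B<->C link breaks: drop it in both directions and recompute.
after = {
    n: {p: c for p, c in peers.items() if {n, p} != {"B", "C"}}
    for n, peers in before.items()
}
routes_before = next_hops(before, "A")
routes_after = next_hops(after, "A")
```

In this toy graph, A reaches C and D through B until the B-C link breaks, after which it falls back to its direct, more expensive link to C. Any samples sent toward B in between are the kind that subscribers may miss.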

ckaran commented 3 years ago

> UDP multicast is often poorly supported, especially by radio-based networks (even WiFi) and by mobile devices.

My apologies, I meant to say UDP broadcast. That doesn't change your later comments, just fixing my mistake...

> In our experience, UDP multicast is often poorly supported, especially by radio-based networks (even WiFi) and by mobile devices. It is not suitable for reliably transporting data payloads. That’s why zenoh uses UDP multicast for dynamic discovery only, in combination with gossip discovery. Data goes through unicast communications (TCP, UDP unicast, etc.).

There may be a better way than this. I used raptorq in my dissertation work to packetize large messages. This increased the number of bits broadcast over the air, but also significantly improved reliability as the loss of a few packets didn't mean I lost the entire message (all my work was done over unreliable, UDP broadcast-like simulated networks). What I didn't have a chance to do was to combine this with Triangular Network Coding (TNC). In theory, if you use TNC to combine several messages together, and then encode & packetize them with raptorq, both your latency and throughput will improve. Note that I have not yet had a chance to test this out, so test, test, test before committing to it!
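
The point that losing a packet need not mean losing the whole message can be shown with a far simpler erasure code than RaptorQ. The sketch below is neither RaptorQ nor TNC: it appends a single XOR parity packet, so it can repair at most one lost packet per message. The function names and the 8-byte packet size are made up for illustration.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(message: bytes, size: int) -> list:
    """Split into equal-size packets (zero-padded) and append one XOR parity packet."""
    padded = message + b"\x00" * (-len(message) % size)
    packets = [padded[i:i + size] for i in range(0, len(padded), size)]
    packets = packets or [b"\x00" * size]  # an empty message still gets one packet
    return packets + [reduce(xor_bytes, packets)]

def decode(received: list, length: int) -> bytes:
    """`received` holds the packets in order, with None in place of a lost one."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity packet cannot repair more than one loss")
    packets = list(received)
    if missing:
        # XOR of all surviving packets (data + parity) equals the lost packet.
        survivors = [p for p in packets if p is not None]
        packets[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(packets[:-1])[:length]  # drop parity, strip padding

message = b"zenoh over lossy radio links"
sent = encode(message, 8)
received = list(sent)
received[2] = None  # simulate one packet lost over the air
recovered = decode(received, len(message))
```

The cost is the one described above: more bits go over the air (one extra packet here) in exchange for surviving a loss. Fountain codes such as RaptorQ generalize this to many losses.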

> Zenoh clearly targets dynamic networks where nodes are constantly joining, leaving, and/or moving, and where the network topology changes frequently.
> ... We ran some experiments with a 100-node network (low latency). It takes approximately 1 second for all nodes in the network to discover a topology change and recompute data routes. During this second, zenoh continues to operate, but some subscribers in the system may miss some data samples. End-to-end reliability can be set up to overcome this problem.

Please emphasize this more on your website! I didn't see any of this there. Would you be willing to post numbers, plots, etc.? This is the kind of information that engineers need to see when choosing between technologies.

OlivierHecart commented 3 years ago

The zenoh.io website clearly needs to be improved in a lot of respects, and dynamic network support is obviously one of them. Note that we tend to publish figures and plots in the blog section of the site. You can already find some figures on discovery traffic there. We are about to publish a post with plots and figures on zenoh throughput and latency. And we plan to publish something soon about zenoh routing, which should include figures on zenoh topology alignment.

In the meantime, here are some figures from last April for the following scenario:

  1. Start up zenoh routers with a given network topology.
  2. Measure the time until all nodes agree on the topology and have computed their routing tables.
  3. Kill one of the routers.
  4. Measure the time until all nodes agree on the new topology and have recomputed their routing tables.
  5. Restart the killed router.
  6. Measure the time until all nodes agree on the new topology and have recomputed their routing tables.

This was run on a single local host, so with very low latency. On a network with more realistic latencies, the times will probably increase a bit. Again, note that during the alignment time zenoh continues to operate and route data, but since the routes are not up to date, some subscribers may miss some samples.

[Figure: zenoh routing convergence figures, 2021-04-07]
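
For intuition on why alignment time grows with link latency, here is a toy model of the "kill one router" step. It is not the harness used to produce the figures above and not how zenoh propagates link-state updates; it simply assumes that news of a change floods one hop per round, starting from the nodes that notice it first. The ring topology, router names and `rounds_to_inform_all` helper are hypothetical.

```python
from collections import deque

def rounds_to_inform_all(links, detectors):
    """Hop-by-hop flooding: rounds until every node has heard of a change
    that `detectors` noticed at round 0."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    heard_at = {node: 0 for node in detectors}
    queue = deque(detectors)
    while queue:  # multi-source breadth-first search
        node = queue.popleft()
        for peer in neighbours.get(node, ()):
            if peer not in heard_at:
                heard_at[peer] = heard_at[node] + 1
                queue.append(peer)
    if set(neighbours) - set(heard_at):
        raise ValueError("network is partitioned: some nodes never hear the change")
    return max(heard_at.values())

# Hypothetical ring of six routers r0..r5.
ring = [(f"r{i}", f"r{(i + 1) % 6}") for i in range(6)]
# Kill r3: its links vanish and its two neighbours detect the loss.
alive = [(a, b) for a, b in ring if "r3" not in (a, b)]
rounds = rounds_to_inform_all(alive, ["r2", "r4"])
```

Under this model the alignment time is roughly `rounds` times the per-hop latency plus processing time, which is consistent with the expectation that realistic network latencies increase the measured times.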

ckaran commented 3 years ago

@OlivierHecart Are you able to selectively break and bring up links between nodes in your simulations? While nodes sometimes come and go, it's much more common for a node to be operating as normal while, due to environmental constraints (multipath fading, etc.), some, but not all, of its radio links fade out. Being able to model the communications graph as a fully connected digraph, where the probability of packet delivery across each link can be controlled over time, will give you a better model of actual communications.
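
A minimal sketch of such a model, assuming each directed link gets its own function from time to delivery probability. The `LossyDigraph` class, its method names and the fade window are hypothetical and not tied to any existing simulator.

```python
import random

class LossyDigraph:
    """Fully connected directed graph where every ordered pair of nodes
    has its own time-varying packet delivery probability."""

    def __init__(self, nodes, default=lambda t: 1.0, seed=None):
        self.nodes = list(nodes)
        self.default = default      # used for links with no explicit model
        self.prob = {}              # (src, dst) -> function: time -> probability
        self.rng = random.Random(seed)

    def set_link(self, src, dst, prob_fn):
        self.prob[(src, dst)] = prob_fn

    def delivery_probability(self, src, dst, t):
        p = self.prob.get((src, dst), self.default)(t)
        return min(1.0, max(0.0, p))  # clamp to a valid probability

    def send(self, src, dst, t):
        """Return True if a packet sent at time t is delivered."""
        return self.rng.random() < self.delivery_probability(src, dst, t)

net = LossyDigraph(["A", "B", "C"], seed=1)
# A -> B fades out completely between t=10 and t=20; B -> A is unaffected,
# so the link is asymmetric during the fade.
net.set_link("A", "B", lambda t: 0.0 if 10 <= t < 20 else 1.0)
during_fade = (net.send("A", "B", 15), net.send("B", "A", 15))
after_fade = net.send("A", "B", 25)
```

Because links are directed and keyed per pair, a fade can hit one direction of one link while every node keeps running, which is the case that killing whole routers does not cover.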

Also, I just finally read this blog post talking about ROS2 integration, which is the precise use-case that I was talking about earlier (I would like to have Zenoh be the communications backbone across a wireless network, assuming that would work).

OlivierHecart commented 3 years ago

@ckaran No, we can't do that in our simplistic simulation environment right now. But this is something we'll have to test and characterize as well.

ckaran commented 3 years ago

OK, I'm going to throw some unsolicited opinions at you :wink:

I did my Ph.D. work using my own custom-written network simulator after beating my head against the wall trying to get ns-3 to do what I wanted it to do (this isn't a criticism of ns-3, I just never was able to figure out how to get it to do what I needed for my work, and eventually gave up using it). Writing my own turned out to be a nigh-endless time suck; it took me about 8 years, 5 tries, and 4 different languages before I finally had something that worked just barely well enough (speed and accuracy) for me to graduate... and even then it still isn't good enough for something like Zenoh. The one thing that I really learned from the experience is that network simulation is hard to get right. If I had to do it all over again, I'd go for a hardware-in-the-loop option, using a bunch of cheap microcomputers with built-in wireless and a custom app on them, and a bunch of people (in my case grad students) wandering around pretending to be robots. The downside to this approach is that it does require a group of people, but for the kinds of simulations I was after, it would have been the best way of doing things (multi-channel, multi-modal communications over rapidly churning networks).

Anyways, that's my unsolicited opinion. If you want more of it, I can do a complete brain dump on you, but it will be fairly long.