eclipse-zenoh / zenoh

zenoh unifies data in motion, data in-use, data at rest and computations. It carefully blends traditional pub/sub with geo-distributed storages, queries and computations, while retaining a level of time and space efficiency that is well beyond any of the mainstream stacks.
https://zenoh.io

Question: Implementation of sdn zenoh #122

Closed AlexandreVASantos closed 3 years ago

AlexandreVASantos commented 3 years ago

I am finishing my master's degree, and my dissertation consists of using SDN tools to replicate at least some of the zenoh routers' behaviour, i.e. having a network of programmable switches that behave like zenoh routers. The goal is to offload some of the features to the hardware. Since no zenoh router is actually used, at least for now, I have to handcraft all the packets myself, and probably something is wrong. Right now I am able to establish sessions, and that part works fine. I also have the hardware rewriting the packets that arrive from the publishers to match the ones that subscribers expect. However, using the Python zenoh clients available on GitHub, nothing is printed at the subscriber level and I cannot understand why. I asked for help in the Gitter channel and was told to share my Wireshark captures as a GitHub issue. Can someone help me understand why my subscribers are not accepting data? Looking at the serial numbers, everything looks fine, I guess.

zenoh_subscriber_capture.zip

kydos commented 3 years ago

@Mallets can you please take a look at this?

Mallets commented 3 years ago

@AlexandreVASantos can you please tell us which version/git revision of zenoh and zenoh-python you are using?

AlexandreVASantos commented 3 years ago

If I am not mistaken, I followed 0.5.0-beta.8 for zenoh and installed zenoh-python with: pip3.9 install --no-cache-dir eclipse-zenoh==0.5.0-b8

Mallets commented 3 years ago

Just to confirm, the message format you implemented is the one in the branch 0.5.0-beta.8 on git? The message definition is here.

We made a few updates to the protocol on master since 0.5.0-beta.8, and I want to be sure that the same message format is used on both sides. Who generates the initial zenoh messages (before they are modified by the router)? Is it the zenoh-python publisher?

AlexandreVASantos commented 3 years ago

Yes, that is right.

Both publishers and subscribers generate the initial messages to establish the sessions. Then there is the sequence of initAck, open, openAck and declare messages. After the declare message arrives at the publisher, it starts sending frame messages with data messages inside.

In fact, as of now, the data messages are not even touched; only the serial number of the frame message and the Ethernet, IP and UDP fields are modified. I could have missed some message after the subscriber declares which topic it is interested in.

Mallets commented 3 years ago

So, I had a look at the capture and it doesn't seem to have anything weird in the messages. Everything seems OK (but I might be proven wrong). Would you be able to post the logs of the zn_pub and zn_sub? You should execute the python programs with the following environment variable configured: RUST_LOG=debug.

AlexandreVASantos commented 3 years ago

Is there any way to send those logs to files instead of stdout?

Mallets commented 3 years ago

If you are on a Unix-like OS, you can do that by redirecting the output, as explained here.

Mallets commented 3 years ago

An additional request, please provide the logs for both RUST_LOG=debug and RUST_LOG=trace just in case.
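For anyone who would rather collect these logs from a script than with shell redirection, here is a minimal sketch. The helper name run_with_rust_log and the log file names are made up for illustration, and the script path in the usage comment is an assumption about where the example lives on your machine. It relies only on the Python standard library: the child process gets RUST_LOG in its environment, and both stdout and stderr end up in one file.

```python
import os
import subprocess
import sys


def run_with_rust_log(command, level, log_path):
    """Run `command` with RUST_LOG=<level>, writing stdout and stderr to `log_path`.

    Hypothetical helper for illustration; it is not part of zenoh or zenoh-python.
    """
    # Copy the current environment and add or override RUST_LOG for the child only.
    env = dict(os.environ, RUST_LOG=level)
    with open(log_path, "w") as log_file:
        result = subprocess.run(
            command,
            env=env,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # send both streams to the same file
        )
    return result.returncode


# Example usage (assumed script path, adjust to your checkout):
# for level in ("debug", "trace"):
#     run_with_rust_log([sys.executable, "zn_sub.py"], level, f"zn_sub_{level}.log")
```

The call blocks until the child exits, so a long-running subscriber has to be stopped the usual way before the next run starts.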

AlexandreVASantos commented 3 years ago

files.zip

Here I present the captures and logs from the subscriber and publisher. Thank you for your help.

Mallets commented 3 years ago

I found the issue. When you modify the zenoh message, you don't recompute the UDP checksum. Therefore, when your modified message arrives at the subscriber, the kernel drops it because of the invalid checksum. You can verify that this is the case by enabling UDP checksum validation in Wireshark.

So, when modifying zenoh messages make sure to also recompute the checksum(s).
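As an illustration of what recomputing involves, here is a minimal Python sketch of the standard UDP checksum over IPv4 (RFC 768): a 16-bit one's-complement sum over a pseudo-header (source address, destination address, protocol, UDP length) followed by the UDP header and payload. The function names are made up for this example, and the sketch assumes IPv4 with the complete UDP segment available as bytes. A programmable switch would do the equivalent in its own pipeline language.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of `data`, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum_ipv4(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Checksum for a UDP segment (8-byte header + payload) carried over IPv4.

    Whatever is currently in the checksum field (bytes 6-7) is ignored.
    """
    # The checksum field must be zero while the checksum is being computed.
    segment = udp_segment[:6] + b"\x00\x00" + udp_segment[8:]
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_UDP, len(segment))
    )
    checksum = ~ones_complement_sum(pseudo_header + segment) & 0xFFFF
    # RFC 768: a computed checksum of zero is transmitted as all ones.
    return checksum or 0xFFFF
```

Because the pseudo-header includes the IP addresses, rewriting only the IP fields also invalidates the UDP checksum. The IPv4 header carries its own separate checksum, which must be updated as well when any IP header field changes.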

AlexandreVASantos commented 3 years ago

AHHHH! I was so fixated on getting zenoh working properly that I totally forgot about that! Well, my fault for not having Wireshark validate checksums. Sorry about that, and thank you so much for your help!

Mallets commented 3 years ago

No worries! And keep up the good work :)