espressif / esp-now

A connectionless Wi-Fi communication protocol
Apache License 2.0

Question about improving many-to-many robots communication (AEGHB-388) #92

Closed kingsimba closed 5 months ago

kingsimba commented 9 months ago

I am using ESP-NOW to share information between robots, including planned routes, robot footprint, current intention, etc. I am targeting 30 or more robots spread across a large factory. However, for a single robot, there are no more than 3-5 robots nearby.

Currently, I am exclusively using "broadcast packets" and experiencing some packet loss. I have come up with some solutions that may reduce packet loss. Can you give me some suggestions?

Plan A: Use the broadcast channel to discover nearby robots only and then switch to another channel for communication.

Plan B: Develop a packet assembler.

My question is, will both plans reduce packet loss? Are there any technical errors in my plans?

nielsnl68 commented 9 months ago

IMHO, I would not use the broadcast option too much. I have thought about this issue as well for my NowTalk solution. What I finally settled on is the following solution.

I hope this gives you another idea.

lhespress commented 9 months ago

@kingsimba I agree with @nielsnl68. You can also refer to this file, which stores a list of bound devices: https://github.com/espressif/esp-now/blob/master/src/control/src/espnow_ctrl.c

kingsimba commented 9 months ago

Thanks @lhespress @nielsnl68. Yes, I will work on Plan A first. I can use the broadcast packets (which contain coordinates) to sort robots by distance. Only the nearest 20 robots should be added to the list because of the peer list size limit.
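
A minimal sketch of that selection step, assuming each broadcast carries the sender's MAC address and planar coordinates. The `RobotInfo` struct and the `nearest_robots` helper are hypothetical names for illustration and are not part of ESP-NOW; the default of 20 mirrors the peer limit mentioned above.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record built from a received broadcast packet.
struct RobotInfo {
    std::array<uint8_t, 6> mac;  // sender MAC address
    float x;                     // reported position (assumed planar, metres)
    float y;
};

// Returns at most max_peers robots, ordered nearest-first relative to
// the position (self_x, self_y).
std::vector<RobotInfo> nearest_robots(std::vector<RobotInfo> seen,
                                      float self_x, float self_y,
                                      std::size_t max_peers = 20) {
    auto dist2 = [&](const RobotInfo& r) {
        float dx = r.x - self_x;
        float dy = r.y - self_y;
        return dx * dx + dy * dy;  // squared distance is enough for ordering
    };
    std::size_t keep = std::min(max_peers, seen.size());
    // Only the first `keep` entries need to be fully sorted.
    std::partial_sort(seen.begin(), seen.begin() + keep, seen.end(),
                      [&](const RobotInfo& a, const RobotInfo& b) {
                          return dist2(a) < dist2(b);
                      });
    seen.resize(keep);
    return seen;
}
```

The returned entries would then be the candidates to register as peers; robots that drop out of the list on a later pass would need to be removed again.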

nielsnl68 commented 9 months ago

Good. In your mesh network, how do you plan to prevent packets from looping between robots? That is something I haven't figured out myself. @lhespress do you know how esp-mesh handles this?
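
One common way to stop relayed packets from looping is to tag each packet with an origin ID, a sequence number, and a hop limit, and to drop anything already seen. The sketch below illustrates that general technique only. It is not how esp-mesh works, and `RelayHeader` and `RelayFilter` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <set>
#include <utility>

// Hypothetical header carried by every relayed packet.
struct RelayHeader {
    uint32_t origin_id;  // ID of the robot that created the packet
    uint16_t seq;        // per-origin sequence number
    uint8_t ttl;         // remaining hops
};

class RelayFilter {
public:
    explicit RelayFilter(std::size_t capacity = 64) : capacity_(capacity) {}

    // Returns true if the packet should be re-broadcast.
    // Decrements h.ttl when it returns true.
    bool should_forward(RelayHeader& h) {
        Key key = std::make_pair(h.origin_id, h.seq);
        if (seen_.count(key)) return false;  // already handled: breaks the loop
        remember(key);
        if (h.ttl == 0) return false;        // hop budget exhausted
        --h.ttl;
        return true;
    }

private:
    using Key = std::pair<uint32_t, uint16_t>;

    void remember(const Key& key) {
        seen_.insert(key);
        order_.push_back(key);
        if (order_.size() > capacity_) {     // evict the oldest entry
            seen_.erase(order_.front());
            order_.pop_front();
        }
    }

    std::size_t capacity_;
    std::set<Key> seen_;
    std::deque<Key> order_;
};
```

The cache is bounded so memory stays fixed on a small device; the capacity would need to be large enough to cover the packets still in flight.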

kingsimba commented 9 months ago

Sadly, after further investigation, it seems that keeping a peer list has no advantage over simple broadcasting.

Here is how I conducted the test:

To me, it appears that esp_now_add_peer serves only as a simple filter, providing no added value in my specific use case.

bhuvanchandra commented 9 months ago

@kingsimba

The limitation could be signal strength, channel congestion, or both. Have you tried different TX power levels and different channels?

At the moment I'm also benchmarking ESP-NOW transfers with a full payload at 100 Hz. I do see some packet loss but have yet to quantify it. I have observed that the ESP-NOW TX callbacks are sometimes not called at the expected period. I currently synchronize the ESP-NOW send with the task that publishes the data from the queue, and at times the ESP-NOW callbacks are not called at all. As a workaround, I added a 50 ms timeout on the lock before giving up on the TX. The behavior is not consistent, though: it changes when I try different channels. I'm not yet sure whether an ESP-NOW send is retried if the packet is not sent properly (from what I recall it is not, but I'm not 100% certain).
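
One way to quantify the loss on the receiving side is to count gaps in a per-sender sequence number. This is a sketch under the assumption that the sender places a 16-bit counter in each payload and increments it by one per packet; `LossCounter` is a hypothetical helper, not an ESP-NOW API.

```cpp
#include <cstdint>

// Tracks loss for one sender, assuming every packet carries a 16-bit
// sequence number that the sender increments by one per packet.
class LossCounter {
public:
    void on_packet(uint16_t seq) {
        if (started_) {
            // Unsigned subtraction handles wrap-around from 65535 to 0.
            uint16_t gap = static_cast<uint16_t>(seq - last_seq_);
            // Duplicates and late (reordered) packets are ignored; a late
            // packet therefore stays counted as lost.
            if (gap == 0 || gap > 0x8000) return;
            expected_ += gap;
        } else {
            started_ = true;
            expected_ = 1;
        }
        last_seq_ = seq;
        ++received_;
    }

    uint32_t received() const { return received_; }
    uint32_t lost() const { return expected_ - received_; }
    double loss_ratio() const {
        return expected_ ? static_cast<double>(lost()) / expected_ : 0.0;
    }

private:
    bool started_ = false;
    uint16_t last_seq_ = 0;
    uint32_t expected_ = 0;  // packets the sender must have sent so far
    uint32_t received_ = 0;  // packets actually counted
};
```

Keeping one counter per sender and per channel would make it possible to compare channels directly instead of relying on the TX callback timing.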

I'm planning to implement a dynamic channel selection (DCS) strategy: another ESP device constantly monitors channel congestion and broadcasts the ideal channel on all channels, and the devices listening to this broadcast switch to that channel. There could be edge cases, such as the broadcast message not being received by all devices. There may be ways to overcome this by monitoring each device's channel.
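
The decision step of such a monitor could look roughly like the sketch below. It assumes the monitor already has a congestion score per channel (how that score is measured is out of scope here) and adds a hysteresis threshold so the fleet is not asked to switch for marginal gains. `pick_channel`, `kNumChannels`, and `min_gain` are hypothetical names and values.

```cpp
#include <array>
#include <cstddef>

// 2.4 GHz channels 1..13; the usable range depends on the region.
constexpr std::size_t kNumChannels = 13;

// busy[i] is a congestion score for channel i + 1 (lower is better).
// Precondition: current_channel is in the range 1..kNumChannels.
int pick_channel(const std::array<float, kNumChannels>& busy,
                 int current_channel, float min_gain = 0.10f) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < kNumChannels; ++i) {
        if (busy[i] < busy[best]) best = i;
    }
    float current = busy[static_cast<std::size_t>(current_channel - 1)];
    // Hysteresis: stay put unless the improvement justifies a switch.
    if (current - busy[best] < min_gain) return current_channel;
    return static_cast<int>(best) + 1;
}
```

Because every switch risks leaving some devices behind on the old channel, a conservative `min_gain` keeps the number of switches low.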

That said, ideally the frequency hopping feature could be the solution. I see it in the TODO list in the README, but I'm not sure when it will be picked up.

kingsimba commented 9 months ago

@bhuvanchandra, we called setTxPower with WIFI_POWER_MINUS_1dBm. I'm not sure which option represents the maximum TX power.

WiFi.mode(WIFI_STA);
WiFi.setTxPower(WIFI_POWER_MINUS_1dBm);

How do you broadcast ideal-channel on all channels? By calling (esp_wifi_set_channel)? I think switching channel may be a slow operation. And different parts of the site may have different ideal-channels.

BTW, I know of a company that, in a successful project covering a very large area (several square kilometers), used self-deployed 5G stations to relay real-time 720p video to a central server. So, if budget is not a problem, I think a 4G/5G booster is a better solution.

bhuvanchandra commented 9 months ago

@kingsimba

> we called setTxPower with WIFI_POWER_MINUS_1dBm. I'm not sure which option represents the maximum TX power.

-1 dBm is too low, right? Are you setting it low on purpose? Maybe you can try a somewhat higher power.

> How do you broadcast ideal-channel on all channels? By calling (esp_wifi_set_channel)?

I chose channel 11. Not on all channels.

> I think switching channel may be a slow operation. And different parts of the site may have different ideal-channels.

Not sure, I haven't yet explored the dynamic switching of the channels at runtime. I think it shouldn't be too slow, but the only thing is that all devices have to be on the same channel to communicate, so making all devices switch the channel at the same time could be tricky.

lhespress commented 9 months ago

@nielsnl68 What do you mean by mesh network? Is it esp-mesh-lite?

nielsnl68 commented 9 months ago

> @nielsnl68 What do you mean by mesh network? Is it esp-mesh-lite?

Yes, for example esp-mesh-lite or ESP-WIFI-MESH, as it is now named. Would something like that work over ESP-NOW?

kingsimba commented 9 months ago

Thank you for pointing out esp-mesh-lite as a potential solution. I read its documentation but am struggling to understand a few things:

  1. What will happen if there is no external network or router? Will a root node still be selected, and can nodes communicate with other nodes using UDP packets? In my use case, I just want robots to talk to each other.
  2. What will happen if all nodes have the same configuration, but one group of nodes is far away from another group? Will they form two trees (TreeA and TreeB)?
A(root) <-- B <-- C                           O --> P --> Q(root)

If a node N (robot) moves between the two groups, will it connect to TreeA or TreeB automatically? Are the following all possible results?

A(root) <-- B <-- C <----- N              O --> P --> Q(root)
A(root) <-- B <-- C <------- N <--------- O <-- P <-- Q
      A --> B --> C ---------> N -------> O --> P --> Q(root)
A(root) <-- B <-- C              N -----> O --> P --> Q(root)

Overall, I think esp-mesh-lite is overly complex and may not be the best solution for robot-to-robot communication.

kingsimba commented 9 months ago

Progress update:

We successfully ran 21 robots on the same site, and the communication quality is acceptable.


Every robot broadcasts ESP-NOW packets at 2 Hz. Observed from one of the robots, the receive rates are:

PEER_SN,         TIME,                       RATE (Hz)
2183308802241A9, 2023-10-09T16:22:20.923342, 2.0
2183309802293AZ, 2023-10-09T16:22:21.104580, 2.0
2183309802296B2, 2023-10-09T16:22:20.723776, 2.0
2183309802297B3, 2023-10-09T16:22:20.663290, 1.7
2183309802298B4, 2023-10-09T16:22:21.134507, 2.0
2183309802302B8, 2023-10-09T16:22:20.963699, 2.0
2383308702214zI, 2023-10-09T16:22:20.653701, 1.3
2383308702221zP, 2023-10-09T16:22:20.753078, 1.7
2383309702286AS, 2023-10-09T16:22:20.993066, 2.0
2383309702287AT, 2023-10-09T16:22:20.873013, 1.7
2383309702288AU, 2023-10-09T16:22:20.793163, 2.0
2383309702289AV, 2023-10-09T16:22:21.043088, 1.7
2383309702290AW, 2023-10-09T16:22:20.653419, 1.3
2383309702291AX, 2023-10-09T16:22:20.803450, 2.0
2383309702292AY, 2023-10-09T16:22:20.735580, 2.0
8881307202088xG, 2023-10-09T16:22:20.724328, 2.0
8881307202102xU, 2023-10-09T16:22:20.903126, 2.0
8882305401962vE, 2023-10-09T16:22:20.623103, 1.7
8981307a02180za, 2023-10-09T16:22:20.772997, 2.0
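
The thread does not say how RATE was computed. As a sketch of one possibility, the rate per peer can be taken as the packet count inside a sliding window divided by the window length. The values 1.3, 1.7, and 2.0 would be consistent with 4, 5, and 6 packets in a 3 s window, but that window length is an assumption, and `RateTracker` is a hypothetical helper.

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <string>

class RateTracker {
public:
    // window_ms = 3000 is an assumed value, not taken from the thread.
    explicit RateTracker(int64_t window_ms = 3000) : window_ms_(window_ms) {}

    // Records one received packet from peer_sn at time now_ms.
    void on_packet(const std::string& peer_sn, int64_t now_ms) {
        arrivals_[peer_sn].push_back(now_ms);
    }

    // Packets per second from peer_sn over the window ending at now_ms.
    double rate_hz(const std::string& peer_sn, int64_t now_ms) {
        auto it = arrivals_.find(peer_sn);
        if (it == arrivals_.end()) return 0.0;
        std::deque<int64_t>& q = it->second;
        // Drop arrivals that have fallen out of the window.
        while (!q.empty() && q.front() <= now_ms - window_ms_) q.pop_front();
        return static_cast<double>(q.size()) * 1000.0 /
               static_cast<double>(window_ms_);
    }

private:
    int64_t window_ms_;
    std::map<std::string, std::deque<int64_t>> arrivals_;
};
```

With a 2 Hz sender, a full window holds six packets, so each missed packet lowers the reported rate by one third of a hertz.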

nielsnl68 commented 9 months ago

Is the above working with esp-mesh-lite or esp-now?

lhespress commented 9 months ago

@nielsnl68 It would be better to post a placeholder request for open discussion in the esp-mesh-lite repository.

kingsimba commented 9 months ago

> Is the above working with esp-mesh-lite or esp-now?

It's with ESP-NOW broadcast packets (without adding peers).

lhespress commented 5 months ago

@kingsimba @nielsnl68 Closing this issue since there has been no update on this. Please feel free to reopen if required.