bitcraze / aideck-gap8-examples

Examples on how to use the GAP8 on the AI-deck

Wifi image streaming hangs, when executed multiple times. #106

Open flrnhbr1 opened 1 year ago

flrnhbr1 commented 1 year ago

I am running the Crazyflie and the AI deck on the latest firmware and am trying to use image streaming in an application.

When I first connect my computer to the AI deck's Wi-Fi and start opencv-viewer.py, everything works fine and I can access the streamed images. However, when I do this a second or third time (that is, stop the Python script and start it again), it hangs at the following step:

packetInfoRaw = rx_bytes(4)

In rx_bytes, it looks like nothing is received over the socket, and the loop

    while len(data) < size:
        data.extend(client_socket.recv(size - len(data)))

just runs forever.
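As an illustration of how the viewer side could fail fast instead of hanging, here is a minimal sketch of a more defensive rx_bytes. It is not the code from opencv-viewer.py: the `(size, client_socket)` argument order, the `timeout_s` parameter, and the choice of `ConnectionError` are assumptions made for this example. It covers two cases on the PC side only: `recv()` returning an empty result because the peer closed the connection (which would make the original loop spin forever), and no data arriving at all. It does not address a freeze on the AI deck itself.

```python
import socket


def rx_bytes(size, client_socket, timeout_s=2.0):
    """Read exactly `size` bytes, or raise instead of looping forever.

    Hypothetical variant for illustration; `timeout_s` is an assumed parameter.
    """
    # With a timeout set, recv() raises socket.timeout when no data arrives in time.
    client_socket.settimeout(timeout_s)
    data = bytearray()
    while len(data) < size:
        chunk = client_socket.recv(size - len(data))
        if not chunk:
            # recv() returns b"" once the peer has closed the connection.
            # Without this check the loop would never terminate.
            raise ConnectionError(
                f"socket closed after {len(data)} of {size} bytes"
            )
        data.extend(chunk)
    return data
```

The caller can then catch `socket.timeout` or `ConnectionError`, close the socket, and reconnect rather than waiting indefinitely.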

When I power off the Crazyflie, restart it, and reconnect to the Wi-Fi, it works again, but only one or two times before the same problem occurs.

I tried another AI deck, but this didn't change the behavior.

I flashed the AI deck with the streamerMode = JPEG_ENCODING option.

Do you maybe know what the problem is here?

Thank you very much in advance!

whoenig commented 1 year ago

I often encounter this problem as well, independent of the chosen encoding. In my (limited) testing, this seems to be related to the camera itself freezing, in which case the capture task simply never finishes (the official GAP8 SDK driver for the camera has infinite loops rather than timeouts in several places).

flrnhbr1 commented 1 year ago

OK, at least it seems not to be a problem with my setup. Did you find any solution or workaround to improve the behavior?

It really frustrates me to always restart and reconnect.

whoenig commented 1 year ago

On the hardware level it seems to be related to the power management somehow. So fresh batteries help sometimes. But generally, we also go the route of frequent rebooting, unfortunately. It is possible to reboot wirelessly, which eases the pain a little bit.

matthewoots commented 1 year ago

I will link another issue here too: https://github.com/bitcraze/crazyflie-firmware/issues/1205 The Wi-Fi streamer may hang, as shown below:

[cf_opencv_publisher.py-1] [INFO] [1674708122.508129651] [cf_streamer_cf1]: Connecting to socket on 192.168.4.1:5000
[cf_opencv_publisher.py-1] [INFO] [1674708122.516404205] [cf_streamer_cf1]: Socket connected
[cf_opencv_publisher.py-1] Traceback (most recent call last):
[cf_opencv_publisher.py-1]   File "/home/nvidia/ros_workspace/crazyswarm2_ws/install/apriltag_ros/lib/apriltag_ros/cf_opencv_publisher.py", line 133, in <module>
[cf_opencv_publisher.py-1]     main()
[cf_opencv_publisher.py-1]   File "/home/nvidia/ros_workspace/crazyswarm2_ws/install/apriltag_ros/lib/apriltag_ros/cf_opencv_publisher.py", line 117, in main
[cf_opencv_publisher.py-1]     chunk = rx_bytes(length - 2, client_socket)
[cf_opencv_publisher.py-1]   File "/home/nvidia/ros_workspace/crazyswarm2_ws/install/apriltag_ros/lib/apriltag_ros/cf_opencv_publisher.py", line 45, in rx_bytes
[cf_opencv_publisher.py-1]     data.extend(client_socket.recv(size-len(data)))
[cf_opencv_publisher.py-1] socket.timeout: timed out

krichardsson commented 1 year ago

Thanks for the input!

hmllr commented 1 year ago

I am not sure if it helps, but I played a bit with the face detection example over the last few days, and even though I can't really explain all my observations, they led to some workarounds. I noted that:

a) a lower cluster frequency somehow helps the camera not get stuck (75 MHz, or even 50 MHz to be very sure);
b) bright images work all the time, and only dark ones lead to failure (??? I had some fun seeing the direct failure after almost punching the Crazyflie to darken the image, as it was quite a reliable obstacle detection, but yeah, not the intended usage);
c) I get more failures before WiFi is connected;
d) maybe most important, I can recover almost every time by resending the camera start command.

You can find the code for resending the camera command in the face detection pull request I just pushed: https://github.com/bitcraze/aideck-gap8-examples/pull/113

Let us know if this helps you in some way, Hanna

whoenig commented 9 months ago

I tested this today with main/master of everything (nrf, stm, esp, gap8). It seems like the behavior is even worse than before. It starts well and then the connection freezes about 30s later (verified with 2 AI decks that used to work ok). I am typically also connected via the cfclient (so https://github.com/bitcraze/crazyflie-firmware/issues/1205 applies), although I didn't notice any ESP disconnects. Note that I have used streaming+radio connection successfully about a year ago, although with a relatively low fps (see also https://github.com/bitcraze/aideck-gap8-examples/issues/137).

I tried the suggestions from @hmllr, especially the workaround for start/stopping the camera if needed. In my experiments that didn't help much. The additional prints helped me to narrow it down to the GAP8 being stuck at the cpxSendPacketBlocking call. Unfortunately, a non-blocking version is not implemented, see https://github.com/bitcraze/aideck-gap8-examples/blob/master/lib/cpx/src/cpx.c#L97-L99.

It would make sense to run these long/infinite flight tests with an AI deck streaming data to get this very basic feature reasonably reliable.

knmcguire commented 9 months ago

I've added this to the triage meeting list. We are too busy this week, but we hope to have an opportunity to discuss this in 3 weeks.

wdliu356 commented 2 weeks ago

Is there any update? I also found the camera stuck at data.extend(client_socket.recv(size - len(data))), especially on the Bolt when it is connected to ESCs and motors.

knmcguire commented 2 weeks ago

No update, I'm afraid. We couldn't find a fix for this, but it is interesting to see that the problem is worse on the Bolt. Perhaps the deck is not getting enough power in that setup...