Open bruceadams opened 11 years ago
Hmm. Have you already fixed this with 22721edef7ee2ef30d5c3f9e73980826adac5add?
Maybe. The trouble is that I can't easily simulate the noisy WiFi environment. I only really saw it at Lambda Jam in the hotel venue. I had bumped the timeouts during the conf and it seemed to help some, but I didn't try the reconnects. If you have any ideas, let me know.
The issue at Steel City Ruby didn't seem to be the data stream freezing, because the data stream continued to update on the terminal. The problem seemed to be that the drone was not recognizing the video target, and I'm not sure why that was happening.
Oh wait. Just had a thought as I was typing.
The drone camera was pointed toward the audience with an open set of windows in the background. It might be possible that the glare from the windows was preventing the drone from seeing the target. I've seen that happen at the node copter event in Edinburgh.
@jimweirich Ooh, yes, I can easily imagine daylight coming through those huge windows would be a problem for the camera.
I was thinking of a different problem. For a little while, you had the screen refreshing with navigation data, then it stopped updating. You kicked it (metaphorically speaking) to get the data refresh going again. That looked similar to my (vague) understanding of what @gigasquid was fighting at Lambda Jam.
I'm back to looking at this. In my house (city location, moderately noisy WiFi environment), I find that receiving nav data often stops after about two seconds (both on an older checkout and on current master). I'm trying to figure out how to learn more about what is happening. Did you see this issue during #cincycopter yesterday? (By the way, #cincycopter sounded like a blast!)
Yes, I saw it at cincycopter. I was just smoke testing a new version of Argus, so I didn't investigate further. So both the Clojure and Ruby libraries are having problems here.
Bruce, What code are you running that is freezing? I agree that there is still a problem here. Most of the time when I see it, it manifests itself in a SocketTimeout exception when I am trying to read the navigation data.
I've been running a slight variant of your nav_test with more debug output.
Also, I added time stamps to log messages, see pull request #4. (I'm barely able to read a log file without time stamps in it.)
The other thing I've been fiddling with, which may be causing spurious failures, is removing the call to communication-check. It's past flight time in my household (quiet hours), so I can't double check that right now.
Ah yes - you do need the communication check in there. If you look at the navigation data - there are two settings that get sent back
:communication :ok, :com-watchdog :ok
If you don't send a command to the drone in a certain amount of time, then the com-watchdog will be reported as a problem. If the problem is uncorrected, the communication will drop. The communication check sends a com watchdog reset if there is a problem and keeps the communication going.
I had a bug in the code earlier when I was working with multi-drones: it didn't send the communication watchdog reset properly, and I saw the same thing. Current master should have the problem corrected.
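To make the idea concrete, here is a minimal sketch of that kind of check. It is not the library's actual communication-check: the name `communication-check-sketch` and the `send-command!` argument are hypothetical, and it assumes a watchdog problem shows up in the nav data as `:com-watchdog :problem`.

```clojure
;; Hypothetical sketch, not the library's code. `send-command!` is a
;; stand-in function supplied by the caller.
(defn communication-check-sketch
  "Calls (send-command! :reset-watchdog) when nav-data reports a watchdog
  problem. Returns true if a reset was sent, false otherwise."
  [nav-data send-command!]
  (if (= :problem (:com-watchdog nav-data))
    (do (send-command! :reset-watchdog)
        true)
    false))

;; Usage with a stub that only records what would have been sent.
(def sent (atom []))

(communication-check-sketch {:communication :ok :com-watchdog :problem}
                            #(swap! sent conj %))
(communication-check-sketch {:communication :ok :com-watchdog :ok}
                            #(swap! sent conj %))

@sent ;; => [:reset-watchdog]
```

Passing the send function in as an argument keeps the check easy to try out without a drone attached.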
Oops. Yes. Turning the communications reset back on corrects the problem I was seeing. Sorry for my false diagnosis.
There is a related thing I've been struggling to understand.
I was confused into doubting that the :reset-watchdog command was having any impact.
Looking harder at the navigation data, I can see that the "Watchdog Reset" messages stop for about 0.2 seconds after each command is sent to the drone, well, except for the :reset-watchdog command itself.
I gather this library does not maintain a steady stream of outgoing communications to the drone? (I haven't found any code that looks like it does.)
Is there a way to do a simple "takeoff" (a command that takes several seconds for the drone to complete) without triggering the :problem message from the drone?
From the drone docs:
If you want to have a steady stream of commands without using the watchdog reset, you can try using drone-do-for:
(drone-do-for seconds :take-off)
It repeats a command every 30 ms for as long as you define.
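Roughly, the idea is a loop like the one below. This is only a sketch of the repeat-until-deadline pattern, not how drone-do-for is actually implemented; `repeat-command-for` and `send-command!` are hypothetical names, and the 30 ms interval is passed in as a parameter.

```clojure
;; Hypothetical sketch of "repeat a command at a fixed interval for a while".
(defn repeat-command-for
  "Calls (send-command! command) about every interval-ms milliseconds until
  roughly `seconds` seconds have passed. Returns the number of sends."
  [seconds interval-ms command send-command!]
  (let [deadline (+ (System/currentTimeMillis) (long (* 1000 seconds)))]
    (loop [sends 0]
      (if (< (System/currentTimeMillis) deadline)
        (do (send-command! command)
            (Thread/sleep (long interval-ms))
            (recur (inc sends)))
        sends))))

;; Usage with a stub sender: repeat :take-off every 30 ms for 0.2 seconds.
(def repeated (atom []))
(repeat-command-for 0.2 30 :take-off #(swap! repeated conj %))
```

Because something goes out every 30 ms, the drone should never wait long enough for the watchdog to complain while the command is in progress.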
Thank you.
I learned from Jim's presentation that the drone wants periodic (and fast) communication. In learning this code, I didn't understand the watchdog reset as a normal thing, especially when I saw it happening every 0.01 seconds. It looked like failure recovery, leading me to struggle to figure out what normal was.
I had thought about changing out the communication model to something else - I don't know if it would help or not.
Right now it is just using plain old Java sockets. I could use the java.nio.channels stuff or even use aleph, which is built on top of it.
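Staying with plain sockets, one option for the SocketTimeout case is to retry the read a few times instead of giving up. Below is a minimal sketch, assuming the nav data arrives as UDP datagrams; `receive-with-retry` and `on-timeout!` are hypothetical names and nothing here is taken from the library.

```clojure
(import '(java.net DatagramSocket DatagramPacket InetAddress
                   SocketTimeoutException))

;; Hypothetical sketch: read one datagram, retrying when the read times out.
(defn receive-with-retry
  "Tries up to max-attempts times to receive one datagram on socket, waiting
  timeout-ms each time. Calls (on-timeout!) before each retry, for example
  to ask for nav data again. Returns the received bytes, or nil if every
  attempt timed out."
  [^DatagramSocket socket timeout-ms max-attempts on-timeout!]
  (.setSoTimeout socket (int timeout-ms))
  (let [^bytes buffer (byte-array 2048)
        packet (DatagramPacket. buffer (alength buffer))]
    (loop [attempt 1]
      ;; recur is not allowed inside try, so the try only produces a value.
      (let [result (try
                     (.receive socket packet)
                     (java.util.Arrays/copyOf buffer (int (.getLength packet)))
                     (catch SocketTimeoutException _ ::timeout))]
        (cond
          (not= result ::timeout) result
          (< attempt max-attempts) (do (on-timeout!)
                                       (recur (inc attempt)))
          :else nil)))))

;; Usage against a local socket nobody sends to: every attempt times out,
;; so this returns nil after three attempts.
(with-open [socket (DatagramSocket. 0 (InetAddress/getLoopbackAddress))]
  (receive-with-retry socket 50 3 #(println "timed out, retrying")))
```

Whether this would help in a noisy WiFi environment is untested; it only shows where a retry or reconnect hook could go.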
First, I want to say again how much I enjoyed your presentation at LambdaJam.
One of the challenges you faced at LambdaJam was a noisy WiFi environment, which (if I understood what you were saying correctly) caused the collection of navigation data to freeze. Just a couple days ago, I saw @jimweirich have a similar problem during his presentation at Steel City Ruby.
In large part inspired by your LambdaJam talk, I now have an AR Drone and have been experimenting with using this library. I'd like to figure out what is happening and then see if I can enhance the library to tolerate glitches.
Do you or @jimweirich have any insights into what is happening or how to better handle whatever is happening?