RizkiWahyupratama / ardupilot-mega

Automatically exported from code.google.com/p/ardupilot-mega

difficult to connect to UDP - unless connected to USB first (a strange bug) #526

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
This is VERY strange - most likely a bug of some sort.
I am using a Dronecell to make a UDP connection to APM Planner.

The connection fails while getting parameters: it starts very fast, then suddenly 
gets very slow, then times out - see attachments.

However - if I connect using the APM's USB port, then disconnect,
THEN connect over UDP - it will connect!! Very strange.

Another strange thing. If I reboot APM (not Dronecell) - while connected to 
Mission Planner - it'll work just fine after rebooting.  ("proving" USB 
connection does not "enable" some parameters/values to be transferred.)

Please look into this problem - I would gladly test/send my UDP traffic to you 
to help you troubleshoot.

Original issue reported on code.google.com by andre.kj...@gmail.com on 25 Feb 2012 at 1:16

GoogleCodeExporter commented 8 years ago
Forgot to mention: ArduPlane 2.28, APM Planner 1.1.42

Original comment by andre.kj...@gmail.com on 25 Feb 2012 at 1:20

GoogleCodeExporter commented 8 years ago
Can you post the tlog of the connection? It looks like it missed one parameter.

Original comment by Meee...@gmail.com on 25 Feb 2012 at 11:21

GoogleCodeExporter commented 8 years ago
Please see the attached logs.
Since it failed at different steps (different values), you may want more than 
one or two logs. The screenshots should provide a pointer as to what time to 
look for.

Thank you.

Original comment by andre.kj...@gmail.com on 26 Feb 2012 at 9:26

GoogleCodeExporter commented 8 years ago
I looked at the logs and can't tell the difference between them. Which one was 
UDP, and which one was USB?

Original comment by Meee...@gmail.com on 29 Feb 2012 at 6:24

GoogleCodeExporter commented 8 years ago
Hi, if you see multiple failed connections in a row, USB would be the first 
success and UDP would then be the next successful connection.
Please say so if you wish to make a TCP connection TO it, or get UDP data sent 
to you. If our timezones are not too different, that might be the best way of 
finding the problem?

Original comment by andre.kj...@gmail.com on 29 Feb 2012 at 8:07

GoogleCodeExporter commented 8 years ago
Please see 4 attached logs - *all* are UDP only - and all slow down at 
SR0_EXTRA3, then may get a few more values before failing.

Original comment by andre.kj...@gmail.com on 29 Feb 2012 at 6:34

GoogleCodeExporter commented 8 years ago
Can I also get the planner log? It should be in the planner directory, called 
ardupilotmegaplanner.log

Original comment by Meee...@gmail.com on 1 Mar 2012 at 1:17

GoogleCodeExporter commented 8 years ago
attached. thank you.

Original comment by andre.kj...@gmail.com on 1 Mar 2012 at 1:28

GoogleCodeExporter commented 8 years ago
BTW: I see from the log that a lot of values are requested at 10 Hz. I do not 
know the data amount, but maybe you think the GCS is saturating the GPRS/EDGE 
link?

Anyway: the same link worked reliably after connecting to USB first, and for 
the next 30 minutes in the air.

I do not know if there's any chance of overloading GPRS with 57600 - it depends 
on how many slots are used, I guess - but that does not explain why it fails to 
connect. Even with a huge amount of lost packets, some information should get 
there.

Original comment by andre.kj...@gmail.com on 1 Mar 2012 at 1:37
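
The saturation question above comes down to simple arithmetic, sketched here in Python. The 8N1 serial framing (10 bits on the wire per byte) is an assumption about the port settings, and the 40 kbit/s link capacity is a purely hypothetical figure, not a measurement of this GPRS/EDGE link. The function names are made up for illustration.

```python
def serial_payload_bytes_per_s(baud, bits_per_byte=10):
    """Upper bound on payload bytes/s for an async serial port.

    Assumes 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per byte.
    """
    return baud / bits_per_byte


def link_utilisation(baud, link_kbit_per_s, bits_per_byte=10):
    """Fraction of the radio link needed if the serial port were fully busy."""
    payload_bits_per_s = serial_payload_bytes_per_s(baud, bits_per_byte) * 8
    return payload_bits_per_s / (link_kbit_per_s * 1000)


if __name__ == "__main__":
    print(serial_payload_bytes_per_s(57600))              # bytes per second
    # 40 kbit/s is a hypothetical uplink capacity, for illustration only.
    print(link_utilisation(57600, link_kbit_per_s=40.0))  # > 1.0 means overload
```

A result above 1.0 only says that a port kept busy at 57600 could outrun such a link. Whether the telemetry streams actually keep the port that busy depends on the message sizes and rates, which this thread does not state.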

GoogleCodeExporter commented 8 years ago
In config > planner > stream rates you can change it from 10 Hz to whatever 
you want.

Looking at the planner log file, it could be link flooding, as it connects 
OK, then appears to stop after requesting the streams at 10 Hz. So try changing 
all streams to 1 Hz to test.

Original comment by Meee...@gmail.com on 1 Mar 2012 at 10:54

GoogleCodeExporter commented 8 years ago
Will test. 
The only thing that makes absolutely no sense is that it's likely to connect 
after the USB link was up (not always, but very likely).

- BTW: return traffic over UDP works too (I've changed values) - but is there 
any extra logic? Does APM send a packet with a change confirmation, or does 
Mission Planner read back the value later to verify that it's been changed?
Is there any chance that a user may, *for example*, add a waypoint but only the 
longitude is changed while the latitude packet is lost? Or are there safeguards 
against such errors?

Original comment by andre.kj...@gmail.com on 2 Mar 2012 at 7:00

GoogleCodeExporter commented 8 years ago
All MAVLink packets are checksummed. Bad packets get dropped.
When a param is sent:
1. planner sends
2. APM responds with param and value.
3. planner stores the returned value for use.

As for your WP question: no, the packet would be dropped if any part of it was 
invalid.

Original comment by Meee...@gmail.com on 2 Mar 2012 at 7:35
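
The "bad packets get dropped" rule described above can be sketched in Python as follows. The CRC routine is the 16-bit X.25-style accumulate that MAVLink uses. The frame layout (body followed by a two-byte CRC, low byte first) and the helper names `make_frame` and `accept_frame` are simplifications made up for illustration. Real MAVLink frames also cover header fields, and MAVLink 1.0 mixes in a per-message extra byte.

```python
import struct


def x25_crc(data, crc=0xFFFF):
    """Accumulate the 16-bit X.25-style CRC over data, starting from 0xFFFF."""
    for byte in data:
        tmp = (byte ^ (crc & 0xFF)) & 0xFF
        tmp = (tmp ^ (tmp << 4)) & 0xFF
        crc = ((crc >> 8) ^ (tmp << 8) ^ (tmp << 3) ^ (tmp >> 4)) & 0xFFFF
    return crc


def make_frame(body):
    """Append the CRC, low byte first (simplified frame, hypothetical helper)."""
    return body + struct.pack("<H", x25_crc(body))


def accept_frame(frame):
    """Return the body if the CRC matches, otherwise None (frame is dropped)."""
    if len(frame) < 2:
        return None
    body = frame[:-2]
    (received_crc,) = struct.unpack("<H", frame[-2:])
    return body if x25_crc(body) == received_crc else None


if __name__ == "__main__":
    good = make_frame(b"THR_MIN")
    corrupted = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit in transit
    print(accept_frame(good))
    print(accept_frame(corrupted))
```

Because the check covers the whole frame, a receiver working this way either accepts every field in a message or none of them, which matches the answer given for the waypoint question.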

GoogleCodeExporter commented 8 years ago
OK - I thought there might be some chance that a packet contains only part of a 
MAVLink command (a UDP packet might contain other MAVLink data when it reaches 
the byte or time limit and gets sent).
So a waypoint change is not in danger of being divided into longitude, latitude 
and altitude, and there's no chance of part of this information being lost 
or corrupted in such a way that only *some* data gets there.

Also - is there any chance, when sending waypoints 1+2, that the data for WP1 is 
lost/corrupted and only the data for WP2 gets there? Would the UAV then go for 
WP2 directly (possibly crashing into a mountain)?

Thanks for clarifying the sanity checks/logic regarding this - I've searched for 
this info for a while, but there was no info on how much additional verifying 
Mission Planner/APM does.

Original comment by andre.kj...@gmail.com on 2 Mar 2012 at 7:57

GoogleCodeExporter commented 8 years ago
I started at 3 Hz, then went to 1 Hz on all telemetry settings.
No success: the connection fails after getting plenty of values, like before.
Please see attached logs.
(None of the logged activity was over USB - everything is UDP.)

Original comment by andre.kj...@gmail.com on 2 Mar 2012 at 5:53

GoogleCodeExporter commented 8 years ago
Michael - I've narrowed down the problem:
It has nothing to do with USB first - but connecting USB resets the APM.

I verified that:
A low update rate does not influence the problem.
I can receive the UDP stream just fine if I connect soon after/on APM boot.
If I disconnect and reconnect, it will fail.
If I press the reset button on the APM and then reconnect, it will connect and 
work fine.

So - how can we pinpoint the difference between a fresh connection and a 
"reconnect"?
(There's no GPS fix in any of these tests.) What else could possibly be 
different?

Original comment by andre.kj...@gmail.com on 2 Mar 2012 at 11:13

GoogleCodeExporter commented 8 years ago
On every log, param 49 gets dropped.
Can you connect via USB and see what it is?

Log:
2012-03-02 18:37:48,376  INFO ArdupilotMega.MAVLink - 376 got param 48 of 172 name: THR_MIN (C:\Users\hog\Documents\Visual Studio 2010\Projects\ArdupilotMega\ArdupilotMega\MAVLink.cs:685) [3]
2012-03-02 18:37:48,798  INFO ArdupilotMega.MAVLink - Mavlink Bad Packet (crc fail) len 5 crc 17999 pkno 0 (C:\Users\hog\Documents\Visual Studio 2010\Projects\ArdupilotMega\ArdupilotMega\MAVLink.cs:2102) [3]
2012-03-02 18:37:48,813  INFO ArdupilotMega.MAVLink - lost 190 pkts 195 (C:\Users\hog\Documents\Visual Studio 2010\Projects\ArdupilotMega\ArdupilotMega\MAVLink.cs:2126) [3]
2012-03-02 18:37:48,813  INFO ArdupilotMega.MAVLink - 813 got param 50 of 172 name: ARSPD_FBW_MAX (C:\Users\hog\Documents\Visual Studio 2010\Projects\ArdupilotMega\ArdupilotMega\MAVLink.cs:685) [3]

Original comment by Meee...@gmail.com on 3 Mar 2012 at 7:24
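
Spotting a gap like the missing param 49 by eye is tedious across several logs. A minimal Python sketch that scans planner log text for skipped indices is shown here. It assumes only the "got param N of M" wording visible in the excerpt above. The function name `missing_params` is made up, and the sample text is abridged from that excerpt with the source paths left out.

```python
import re

PARAM_RE = re.compile(r"got param (\d+) of (\d+)")


def missing_params(log_text):
    """Return the param indices skipped between the first and last index seen."""
    text = " ".join(log_text.split())  # undo hard line wraps in pasted logs
    seen = {int(m.group(1)) for m in PARAM_RE.finditer(text)}
    if not seen:
        return []
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


# Abridged from the log excerpt in this thread (source paths omitted).
SAMPLE = """
2012-03-02 18:37:48,376  INFO ArdupilotMega.MAVLink - 376 got param 48 of 172
name: THR_MIN
2012-03-02 18:37:48,813  INFO ArdupilotMega.MAVLink - lost 190 pkts 195
2012-03-02 18:37:48,813  INFO ArdupilotMega.MAVLink - 813 got param 50 of 172
name: ARSPD_FBW_MAX
"""

if __name__ == "__main__":
    print(missing_params(SAMPLE))  # prints [49]
```

Running the same scan over each attached log would show whether the gap always falls on the same index.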

GoogleCodeExporter commented 8 years ago
I think the APM might be triggering something on your modems when it sends a 
specific byte sequence? It seems very odd.

Original comment by Meee...@gmail.com on 3 Mar 2012 at 7:34

GoogleCodeExporter commented 8 years ago
From the log with USB:

2012-03-03 10:05:56,256  INFO ArdupilotMega.MAVLink - 256 got param 49 of 172 
name: ALT_HOLD_FBWCM (:0) [ProgressReporterDialogue Background thread]

Very odd indeed.
The modem is in transparent mode - it ONLY responds to a one-second pause, then 
+++, then a 500 ms pause.
IF the modem decided that it got 1000 ms, +++, 500 ms, it would exit to command 
mode and nothing more would be sent. I have had flights with it, and a failed 
connection does not require setting up the modem again, only a reset of the APM.
Plus: when failing to connect, the modem continues to transmit as usual - 
another connection attempt will again get most of the values fine.

So we may exclude the option that the modem itself exits transparent mode.

Original comment by andre.kj...@gmail.com on 3 Mar 2012 at 9:13
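
The guard-time argument above can be illustrated with a small Python sketch. It models only the rule as described in this thread (one second of silence, then +++, then 500 ms of silence). It is not taken from the Dronecell firmware, and the function name, the event format and the default timings are assumptions for illustration.

```python
def has_escape(events, pre_guard=1.0, post_guard=0.5, end_time=None):
    """Check a byte stream for a guard-time framed '+++' escape.

    events: list of (time_in_seconds, byte_value) in the order sent.
    Returns True only if three consecutive '+' bytes are preceded by at
    least pre_guard seconds of silence and followed by at least
    post_guard seconds of silence.
    """
    plus = ord("+")
    for i in range(len(events) - 2):
        if any(events[i + k][1] != plus for k in range(3)):
            continue
        t_first, t_last = events[i][0], events[i + 2][0]
        # Assumption: the start of the capture counts as preceding silence.
        quiet_before = i == 0 or t_first - events[i - 1][0] >= pre_guard
        if i + 3 < len(events):
            quiet_after = events[i + 3][0] - t_last >= post_guard
        else:
            # No later byte: only trust the silence if the capture end is known.
            quiet_after = end_time is not None and end_time - t_last >= post_guard
        if quiet_before and quiet_after:
            return True
    return False


if __name__ == "__main__":
    p = ord("+")
    busy = [(0.00, 0x55), (0.01, p), (0.02, p), (0.03, p), (0.04, 0x10)]
    framed = [(0.0, 0x55), (1.5, p), (1.51, p), (1.52, p), (2.2, 0x10)]
    print(has_escape(busy))    # '+++' inside continuous traffic
    print(has_escape(framed))  # '+++' with both guard silences
```

Under this model a "+++" that happens to occur inside continuous telemetry does not qualify, since it has no silence on either side.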

GoogleCodeExporter commented 8 years ago
I don't think I have any other ideas.
1. USB works, so the planner does work.
2. It's the same packet being dropped every time.
3. You have isolated it to APM resets. What if you set the requested rates to 
0 and try connecting?

Original comment by Meee...@gmail.com on 4 Mar 2012 at 12:29

GoogleCodeExporter commented 8 years ago
To set all rates to 0, use the planner, but also set the params SR0_* and 
SR3_* to 0 as well.

Original comment by Meee...@gmail.com on 4 Mar 2012 at 12:57

GoogleCodeExporter commented 8 years ago
I did as you said:
set all rates to 0, then all SR0_* and SR3_* to zero over USB.
Tried to connect using UDP.
It still stops for an extra long time at SR0_EXTRA3 and times out soon after, 
on this or the next values.

Please note that I've been flying and recording long logs, both yesterday 
(attached in case 
http://code.google.com/p/ardupilot-mega/issues/detail?id=537&q=compass) and 
today. The very strange thing is that I can only connect if the APM has been 
recently rebooted, regardless of the telemetry rate settings.
The GSM module is not rebooted at all; in fact, it's up long before the APM, 
because I use another circuit to configure it.

As I see it, there must be a difference between what APM transmits shortly after 
boot and what it transmits later - OR a difference in how APM Planner 
verifies/checks the data while it establishes a new connection (pulls the data 
for the first time) and later on.

Original comment by andre.kj...@gmail.com on 4 Mar 2012 at 12:20

GoogleCodeExporter commented 8 years ago
I've not seen the problem lately - I must assume you have fixed it somehow. 
Using default telemetry rates is no problem.

Original comment by andre.kj...@gmail.com on 24 Mar 2012 at 6:54

GoogleCodeExporter commented 8 years ago

Original comment by Meee...@gmail.com on 4 May 2012 at 6:33