ArduPilot / apm_planner

APM Planner Ground Control Station (Qt)
https://ardupilot.org
Other
512 stars 465 forks source link

Parameter corruption when using "Extended tuning" #1086

Open hsteinhaus opened 6 years ago

hsteinhaus commented 6 years ago

I just experienced a severe parameter corruption while configuring my copter with the "Extended tuning" tab. I did the following steps:

  1. Connected via telemetry (Laptop -> HC06 -> TBS Crossfire -> PixRacer) while copter was landed and close by.
  2. switched to "Extended Tuning" after a minute or so.
  3. seen some zero values in the tab, clicked the "Refresh Values" button. Zeros were replaced by actual values.
  4. made changes to Stabilize P values, but did not touch any of the other fields!
  5. clicked "Write params"
  6. Tried to fly and wondered
  7. downloaded logs and compared params, see below

This is what arrived on the drone (APM log extract):

2017-12-22 13:07:11.65: PARM {TimeUS : 521390301, Name : ATC_ANG_YAW_P, Value : 5.68450021744}
2017-12-22 13:07:11.65: PARM {TimeUS : 521390316, Name : ATC_ANG_YAW_P, Value : 5.68450021744}
2017-12-22 13:07:11.67: PARM {TimeUS : 521409930, Name : ATC_RAT_PIT_D, Value : 0.00300000002608}
2017-12-22 13:07:11.67: PARM {TimeUS : 521409945, Name : ATC_RAT_PIT_D, Value : 0.00300000002608}
2017-12-22 13:07:11.70: PARM {TimeUS : 521437428, Name : ATC_RAT_PIT_I, Value : 0.0799999982119}
2017-12-22 13:07:11.70: PARM {TimeUS : 521437443, Name : ATC_RAT_PIT_I, Value : 0.0799999982119}
2017-12-22 13:07:11.72: PARM {TimeUS : 521457475, Name : ATC_RAT_PIT_IMAX, Value : 0.439999997616}
2017-12-22 13:07:11.72: PARM {TimeUS : 521457491, Name : ATC_RAT_PIT_IMAX, Value : 0.439999997616}
2017-12-22 13:07:11.74: PARM {TimeUS : 521477447, Name : ATC_RAT_PIT_P, Value : 0.0799999982119}
2017-12-22 13:07:11.74: PARM {TimeUS : 521477462, Name : ATC_RAT_PIT_P, Value : 0.0799999982119}
2017-12-22 13:07:11.89: PARM {TimeUS : 521624923, Name : ATC_ANG_YAW_P, Value : 5.68450021744}
2017-12-22 13:07:11.89: PARM {TimeUS : 521624938, Name : ATC_ANG_YAW_P, Value : 5.68450021744}
2017-12-22 13:07:11.91: PARM {TimeUS : 521642173, Name : ATC_RAT_PIT_P, Value : 0.0799999982119}
2017-12-22 13:07:11.91: PARM {TimeUS : 521642192, Name : ATC_RAT_PIT_P, Value : 0.0799999982119}
2017-12-22 13:07:11.93: PARM {TimeUS : 521669891, Name : ATC_RAT_RLL_D, Value : 0.00300000002608}
2017-12-22 13:07:11.93: PARM {TimeUS : 521669906, Name : ATC_RAT_RLL_D, Value : 0.00300000002608}
2017-12-22 13:07:11.95: PARM {TimeUS : 521689936, Name : ATC_RAT_RLL_I, Value : 0.0799999982119}
2017-12-22 13:07:11.95: PARM {TimeUS : 521689951, Name : ATC_RAT_RLL_I, Value : 0.0799999982119}
2017-12-22 13:07:11.97: PARM {TimeUS : 521710081, Name : ATC_RAT_RLL_IMAX, Value : 0.439999997616}
2017-12-22 13:07:11.97: PARM {TimeUS : 521710097, Name : ATC_RAT_RLL_IMAX, Value : 0.439999997616}
2017-12-22 13:07:12.24: PARM {TimeUS : 521979983, Name : ATC_RAT_RLL_I, Value : 0.0799999982119}
2017-12-22 13:07:12.24: PARM {TimeUS : 521979998, Name : ATC_RAT_RLL_I, Value : 0.0799999982119}
2017-12-22 13:07:12.26: PARM {TimeUS : 521997389, Name : ATC_RAT_RLL_IMAX, Value : 0.439999997616}
2017-12-22 13:07:12.26: PARM {TimeUS : 521997405, Name : ATC_RAT_RLL_IMAX, Value : 0.439999997616}
2017-12-22 13:07:12.29: PARM {TimeUS : 522022204, Name : ATC_RAT_RLL_P, Value : 0.0769999995828}
2017-12-22 13:07:12.29: PARM {TimeUS : 522022219, Name : ATC_RAT_RLL_P, Value : 0.0769999995828}
2017-12-22 13:07:12.31: PARM {TimeUS : 522041920, Name : TUNE, Value : 0.0}
2017-12-22 13:07:12.31: PARM {TimeUS : 522041935, Name : TUNE, Value : 0.0}
2017-12-22 13:07:12.33: PARM {TimeUS : 522069784, Name : TUNE_LOW, Value : 0.0}
2017-12-22 13:07:12.33: PARM {TimeUS : 522069799, Name : TUNE_LOW, Value : 0.0}
2017-12-22 13:07:12.58: PARM {TimeUS : 522318043, Name : TUNE_LOW, Value : 0.0}
2017-12-22 13:07:12.58: PARM {TimeUS : 522318058, Name : TUNE_LOW, Value : 0.0}
2017-12-22 13:07:12.60: PARM {TimeUS : 522338721, Name : VEL_XY_P, Value : 0.0}
2017-12-22 13:07:12.60: PARM {TimeUS : 522338736, Name : VEL_XY_P, Value : 0.0}
2017-12-22 13:07:12.63: PARM {TimeUS : 522363110, Name : WPNAV_SPEED, Value : 741.0}
2017-12-22 13:07:12.63: PARM {TimeUS : 522363125, Name : WPNAV_SPEED, Value : 741.0}
2017-12-22 13:07:12.65: PARM {TimeUS : 522383618, Name : WPNAV_SPEED_DN, Value : 0.0}
2017-12-22 13:07:12.65: PARM {TimeUS : 522383633, Name : WPNAV_SPEED_DN, Value : 0.0}

The problem is present for at least a year or two, but I was never able to track it down. Thanks to https://github.com/ArduPilot/ardupilot/issues/3778 APM now logs any param change, which helped to track down this issue.

I consider this problem as extremely severe - damage to property or injury could easily occur when the tab is used during flight.

Arne-W commented 6 years ago

All right - first which firmware version do you have on your flightcontroller and which version of APM-Planner are you using? And just to be sure - the corruption you mean are the params which are set to zero?!

hsteinhaus commented 6 years ago

Yes, I am talking about the zero values on the log shown above.

Currently, I am running tests with ArduCopter 3.6-dev, however the problem persists for more than a year (used versions of ArduCopter: 3.3, 3.4 and 3.5). I have also seen P I and D values set to 0 and had a few minor crashes on takeoff as a result. Unfortunately, ArduCopter did not log param changes when disarmed normally, so I never understood where the zero values came from, up to today. APM-Planner is version 2.0.25 now, but I am pretty sure that this is unrelated and the bug is very very old. Platform is Fedora 26 now on amd64. APM planner has been compiled from source locally.

Is there any good reason why params that have not been edited in the GUI are written back to the drone?

Arne-W commented 6 years ago

Hmm - I will have to look at the code. I did a lot of coding regarding log analysis, but never looked at the code which handles the parameters. At the moment I would say there is NO good reason for writing back all parameters - but if you do not know which param has changed the easiest is to write them all back. Perhaps it is part of the solution to remember which param was modified. We will see.

hsteinhaus commented 6 years ago

I think there are at least three aspects of this problem:

  1. why are there any params shown as zero, requiring a manually triggered re-read (maybe a race condition between form initilization and initial param fetch?)
  2. why is a param zero internally after it has been presented as non-zero (after the manual re-read - another race?)
  3. the unnecessary write-back
Arne-W commented 6 years ago

I did some debugging and could not reproduce point one nor point 2 while connected with an USB-Cable. Even when connected with my sik radio (wireless rs232) I cannot reproduce 1 or 2. Could it be somehow CRSF related? The unnecessary write back only happens on some float / double values due to rounding issues

Arne-W commented 6 years ago

@hsteinhaus Does this problem still exist?? As I am not able to reproduce this issue I will close it in the next days.

hsteinhaus commented 6 years ago

I cannot reproduce 1 or 2. Could it be somehow CRSF related?

Well, CRSF is a rather slow link, so this could definitely affect the timing. However, a unrleiable and slow link is IMHO the standard case when communicating with UAVs during flight, so the implementation should be extra-cautious here to defend against possible race conditions. I therefore think that a USB cable is a very bad test setup here, and some sik radios closeby to each other as well. May be it helps to limit the air data rate of the SiKs to something like 9600 baud or so to simulate some link congestion?

Does this problem still exist??

Sorry, I did not do a single param write with a vehicle in the air since discovering this bug and I am very reluctant to try it again before there is a very good indication that the problem is fixed. I think it is only a question of time until a more vital param (like some PID gains) is affected, which will definitely crash the vehicle in a split second.

Arne-W commented 6 years ago

@hsteinhaus Ok I will do some more testing with a slowed down SiK link - perhaps I can reproduce when reducing transmission speed and power - we will see

Arne-W commented 6 years ago

@hsteinhaus as there were some changes in the extended tuning page and parameter name adaptions it would be cool if you could test if this issue still exists. I was never able to reproduce it. I forgot to mention that those changes are only on current master so you can only test if you can build the planner.

hsteinhaus commented 6 years ago

ok, will have a look. Building master is not a problem.