PX4 / PX4-Autopilot

PX4 Autopilot Software
https://px4.io
BSD 3-Clause "New" or "Revised" License

Attitude Controller hiccup in Manual mode #1956

Closed Zefz closed 9 years ago

Zefz commented 9 years ago

Something weird happens at the end of this flight log. Suddenly there is a gap of 0.5 seconds with no data points and the attitude setpoints flat-line, which caused the copter to flip and crash.

Battery, signal strength and CPU load all seem normal to me. This happened while flying in Manual mode on master code (with a custom navigator app running in the background).

http://dash.oznet.ch/view/XkxhtaYiDkNk4BgupSUMvJ

kd0aij commented 9 years ago

The dropouts in telemetry might not indicate a problem with attitude control.

(Attached plot: roll)

Zefz commented 9 years ago

Do you think this was caused by a hardware fault in one of my motors rather than a software bug?

kd0aij commented 9 years ago

Certainly roll rate is no longer tracking the setpoint. I just noticed that your throttle input goes to 80% and battery current goes to 50A before the loss of control (I think). Edit: this was wrong. The flip occurs prior to this, between 119.0 and 119.5.

(Attached plot: throttle)

Zefz commented 9 years ago

Yes, we were stress testing the attitude controller after tuning the quadcopter. So high throttle input might have provoked the issue.

RomanBapst commented 9 years ago

I think the mixer scales down thrust in such a case.

RomanBapst commented 9 years ago

Actually right now the mixer scales down everything if one motor command is greater than 1. We could also try some new implementations where just the thrust is reduced and the moments are not altered.
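The difference between the two strategies can be sketched as follows. This is a minimal illustration only: the quad-X sign pattern and the function names `mix_quad_x`, `scale_everything` and `reduce_thrust_only` are assumptions made for this example and are not taken from the PX4 mixer code.

```python
def mix_quad_x(roll, pitch, yaw, thrust):
    """Hypothetical quad-X mixer (assumed sign pattern, not PX4's tables)."""
    return [
        thrust - roll + pitch + yaw,
        thrust + roll - pitch + yaw,
        thrust + roll + pitch - yaw,
        thrust - roll - pitch - yaw,
    ]


def scale_everything(cmds):
    """Divide all commands by the peak so the largest one becomes 1.

    This shrinks the differences between motors, so the moments shrink too.
    """
    peak = max(cmds)
    if peak <= 1.0:
        return list(cmds)
    return [c / peak for c in cmds]


def reduce_thrust_only(cmds):
    """Subtract the overshoot from every command.

    Differences between motors (the moments) are kept; only the
    collective thrust drops. The lower bound is not handled here.
    """
    overshoot = max(cmds) - 1.0
    if overshoot <= 0.0:
        return list(cmds)
    return [c - overshoot for c in cmds]


# High thrust plus a roll demand pushes two motors above 1.
commands = mix_quad_x(roll=0.2, pitch=0.0, yaw=0.0, thrust=0.9)
print("raw:        ", commands)
print("scale all:  ", scale_everything(commands))
print("thrust only:", reduce_thrust_only(commands))
```

In this example the roll differential between motors survives only in the thrust-only variant. A real implementation would also have to keep the shifted commands from dropping below zero.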

kd0aij commented 9 years ago

It looks like the high thrust and current values actually occur after impact with the ground, so I was looking in the wrong place. Sorry, I edited my comment above.

kd0aij commented 9 years ago

One hypothesis that might fit is that there was additional control latency coinciding with the dropout in telemetry which resulted in the loss of stability. It looks like rollRate is tracking rollRateSP just prior to the flip:

(Attached plot: logdrop_flip)

blankered commented 9 years ago

Could you possibly post your setpoint attitude script?

Zefz commented 9 years ago

http://dash.oznet.ch/view/XkxhtaYiDkNk4BgupSUMvJ#AS_Roll_PLOT

Zefz commented 9 years ago

@LorenzMeier @kd0aij Sorry, it seems I have lost the postflight performance log. I might have cleared the SD card after copying the flight log, because I cannot find it anymore.

Do we wish to keep this issue open? (i.e. can we still find the source of the problem even though we do not have more info?)

RomanBapst commented 9 years ago

@Zefz Yes, please keep the issue open.

Zefz commented 9 years ago

I was able to reproduce the issue today. While flying in manual, the attitude controller had another hiccup but luckily recovered before it crashed: http://dash.oznet.ch/view/Zt6SwWUsxhGkuwaXn4eAC7

Take a look at Pitch and Roll at about 12:00: http://dash.oznet.ch/view/Zt6SwWUsxhGkuwaXn4eAC7#AS_Roll_PLOT http://dash.oznet.ch/view/Zt6SwWUsxhGkuwaXn4eAC7#AS_Pitch_PLOT

I am wondering if this issue is related to the 3DR Telemetry module that is connected to Telem1 (I've had issues previously where the Pixhawk would hang with the telem module plugged in). Note that this time the bug occurred in POSHOLD mode, so the issue is not exclusive to MANUAL mode.

Zefz commented 9 years ago

Postflight log:

PERFORMANCE COUNTERS POST-FLIGHT

sd write: 11343 events, 0 overruns, 36268824us elapsed, 3197us avg, min 9us max 163316us 7515.371us rms
mc_att_control: 9279 events, 0 overruns, 551416us elapsed, 59us avg, min 46us max 193us 29.287us rms
ekf_att_pos_reset: 0 events
ekf_att_pos_aspd_upd: 0 events, 0us avg, min 0us max 0us 0.000us rms
ekf_att_pos_baro_upd: 5632 events, 14842us avg, min 8744us max 27245us 5704.584us rms
ekf_att_pos_gps_upd: 418 events, 199515us avg, min 171134us max 225321us 9582.207us rms
ekf_att_pos_mag_upd: 9263 events, 9025us avg, min 8705us max 18133us 378.095us rms
ekf_att_pos_gyro_upd: 18560 events, 4505us avg, min 11us max 9275us 4493.473us rms
ekf_att_pos_est_interval: 9280 events, 9010us avg, min 8775us max 9288us 30.102us rms
ekf_att_pos_estimator: 9280 events, 0 overruns, 25802161us elapsed, 2780us avg, min 1193us max 3578us 427.747us rms
mavlink_txe: 0 events
mavlink_el: 15220 events, 0 overruns, 752910us elapsed, 49us avg, min 15us max 3702us 308.969us rms
mavlink_txe: 0 events
mavlink_el: 15129 events, 0 overruns, 1838456us elapsed, 121us avg, min 29us max 100328us 8217.138us rms
io latency: 9281 events, 0 overruns, 46016174us elapsed, 4958us avg, min 1860us max 8096us 1848.972us rms
io write: 0 events, 0 overruns, 0us elapsed, 0us avg, min 0us max 0us 0.000us rms
io update: 9281 events, 0 overruns, 7689442us elapsed, 828us avg, min 260us max 2480us 1002.121us rms
io_badidle  : 0 events
io_idle     : 21675 events
io_uarterrs : 0 events
io_protoerrs: 0 events
io_dmaerrs  : 0 events
io_crcerrs  : 0 events
io_timeouts : 0 events
io_retries  : 0 events
io_dmasetup : 21676 events, 0 overruns, 128843us elapsed, 5us avg, min 3us max 231us 11.870us rms
io_txns     : 21676 events, 0 overruns, 7264813us elapsed, 335us avg, min 135us max 787us 202.692us rms
sensor task update: 20866 events, 0 overruns, 1510124us elapsed, 72us avg, min 45us max 268us 138.730us rms
lsm303d_bad_values: 0 events
lsm303d_bad_registers: 0 events
lsm303d_accel_resched: 8580 events
lsm303d_mag_read: 8366 events, 0 overruns, 167932us elapsed, 20us avg, min 20us max 21us 0.393us rms
lsm303d_accel_read: 74747 events, 0 overruns, 2041738us elapsed, 27us avg, min 8us max 30us 9.844us rms
l3gd20_bad_registers: 0 events
l3gd20_errors: 0 events
l3gd20_reschedules: 20391 events
l3gd20_read: 82277 events, 0 overruns, 2101297us elapsed, 25us avg, min 8us max 32us 14.145us rms
mpu6000_reset_retries: 0 events
mpu6000_good_transfers: 83669 events
mpu6000_bad_registers: 0 events
mpu6000_bad_transfers: 0 events
mpu6000_read: 83671 events, 0 overruns, 4021942us elapsed, 48us avg, min 47us max 50us 0.430us rms
mpu6000_gyro_read: 0 events
mpu6000_accel_read: 0 events
ctrl_latency: 9283 events, 5 overruns, 4495786us elapsed, 484us avg, min 11us max 994us 405.882us rms
sys_latency: 0 events, 0 overruns, 0us elapsed, 0us avg, min 0us max 0us 0.000us rms
mpu6000_reset_retries: 0 events
hmc5883_conf_errors: 0 events
hmc5883_range_errors: 0 events
hmc5883_buffer_overflows: 9285 events
hmc5883_comms_errors: 0 events
hmc5883_read: 9285 events, 0 overruns, 8436556us elapsed, 908us avg, min 870us max 1859us 87.677us rms
adc_samples: 83680 events, 0 overruns, 224714us elapsed, 2us avg, min 2us max 3us 0.690us rms
ms5611_buffer_overflows: 5637 events
ms5611_comms_errors: 0 events
ms5611_measure: 7516 events, 0 overruns, 62508us elapsed, 8us avg, min 6us max 1885us 77.713us rms
ms5611_read: 7516 events, 0 overruns, 670737us elapsed, 89us avg, min 10us max 2914us 291.185us rms
DMA allocations: 1 events

kd0aij commented 9 years ago

At 58.34 seconds from the start of the log, ATT.RollRate and PitchRate begin to disagree with ATT.Roll and Pitch. This plot shows Roll, Pitch and the integrals for RollRate and PitchRate:

(Attached plot: rateanomaly)

Zefz commented 9 years ago

Could this be an estimator issue? Perhaps gyro noise induced by vibrations?

kd0aij commented 9 years ago

Since Roll is computed by integrating RollRate, a discrepancy between Roll and integral(RollRate) would indicate either that RollRate is being reported incorrectly, or the IMU is dizzy. Do you think the Roll values were actually correct? If so, there was a very abrupt change in attitude.

RomanBapst commented 9 years ago

Be careful with Euler angles. Pitch is limited to [-pi/2, pi/2] and roll may be subject to gimbal lock (jumps of ±pi).
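A small sketch of this effect, assuming the common ZYX (yaw-pitch-roll) Euler convention and a vehicle that rotates about its pitch axis only. The helper names are made up for this example and are not part of any PX4 tool.

```python
import math


def quat_about_pitch_axis(angle_rad):
    """Quaternion (w, x, y, z) for a pure rotation about the body y (pitch) axis."""
    return (math.cos(angle_rad / 2.0), 0.0, math.sin(angle_rad / 2.0), 0.0)


def quat_to_euler(q):
    """Convert a unit quaternion to ZYX Euler angles (roll, pitch, yaw) in radians."""
    w, x, y, z = q
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Clamp to guard asin against rounding slightly outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw


# The true motion is a smooth pitch rotation with no roll at all.
for true_pitch_deg in (80, 100):
    q = quat_about_pitch_axis(math.radians(true_pitch_deg))
    roll, pitch, yaw = quat_to_euler(q)
    print(true_pitch_deg,
          round(math.degrees(roll)),
          round(math.degrees(pitch)),
          round(math.degrees(yaw)))
```

Once the rotation passes 90 degrees, the reported pitch folds back below 90 and the reported roll jumps by pi even though the vehicle never rolled, so an integrated roll rate cannot match the logged roll angle across that point.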

RomanBapst commented 9 years ago

If you really want to be sure then you should compare the quaternions with the integration of the quaternion rate.
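One way such a check could look, as a sketch only: it assumes the log provides body rates in rad/s at a fixed sample interval and unit quaternions in (w, x, y, z) order, and all function names are hypothetical.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def integrate_body_rates(q0, rates, dt):
    """Propagate q0 through body rates (p, q, r) in rad/s sampled every dt seconds."""
    q = q0
    for p, qr, r in rates:
        norm = math.sqrt(p * p + qr * qr + r * r)
        if norm > 0.0:
            half = 0.5 * norm * dt
            s = math.sin(half) / norm
            # Body-frame rates: the increment multiplies on the right.
            q = quat_mul(q, (math.cos(half), p * s, qr * s, r * s))
    return q


def angle_between(q1, q2):
    """Rotation angle in radians separating two unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))


# Synthetic data: 1 rad/s pitch rate for one second, sampled at 1 kHz.
rates = [(0.0, 1.0, 0.0)] * 1000
integrated = integrate_body_rates((1.0, 0.0, 0.0, 0.0), rates, 0.001)
logged = (math.cos(0.5), 0.0, math.sin(0.5), 0.0)
print("attitude error [rad]:", angle_between(integrated, logged))
```

Unlike the Euler-angle comparison, the error angle stays continuous through a flip, so a sudden growth in it would point at the rates or the estimator rather than at the angle representation.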

RomanBapst commented 9 years ago

@Zefz In which mode did you crash?

RomanBapst commented 9 years ago

To me the controller looks ok. The quad flipped over the pitch axis and the controller tried to prevent this long before it flipped. I think the problem could be related to the mixer. Some time before the vehicle flipped you applied 0.14 thrust which is very low. The mixer scales the roll and the pitch in case some actuator command is negative. Also you can see that OUT3 is fully saturated for a long time.
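To see why very low thrust leaves little control authority, here is a rough sketch of that kind of scaling. The quad-X sign pattern, the omission of yaw, and the function name are assumptions for illustration and do not reproduce the PX4 mixer.

```python
def mix_with_low_thrust_scaling(roll, pitch, thrust):
    """Hypothetical mixer that shrinks roll/pitch so no motor command goes negative.

    Assumes thrust >= 0 and ignores yaw. Returns (commands, scale), where
    scale is the fraction of the requested roll/pitch that was applied.
    """
    offsets = [-roll + pitch, roll - pitch, roll + pitch, -roll - pitch]
    lowest = min(offsets)
    scale = 1.0
    if thrust + lowest < 0.0:
        # Shrink the moment demand until the weakest motor sits at zero.
        scale = thrust / -lowest
    return [thrust + scale * o for o in offsets], scale


# Thrust value from the log (0.14) with an assumed pitch demand of 0.3.
commands, applied = mix_with_low_thrust_scaling(roll=0.0, pitch=0.3, thrust=0.14)
print("commands:", commands)
print("fraction of requested moment applied:", applied)
```

With these numbers less than half of the requested pitch moment reaches the motors, which is consistent with a controller that asks for a correction but cannot stop the flip.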

kd0aij commented 9 years ago

@tumbili Thanks; if I understand correctly, the roll Euler angle became indeterminate as pitch crossed 90 degrees, and this explains the discrepancy I observed. As long as the controller is using quaternions or the full rotation matrix to represent attitude, this shouldn't be a problem.

RomanBapst commented 9 years ago

@kd0aij Yes, this is correct. BTW I uploaded another script to FlightAnalyzer which plots quads. You should check it out, I think it really helps a lot when analyzing these logs. Right now I think it's plotting at the full resolution so it's not real time any more, but you can change that if you like.

Zefz commented 9 years ago

@tumbili The first log was in MANUAL mode and the second log was in POSHOLD mode. In both logs the quad abruptly flipped (first one crashed, the second one recovered after falling about 10 meters).

Max lean angle is 45 degrees.

LorenzMeier commented 9 years ago

@Zefz It looks like mixer saturation based on @tumbili's comments. We will implement a different approach based on the observations in #1925. I just fixed a major bug in the EKF today which coupled accelerations in a really bad way. I would expect that the attitude estimation performance is significantly improved now.

kd0aij commented 9 years ago

That was indeed a major bug hidden in operator*. Thanks to cat888 for spotting this. Ideally, correct operation of overloaded operators would be verified in unit testing.

LorenzMeier commented 9 years ago

They are verified, it's just that the EKF has its own math lib. It's something I'm working to reconcile.

Zefz commented 9 years ago

It would be nice if we could use Eigen for this :)

LorenzMeier commented 9 years ago

Where do we stand with the unit test lib? I fixed 1-2 things today which were stemming from symbol aliasing in the EKF.

Zefz commented 9 years ago

The test is still failing in the same place we last discussed in #1931. I also think the tests need to be rewritten to verify that the operations produce correct results. I began implementing something but never finished it properly due to lack of time.
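A rough idea of what result-checking tests for an overloaded multiply could look like. This is a Python illustration only: the `Mat3` class is a hypothetical stand-in and is not the EKF math library, which is C++.

```python
import math


class Mat3:
    """Hypothetical 3x3 matrix with an overloaded * for matrix-vector products."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, vec):
        return [sum(self.rows[i][j] * vec[j] for j in range(3)) for i in range(3)]


def rot_z(angle_rad):
    """Rotation matrix about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return Mat3([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def assert_close(actual, expected, tol=1e-9):
    assert all(abs(a - e) <= tol for a, e in zip(actual, expected)), (actual, expected)


def test_identity_leaves_vector_unchanged():
    identity = Mat3([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    assert_close(identity * [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])


def test_quarter_turn_maps_x_to_y():
    assert_close(rot_z(math.pi / 2) * [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])


def test_asymmetric_matrix_catches_transposed_indexing():
    # Multiplying by the x unit vector must return the first column.
    m = Mat3([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    assert_close(m * [1.0, 0.0, 0.0], [1.0, 4.0, 7.0])


test_identity_leaves_vector_unchanged()
test_quarter_turn_maps_x_to_y()
test_asymmetric_matrix_catches_transposed_indexing()
```

The asymmetric-matrix case is the important one: identity and symmetric inputs would still pass if rows and columns were swapped inside the operator.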

LorenzMeier commented 9 years ago

This will be addressed in #1925, closing.