PX4 / PX4-Autopilot

PX4 Autopilot Software
https://px4.io
BSD 3-Clause "New" or "Revised" License

Random thrust when no thrust input given and optical flow enabled (but not sending data) [Bug] #23034

Closed dirksavage88 closed 1 month ago

dirksavage88 commented 6 months ago

Describe the bug

On recent main (early April), when optical flow is enabled with EKF2_OF_CTRL but no optical flow or distance sensor data is provided, the vehicle experiences a surge in thrust while in Position mode. The surge ignores the commanded setpoint and exceeds the thrust setpoint given.

To Reproduce

  1. Enable GPS or VIO as a position source & have valid position fed through sensors
  2. Enable barometer with sys_use_baro, but set primary height to vision or GPS (not baro)
  3. Enable optical flow as position source, disable range aid fusion
  4. Attempt flying in position mode and increase thrust gradually until random thrust occurs
  5. Attempt to switch out of position or kill switch as soon as infinite height drift occurs and thrust set-points are ignored

Expected behavior

Even when optical flow/range sensor data is not being fed into EKF2, the random thrust should not occur and thrust setpoints should not be ignored. I have also observed this behavior when the range sensor and optical flow are being fused but the rangefinder data fails a condition in the estimator checks (bypassed by setting range fusion to enabled instead of conditional).

Screenshot / Media

[Images: ignore_setpoint, random_thrust]

Flight Log

https://logs.px4.io/plot_app?log=ca0d60d8-6dc6-4215-86f4-32fa2ad18f71

Software Version

Main 1.15.0 alpha

Flight controller

VOXL2

Vehicle type

Multicopter

How are the different components wired up (including port information)

No response

Additional context

No response

AlexKlimaj commented 5 months ago

I saw the same thing today.

It looks like the thrust setpoint starts dropping even though I have the throttle at center. Local Z position starts diverging, then everything goes nuts.

https://review.px4.io/plot_app?log=feaa27ad-73c3-4f2e-a5da-950ee5c1920b

dirksavage88 commented 5 months ago

@mrpollo is there a way to get better visibility on this issue to the community and add it to the list of known blockers for the next release?

mrpollo commented 5 months ago

The best way to raise the visibility of an issue is to bring it up for discussion on one of the weekly calls.

I have also added this issue to the PX4 Misc project board.

AlexKlimaj commented 5 months ago

This is a strange one. It looks like the fused height estimate is slowly diverging. I couldn't get it to go into Position mode, and then Altitude mode would just drop: https://review.px4.io/plot_app?log=f086836a-91be-4c4c-a5bc-d702c3b657f6

I think it's related to our thrust bug, but it's less severe when you have GPS. For this one I took off in Position mode; it shot up, then recovered quickly. But I have GPS now. https://review.px4.io/plot_app?log=3f9bfaac-143f-497a-ad2f-167b518f82b9

DronecodeBot commented 5 months ago

This issue has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/px4-sync-q-a-may-8-2024/38626/1

mrpollo commented 5 months ago

Hey @bresch can you please take a quick look here?

MaEtUgR commented 5 months ago

It's not really a random thrust. As long as the vehicle controls the altitude automatically (i.e. it is not in Stabilized/Manual or Acro mode), it controls the thrust based on the velocity error, and in this log the velocity estimate completely diverges. If the controller is told that the vehicle is suddenly falling, it has to give thrust.
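To illustrate the point about the velocity error, here is a minimal sketch. It is not PX4's actual controller, just a proportional vertical-velocity loop around a hover thrust in a z-down (NED) frame. The function name, gain, and hover thrust are made-up values for illustration. With the stick centered (setpoint 0 m/s), a velocity estimate that wrongly reports a fast descent pushes the commanded thrust into saturation.

```python
def thrust_command(vz_setpoint, vz_estimate, hover_thrust=0.5, kp=0.2):
    """Return normalized upward thrust in [0, 1] (hypothetical P controller).

    Velocities are in m/s, z-down (NED): positive vz means descending.
    """
    vz_error = vz_setpoint - vz_estimate   # negative when falling faster than asked
    thrust = hover_thrust - kp * vz_error  # falling -> add upward thrust
    return min(1.0, max(0.0, thrust))      # clamp to the actuator range


# Stick centered: the pilot asks for zero vertical velocity.
for vz_est in (0.0, 1.0, 5.0):  # the estimate "falls" faster and faster
    print(f"vz_estimate={vz_est:+.1f} m/s -> thrust={thrust_command(0.0, vz_est):.2f}")
```

The vehicle does not need to be falling for this to happen. The controller only sees the estimate, so a diverging estimate is enough to produce full thrust.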

AlexKlimaj commented 5 months ago

It looks like a bug with the altitude estimate diverging when baro is set as the main EKF height source.

AlexKlimaj commented 5 months ago

We consistently ran into this today. If you don't try to compensate by increasing thrust on the RC, it doesn't result in 100% thrust. If you do try to compensate on the RC, it causes the huge thrust spike.

One resulted in a flyaway. https://review.px4.io/plot_app?log=f0220dd8-a7e9-4dcf-bc35-241a58fc0ac5

https://review.px4.io/plot_app?log=4d0b3cf9-003d-4370-9e7f-b6e5eec17543

It looks like a spike in the baro causes a spike in dist_bottom.

This drone only has one baro, so why are there two estimator_baro_bias topics, with topic 0 only briefly published at the beginning?

dirksavage88 commented 5 months ago

I have even seen this issue with baro enabled in EKF2 but not as the primary height reference. Upon disabling the barometer I do not see the issue; however, this is not ideal for higher-altitude flights (outside of rangefinder max range, or range aid conditions not met due to velocity) with spotty GPS.

AlexKlimaj commented 5 months ago

GPS + baro + RNG https://review.px4.io/plot_app?log=3169d126-0696-43b2-88af-1fdd32c4a3f0

Video of the bug from this flight https://www.youtube.com/shorts/BBo1AWEbH1M

GPS + RNG. Baro control disabled. https://review.px4.io/plot_app?log=d0f2ea06-3c93-4dc3-92af-be4a000883a7

GPS + Baro. RNG control disabled. Unable to replicate. https://review.px4.io/plot_app?log=3cf0f6a7-ce16-47d2-99fa-4527477a0172 https://review.px4.io/plot_app?log=6c46e2ca-61ee-4f9e-abb0-56871f294752

dakejahl commented 5 months ago

Probably something in this PR? It's really the only thing that's changed and is consistent with the timeline of the bug report https://github.com/PX4/PX4-Autopilot/pull/22770/commits

@bresch A few weird things I see, maybe you have some insight on what's happening

dirksavage88 commented 5 months ago

@dakejahl I believe the issue might have been introduced even before the February timeframe, since I think I remember seeing it in 1.14; however, it would be good to verify.

AlexKlimaj commented 5 months ago

Backported my Pi6X to 1.14 and I can't replicate: https://review.px4.io/plot_app?log=02011abe-6647-4b6f-a7fc-c56f65f796f9

On the release/1.15 branch I am unable to replicate as well: https://review.px4.io/plot_app?log=bd39dfb4-6d18-48ab-b386-09ae3f26f35e

dirksavage88 commented 5 months ago

I haven’t replicated the issue with pure gps + range aid ( no barometer + no optical flow enabled): https://review.px4.io/plot_app?log=370a4be3-5c26-4678-a4e7-40d1c9dd3fa6

This setup has no baro. An ARK Flow is onboard, but I only use the rangefinder for this particular log.

It looks like the rangefinder in combination with another height reference can be problematic (especially baro, to a lesser degree VIO and GPS).

Also, GPS as a height reference seems to make a big difference (maybe less noticeable divergence).

AlexKlimaj commented 5 months ago

So the EKF selector is switching between the two estimators' local positions, causing these big jumps in vehicle_local_position. In main, this results in an immediate change trying to achieve the location, but in release/1.15 it's not as bad.

https://review.px4.io/plot_app?log=8260af89-85f7-454d-b491-ddc022b44719
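One way to check a log for this pattern is to look for samples where the selected estimator instance changes and the published Z steps at the same time. The sketch below runs on made-up data. The tuple layout, the threshold, and the function name are assumptions for illustration, not a PX4 or log-tool API.

```python
def find_switch_jumps(samples, z_jump_threshold=0.5):
    """Return (time, old_instance, new_instance, dz) for suspicious switches.

    samples: list of (time_s, selected_instance, z_m) tuples in time order.
    A switch is flagged when the instance changes and |dz| exceeds the threshold.
    """
    jumps = []
    for prev, cur in zip(samples, samples[1:]):
        t, inst, z = cur
        dz = z - prev[2]
        if inst != prev[1] and abs(dz) > z_jump_threshold:
            jumps.append((t, prev[1], inst, dz))
    return jumps


# Synthetic example: instances 0 and 1 disagree on Z by about 2 m.
samples = [
    (0.0, 0, -1.00),
    (0.1, 0, -1.02),
    (0.2, 1, -3.05),  # selector switches, Z steps down by about 2 m
    (0.3, 1, -3.06),
    (0.4, 0, -1.01),  # switches back, Z steps up by about 2 m
]
for t, old, new, dz in find_switch_jumps(samples):
    print(f"t={t:.1f}s: instance {old} -> {new}, dz={dz:+.2f} m")
```

Every flagged switch is a step in the position the controller is tracking, so how bad it is depends on how much the instances disagree at that moment.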

bresch commented 5 months ago

Maybe we're mixing two issues here, but after checking the first log posted here, I don't see anything strange: the VIO data is diverging and pulling the EKF with it. Since the VIO is "falling", the EKF falls with it and the controller reacts by increasing the thrust. @dirksavage88 In the description you mention that you're not using range data, but you have the VIO height and vertical velocity fused (see EKF2_EV_CTRL).

bresch commented 5 months ago

@AlexKlimaj Do you have the same issue when flying with a single EKF instance?

dirksavage88 commented 5 months ago

> Maybe we're mixing two issues here, but after checking the first log posted here, I don't see anything strange: the VIO data is diverging and pulling the EKF with it. Since the VIO is "falling", the EKF falls with it and the controller reacts by increasing the thrust. @dirksavage88 In the description you mention that you're not using range data, but you have the VIO height and vertical velocity fused (see EKF2_EV_CTRL).

Yes, this is probably a bad example for rangefinder ctrl but highlights that the barometer is useless when VIO “falls”. These height references really need to be reconciled and there needs to be regression testing of height control and optical flow every time EKF code changes are made.

AlexKlimaj commented 5 months ago

It looks like when rangefinder is enabled, the Z and VZ from different estimator instances are wildly different.

[Image: without rangefinder enabled in the EKF]

[Images: with rangefinder enabled in the EKF]

AlexKlimaj commented 5 months ago

Hm, I did increase EKF2_RNG_DELAY to 105 ms based on what I measured. Maybe that explains why the Z estimates of the two EKF instances are diverging.
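A rough way to see why a wrong delay setting could matter: if a measurement is time-aligned with an error of Δt while the vehicle moves vertically at speed v, the height it gets compared against is off by about v·Δt. The sketch below assumes constant vertical velocity. Only the 105 ms value comes from this thread; the "true" delay, the velocities, and the function name are hypothetical.

```python
def height_alignment_error(vz_m_s, configured_delay_ms, true_delay_ms):
    """Approximate height mismatch (m) caused by a wrong sensor delay setting.

    Assumes constant vertical velocity over the delay window.
    """
    delay_error_s = (configured_delay_ms - true_delay_ms) / 1000.0
    return vz_m_s * delay_error_s


# 105 ms is the configured value mentioned above; the 5 ms "true" delay
# and the velocities are hypothetical numbers chosen only for illustration.
for vz in (0.5, 2.0, 5.0):
    err = height_alignment_error(vz, configured_delay_ms=105.0, true_delay_ms=5.0)
    print(f"vz={vz:.1f} m/s -> about {err:.2f} m mismatch")
```

The mismatch grows with vertical speed, so it would show up mostly during climbs and descents rather than in hover.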

AlexKlimaj commented 5 months ago

Reset the baro and rangefinder delay params back to default and it's much better. No EKF instance switches. This is on release/1.15.

https://review.px4.io/plot_app?log=0939ee41-6143-4401-ba77-5da8b82f0f9b

https://review.px4.io/plot_app?log=e5e935be-be4b-49eb-96b2-a98649c1b4f7

AlexKlimaj commented 5 months ago

@dirksavage88 It looks like EKF2_RNG_CTRL is turned off in your log. Try turning it on and resetting the delay and gate params to defaults.

AlexKlimaj commented 4 months ago

https://review.px4.io/plot_app?log=19ee3d98-919c-4fb0-8d83-852f3d354099

It looks like this is still in 1.15, even with default params. The Z estimate goes bad.

dirksavage88 commented 4 months ago

I'm waiting on some 3D-printed parts to retest this with the ARK Flow, but that is sad news to hear. :(

DronecodeBot commented 2 months ago

This issue has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/px4-sync-q-a-aug-21-2024/40268/1

mrpollo commented 1 month ago

Hey everyone, we haven't seen any recent reports of this problem. I'm going to close this issue since it was marked as a release blocker. I'll keep paying attention to community reports to make sure this doesn't come back; if it does, we can re-open.

DronecodeBot commented 1 month ago

This issue has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/px4-sync-q-a-sep-4-2024/40534/1