ArduPilot / ardupilot

ArduPlane, ArduCopter, ArduRover, ArduSub source
http://ardupilot.org/
GNU General Public License v3.0
10.17k stars, 16.74k forks

ArduPilot Failsafe uniformization #3772

Open lvale opened 8 years ago

lvale commented 8 years ago

ArduPilot failsafe document from RFC opened at drone discuss

Proposal for a common method of interpreting failsafe actions between vehicles controlled by ArduPilot.

First, which events that happen on the vehicle should trigger a failsafe event:

Fuel/Battery status

Loss of RC radio link

Loss of GCS communication

Vehicle moving outside predetermined area (fence)

Vehicle moving inside predetermined area (no-fly zone)

Event on the vehicle code that would trigger a MAVLINK_SEVERITY_EMERGENCY message

GPS/EKF failsafe

Terrain following database bad status

Motor failure (see https://github.com/diydrones/ardupilot/issues/3456)

Then, which actions should occur and when, if an event is triggered.

Not all events should imply a flight termination, and some events don’t require an immediate action. Also, some events might occur almost simultaneously and/or sequentially, so the failsafe routine must be able to categorize the actions.

Fuel/Battery status: We monitor the Fuel/Battery and the user should preset appropriate values on which specific actions would be triggered.

So a first level event would be fuel/battery capacity remaining to execute a RTL.

Second level event would be fuel/battery capacity remaining to Land at current position.

Third level event would be complete fuel/battery depletion.
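The three fuel/battery levels above could be sketched roughly as follows. This is only an illustration: the names `BatteryLevel` and `battery_level`, the mAh units, and the two threshold arguments are assumptions made for the example, not existing ArduPilot parameters.

```python
from enum import IntEnum


class BatteryLevel(IntEnum):
    """Hypothetical fuel/battery failsafe levels, ordered by severity."""
    OK = 0
    RTL_RESERVE = 1    # first level: only enough left to execute an RTL
    LAND_RESERVE = 2   # second level: only enough left to land at current position
    DEPLETED = 3       # third level: complete depletion


def battery_level(remaining_mah: float, rtl_reserve_mah: float,
                  land_reserve_mah: float) -> BatteryLevel:
    """Classify remaining capacity against the user-preset thresholds."""
    if rtl_reserve_mah < land_reserve_mah:
        raise ValueError("RTL reserve must be at least the land reserve")
    if remaining_mah <= 0:
        return BatteryLevel.DEPLETED
    if remaining_mah <= land_reserve_mah:
        return BatteryLevel.LAND_RESERVE
    if remaining_mah <= rtl_reserve_mah:
        return BatteryLevel.RTL_RESERVE
    return BatteryLevel.OK
```

With an RTL reserve of 1000 mAh and a land reserve of 300 mAh, a reading of 900 mAh would classify as the first level and a reading of 300 mAh as the second.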

Loss of RC radio link, where we monitor the status of a specific radio channel (usually throttle)

Short term loss of control link

Long term loss of control link

Regaining control link after long term loss of control link

Regaining control link after short term loss of control link

Loss of GCS radio link, where we monitor the Mavlink heartbeat

Short term loss of GCS link

Long term loss of GCS link

Regaining GCS link after long term loss

Regaining GCS link after short term loss

Vehicle moving outside predetermined area (fence)

Vehicle approaching fence Vehicle speed and direction will require an acceleration that would exceed pre-determined max acceleration.

Vehicle breaching fence

Vehicle moving inside predetermined area (no-fly zone)

Vehicle approaching no-fly zone Vehicle speed and direction will require an acceleration that would exceed pre-determined max acceleration.

Vehicle breaching no-fly zone.
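One possible reading of the "approaching" test for fences and no-fly zones, as a minimal sketch: assuming the vehicle moves straight toward the boundary and brakes at a constant rate, the deceleration needed to stop in distance d from closing speed v is v²/(2d), and the event fires once that exceeds the configured maximum. The function names, and the choice of the velocity component toward the boundary as "closing speed", are assumptions for illustration.

```python
import math


def required_decel(closing_speed_ms: float, distance_m: float) -> float:
    """Constant deceleration (m/s^2) needed to stop within distance_m."""
    if closing_speed_ms <= 0:
        return 0.0        # moving away from, or parallel to, the boundary
    if distance_m <= 0:
        return math.inf   # already at or past the boundary
    return closing_speed_ms ** 2 / (2.0 * distance_m)


def approaching_boundary(closing_speed_ms: float, distance_m: float,
                         max_accel_ms2: float) -> bool:
    """True when stopping before the boundary would exceed the max acceleration."""
    return required_decel(closing_speed_ms, distance_m) > max_accel_ms2
```

For example, closing at 10 m/s with 25 m to go needs 2 m/s², which stays under a 2.5 m/s² limit; with only 10 m to go it needs 5 m/s² and the "approaching" event would trigger.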

Event on the vehicle code that would trigger a MAVLINK_SEVERITY_EMERGENCY message

Currently no events trigger this message, but any that did would lead to an immediate flight termination.

Specific SYS_STATUS flag/flags that require a failsafe action (GPS/EKF failsafe?)

Currently on Copter, the actions for EKF failsafe are Land, Alt Hold, or forcible LAND even on Stabilize.

I believe Plane switches to DCM when EKF fails, so it wouldn’t be registered as a failsafe action.

Terrain following database bad status

If the terrain database is not present for the current location. Should follow what Plane does.

Now we must consider appropriate failsafe responses. These responses would be existing or future actions that the failsafe routine invokes. They should be ordered by severity: the least severe would be “ignore event, continue current action” and the most severe would be Immediate Flight Termination.

Ignore, continue current action

Switch to Stabilize, Manual or FBWA mode

Stop movement, enter Loiter or Altitude Hold mode (Circle?)

Go to nearest Rally Point or Home

Go to nearest Rally Point or Home AND Land

Land at current location

Terminate flight. Deploy parachute and landing gear

Terminate flight immediately (Kill switch)

The above actions are prioritised, so if a subsequent event requires a less severe action, that request should be ignored, and if it requires a more severe response, then that response should be obeyed.
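The prioritisation rule can be sketched as a small arbiter that only ever escalates. The enum names below are illustrative labels for the actions listed above, not identifiers from the ArduPilot code.

```python
from enum import IntEnum


class Action(IntEnum):
    """Responses from the list above, least severe first (names are illustrative)."""
    IGNORE = 0
    MANUAL_MODE = 1          # Stabilize, Manual or FBWA
    HOLD = 2                 # Loiter or Altitude Hold
    RETURN = 3               # nearest Rally Point or Home
    RETURN_AND_LAND = 4
    LAND = 5
    TERMINATE_PARACHUTE = 6
    TERMINATE_IMMEDIATE = 7


class FailsafeArbiter:
    """Keeps the most severe action requested so far."""

    def __init__(self) -> None:
        self.current = Action.IGNORE

    def request(self, action: Action) -> Action:
        # A less severe request is ignored; a more severe one replaces
        # the action currently in force.
        if action > self.current:
            self.current = action
        return self.current
```

So if an RC loss has already requested `RETURN`, a later event asking for `HOLD` changes nothing, while one asking for `LAND` takes over.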

Now, what happens if a failsafe event is cleared (regain RC link, GCS link, clear fuel/battery status)

If the current failsafe action causes the vehicle to move (move to nearest Rally Point or Home or higher priority action) then the action should be cancelled and the vehicle should enter the least disruptive action (Stop movement, enter Loiter or Altitude Hold mode (Circle?)).

All higher priority actions are not to be interrupted until completion (i.e. if the action is to “kill” the vehicle, then invoking that action means that there is no way out…)

Just as the Kill switch on Copter today gives the human operator a 5-second “oops window” to revert that action, every failsafe could be overridden within the first x seconds after the event so the operator can resume control (obviously that only applies when there is control via either RC or GCS).
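The override window could be checked with something as small as the sketch below. The function name and arguments are hypothetical, and the "x seconds" from the text is passed in as `window_s`.

```python
def can_override(triggered_at_s: float, now_s: float, window_s: float,
                 link_available: bool) -> bool:
    """True while the operator may still cancel a failsafe action.

    Requires a working RC or GCS link and that no more than window_s
    seconds have passed since the failsafe event was triggered.
    """
    elapsed = now_s - triggered_at_s
    return link_available and 0.0 <= elapsed <= window_s
```

With a 5-second window, a failsafe triggered at t = 100 s can still be overridden at t = 103 s but not at t = 106 s, and never while no control link is available.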

lvale commented 8 years ago

Apparently Plane also has the AFS parameters. I'm not a Plane user, so I don't fully understand them, but I believe they can be integrated into the structure above.

lvale commented 8 years ago

Added a spreadsheet where failsafe events can be linked to appropriate actions that the user could choose. There are 16 possible actions and 16 possible failsafe events.

failsafe events and actions.xlsx: https://docs.google.com/spreadsheets/d/1BniczUablwT73Jl0fjnbxmOwLYKnODYeTTTl542UeYs/edit?usp=sharing

[Image: screen shot 2016-03-20 at 02 13 17]

WickedShell commented 8 years ago

First comment is that the plane/copter handling of these really should be different on some actions. I.e. on Plane you should never just switch into Manual/FBWA/Stabilize unless the user is expecting it (which by definition on a failsafe they won't be) or you have no RC input, in which case it will use FBWA to just glide straight ahead. Circle doesn't use GPS for the position and shouldn't be used automatically (except maybe if GPS is lost?).

Plane doesn't have any support for just trying to land at the current location (parachute is the closest), and setting up a landing requires mission config as well as tuning the aircraft for it (which I'd guess a good percentage of people have never done).

As far as returning to the mission or stopping in place when the failsafe event is over, I think that would really need to be a parameter (In general if you are mid RTL the least disruptive action is to continue the RTL as it is predictable, and a known safe spot).

That said, none of what I listed is an objection to trying to bring common behavior to the platforms, but some things that Copter can do aren't available on Plane, just due to the nature of the vehicles and the setups, so Plane will have to opt out of them.

The only real objection I have at the moment to the proposed list is that if I want to override from the GCS or RC control I should always be able to do that. (Especially on Plane, where by hitting Manual your safety pilot could execute a safe recovery.)

Is the proposal to allow the user to select which actions are enabled per feature? (I.e. on loss of RC link I want to continue the mission if in Auto until the long timeout happens, but if in Manual or FBW etc. I want to go RTL immediately, which btw is not a configuration that can be done at the moment.) Or say that I never want my plane to do the terminate action (except for this one event) and that it should always do an RTL + Land.

The biggest thing AFS offers is some parameter-driven ways to terminate the aircraft if limits are exceeded. (It also offers some GPIO output pins for monitoring the state of the aircraft externally.)

The other thing I'd really like to see out of this is a better failsafe reporting mechanism to the GCS. I want to know why the plane is failsafing (it's not always obvious, especially if you lose the text string), and it's not always clear when the situation has resolved itself. It's also not uncommon for multiple things to hit together when you have a problem, and seeing which failsafes are still active can be hard at the moment. (Ideally this is just a MAVLink packet that is streamed out while mid-failsafe.)
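As a rough idea of what such a report could carry, a bitmask of the currently active failsafes would let the GCS show exactly which ones are still set. The bit assignments below are purely hypothetical and do not correspond to any existing MAVLink message or field.

```python
from enum import IntFlag


class Failsafe(IntFlag):
    """Hypothetical bit assignments; not an existing MAVLink definition."""
    BATTERY = 1
    RC_LINK = 2
    GCS_LINK = 4
    FENCE = 8
    EKF = 16
    TERRAIN = 32


def active_names(mask: int) -> list[str]:
    """Decode a received bitmask into the names of the active failsafes."""
    return [f.name for f in Failsafe if mask & f]
```

A vehicle with both a battery and an RC failsafe active would stream `Failsafe.BATTERY | Failsafe.RC_LINK`, and the mask would drop back to 0 once everything has cleared.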

lvale commented 8 years ago

@WickedShell Like I said above, Plane is a "strange bird" for me :) so the table above has a slight bias towards copter :)

The failsafe actions are generic actions (most, if not all, already exist in the code) that the user would choose. The X marks on the table would be a guide to the actions available when configuring via GCS; each vehicle would have its own defined set and, last but not least, a default set applied.

If you look at the table above the failsafe actions are somewhat ordered from least severe to more severe, and the failsafe events from left to right so that the logic would be easy to grasp for users.

The multi level failsafe table would be used for the short event and long event, so the example you give of loss of RC link could be easily programmed.
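A sketch of how that RC-loss example might look as a lookup table. Keying on the flight mode as well as on the event and its short/long level is an assumption added here to cover the per-mode case; every name in the table is illustrative.

```python
# (event, level, flight mode) -> action. All names are illustrative only.
TABLE = {
    ("rc_loss", "short", "AUTO"): "continue",
    ("rc_loss", "long", "AUTO"): "rtl",
    ("rc_loss", "short", "MANUAL"): "rtl",
    ("rc_loss", "short", "FBWA"): "rtl",
}

# Applied when the user has not configured a specific entry.
DEFAULT_ACTION = "rtl"


def lookup(event: str, level: str, mode: str) -> str:
    """Return the configured action, falling back to the default."""
    return TABLE.get((event, level, mode), DEFAULT_ACTION)
```

Here a short RC loss in Auto continues the mission, the long timeout in Auto triggers RTL, and any combination without an entry falls back to the default.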

The recovery from failsafe events is something that should be based on a common (between vehicles) set and applied to each vehicle type as appropriate.

iskess commented 8 years ago

I have a request to control which geographical direction a plane turns during an RC failsafe event. I need to fly back and forth along a steep cliff face which rises above the aircraft's altitude. I never want to turn into the mountainside. The mountain range isn't always tangential to the home location, so it is possible that a turn into the mountain would be the shortest arc. I don't want to use altitude terrain following, but perhaps in a failsafe event the plane could be prevented from ever turning toward terrain that is level with or above the airplane.

lvale commented 8 years ago

@iskess Can't Rally Points be used in your scenario?

iskess commented 8 years ago

Maybe if there was a way to force the plane to fly home after it hits the rally point. I don't want it loitering out there. I'm still not really assured the plane won't turn into the cliff side.

lvale commented 8 years ago

Usually Plane users are the most tech savvy of the group using ArduPilot. This proposal won't be enough to satisfy everyone, but it could help the normal "Joe" user.

The case you mention would require flight planning and configuring the failsafe options to appropriate actions, not excluding RTL height ;)

Jman841 commented 8 years ago

The two issues I have run into with ArduCopter that I think can be avoided are a sudden loss of the GPS module (it physically fell off when the double-sided tape came loose) and high vibration. Both resulted in a full-on crash, but I think both can be dealt with. For the GPS issue, if the GPS drops to 0 and the second compass goes crazy, the copter should IMHO just go to Altitude Hold mode, possibly wait X seconds, and then go to Land mode.

For the accel clipping caused by high vibration, could we revert to using just the gyro to stabilize the quadcopter, again switching to Altitude Hold mode? Without the accel it might be difficult to detect exactly when the copter has touched down, but landing based solely on the barometer, and then maybe slowly lowering the throttle after it reads 0 in case there is some drift, seems a much better solution than simply losing attitude control and crashing to the ground.

R-Lefebvre commented 8 years ago

The AHRS cannot maintain level on Gyros alone for very long unfortunately. It would be interesting to see how long it could but... there's no free lunch here.

khancyr commented 8 years ago

Hi ,

I think another failsafe should be a companion computer failsafe. I used to fly with a CC relying on ROS for navigation and avoidance, but it could do weird things, lose sync, etc. So in the future we will need some mechanism to be sure that the link between the autopilot and the CC is in a good state and that the orders are coherent.

rmackay9 commented 8 years ago

Yes, I agree with @khancyr

proficnc commented 8 years ago

100% agree

WickedShell commented 8 years ago

Not opposed at all, but how is a companion computer failsafe different than a GCS failsafe? Would they not be handled and detected identically?

khancyr commented 8 years ago

For now it is the same, but I hope we can make a distinction between the two in the near future (I think the PX4 team already did it). When using a CC, I pretty much don't care about losing telemetry on my GCS, since my CC provides redundant control over the autopilot (or adds more bugs ;-P ).

Another, lower-criticality failsafe should cover equipment: landing gear, gimbal, etc. The user should at least be warned and, if the equipment is vital for the mission, a failsafe or action should be required.

rmackay9 commented 8 years ago

It makes sense to have a separate CC failsafe if the CC is critical for the proper functioning of the vehicle. For example, in my red-balloon-popper it would have been good to know if the CC was failing to send heartbeats (which it was on one run at AVC). So the failsafe, like other existing failsafes, would both prevent arming and cause the vehicle to RTL (or take some other predefined action) if it was triggered. I think this is separate from the GCS failsafe, which (once modified to work more intuitively) will trigger when the ground station (and thus its operator) has lost contact with the vehicle.
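Roughly, tracking the two heartbeats separately could look like the sketch below. The class name, the source labels and the single shared timeout are assumptions for illustration, not how the autopilot currently does it.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per source and reports timeouts."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}

    def heartbeat(self, source: str, now_s: float) -> None:
        """Record a heartbeat from the given source (e.g. "gcs" or "cc")."""
        self.last_seen[source] = now_s

    def timed_out(self, source: str, now_s: float) -> bool:
        # A source that has never been heard from counts as timed out,
        # so the same check could also be used to prevent arming.
        last = self.last_seen.get(source)
        return last is None or now_s - last > self.timeout_s
```

With a 2-second timeout, a CC that last sent a heartbeat at t = 0 s is reported as timed out at t = 4 s even while the GCS, last heard at t = 3 s, is still considered healthy, so each can be given its own action.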

lvale commented 8 years ago

As it is now, a companion computer is a version of GCS, so it is covered, especially when using a multilevel failsafe as proposed.

If things evolve, then, the companion computer should/must implement its own set of rules including failsafe procedures.

lvale commented 8 years ago

@marcmerlin Doc label ??

auturgy commented 5 years ago

No reason to close. It’s a valid enhancement request.