USSTRocketry / MiniRockets

Making rockets that hopefully go UP!
MIT License

RFC 202301181937: Resume state on software crash #34

Closed frroossst closed 1 year ago

frroossst commented 1 year ago

If the software crashes, the rocket should resume in the state it was last in, or enter a recovery mode.

frroossst commented 1 year ago

Abstract

If the rocket's software crashes, how do we ensure it resumes the state machine from where it left off? We cannot simply restart: if the software crashes during ballistic descent and we restart from ground idle, we will never transition forward and reach parachute deploy, because of how the parameters in the state machine handle transitions.

Stakeholder(s)

Avionics

Problem

If the rocket's software crashes, how do we ensure it resumes the state machine from where it left off and still completes a successful flight?

Solution(s)

Store the state in the FRAM

Store the state in a single byte on the FRAM after every transition, and pick the state back up from where the software crashed.
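A minimal sketch of this idea, assuming a hypothetical `ByteStore` that stands in for the FRAM driver (a real one would talk to the chip over I2C). The `FlightState` names and the `StateMachine` class are illustrative and are not taken from the project's code.

```cpp
#include <cstdint>

// Hypothetical state names, following the flight sequence described in this thread.
enum class FlightState : std::uint8_t {
    Unarmed = 0,
    Armed,
    PoweredFlight,
    UnpoweredFlight,
    BallisticDescent,
    ParachuteDeploy,
    LandSafe
};

// Stand-in for the FRAM: a single non-volatile byte.
// A real driver would read and write this byte over I2C.
struct ByteStore {
    std::uint8_t cell = 0;
    void write(std::uint8_t value) { cell = value; }
    std::uint8_t read() const { return cell; }
};

class StateMachine {
public:
    // On construction (that is, on every boot), pick up the last persisted state.
    explicit StateMachine(ByteStore& store) : store_(store) {
        const std::uint8_t raw = store_.read();
        const std::uint8_t last = static_cast<std::uint8_t>(FlightState::LandSafe);
        // Treat an out-of-range byte as "no saved state".
        state_ = (raw <= last) ? static_cast<FlightState>(raw) : FlightState::Unarmed;
    }

    // Persist the new state first, then update the in-memory copy.
    void transition(FlightState next) {
        store_.write(static_cast<std::uint8_t>(next));
        state_ = next;
    }

    FlightState state() const { return state_; }

private:
    ByteStore& store_;
    FlightState state_ = FlightState::Unarmed;
};
```

Constructing a second `StateMachine` on the same `ByteStore` models a reboot: it starts in whatever state the first instance last persisted.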

Enter a software recovery or safe mode of sorts

On watchdog timeout, enter a recovery or safe mode in which the rocket focuses only on the essential software tasks, such as deployment of the recovery system, and nothing more. It may also send a panic!() radio signal to the ground station.
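A rough sketch of what such a safe mode could look like. Everything here is an assumption for illustration: the `ResetCause` enum, the `SafeModeController` class, and the rule of deploying after three consecutive falling altitude samples are not from the project, and how the reset cause is actually read depends on the board.

```cpp
// Hypothetical reset causes; a real board reports this through a hardware register.
enum class ResetCause { PowerOn, Watchdog };
enum class BootMode { Normal, SafeMode };

// A watchdog reset means the software hung or crashed, so boot into safe mode.
BootMode selectBootMode(ResetCause cause) {
    return cause == ResetCause::Watchdog ? BootMode::SafeMode : BootMode::Normal;
}

// Safe mode does one thing only: decide when to deploy the recovery system.
class SafeModeController {
public:
    // Feed one altitude sample. Returns true exactly once, on the sample
    // where descent is confirmed and the recovery system should deploy.
    bool step(float altitudeMetres) {
        if (deployed_) {
            return false;
        }
        if (hasPrevious_ && altitudeMetres < previous_) {
            ++fallingCount_;
        } else {
            fallingCount_ = 0;  // not falling (or first sample): start over
        }
        previous_ = altitudeMetres;
        hasPrevious_ = true;
        if (fallingCount_ >= kRequiredFallingSamples) {
            deployed_ = true;
            return true;
        }
        return false;
    }

    bool deployed() const { return deployed_; }

private:
    // Assumed threshold; a real value would come from flight testing.
    static constexpr int kRequiredFallingSamples = 3;
    float previous_ = 0.0f;
    bool hasPrevious_ = false;
    int fallingCount_ = 0;
    bool deployed_ = false;
};
```

With this rule, a rocket sitting on the pad after a crash never deploys, because its altitude is not falling. Sensor noise would need filtering in practice.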

Difficulties and Risks

Store the state in the FRAM

The potential problem is that if the software crashes after a state transition but before a successful FRAM write, the stored state is stale and we are stuck in a state we cannot get out of. Another problem is that we may not have a solid I2C bus to the FRAM, so we may be unable to write to it, or may lose it mid-flight. This also makes the FRAM a single point of failure.
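One way to at least detect a torn or corrupted write is to store the state byte together with its bitwise complement, in two separate slots, and accept only a slot whose check byte matches. This is a sketch of that technique and not something the RFC proposes. The `Memory` array is a stand-in for the FRAM, and `saveState` and `loadState` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Stand-in for four bytes of non-volatile memory.
// Slot A is bytes 0-1, slot B is bytes 2-3; each slot is {state, ~state}.
using Memory = std::array<std::uint8_t, 4>;

// Write both slots in order: slot A first, then slot B.
void saveState(Memory& mem, std::uint8_t state) {
    const std::uint8_t check = static_cast<std::uint8_t>(~state);
    mem[0] = state;
    mem[1] = check;
    mem[2] = state;
    mem[3] = check;
}

// A slot is valid only if its second byte is the complement of its first.
bool loadSlot(const Memory& mem, std::size_t offset, std::uint8_t& out) {
    const std::uint8_t value = mem[offset];
    const std::uint8_t check = mem[offset + 1];
    if (static_cast<std::uint8_t>(~value) != check) {
        return false;
    }
    out = value;
    return true;
}

// Prefer slot A (written first, so it holds the newest state if valid),
// fall back to slot B. Returns false if neither slot validates.
bool loadState(const Memory& mem, std::uint8_t& out) {
    return loadSlot(mem, 0, out) || loadSlot(mem, 2, out);
}
```

If a crash interrupts the write to slot A, slot B still holds the previous valid state. Blank (all-zero) memory fails validation, so a first boot can be told apart from a saved state of 0. This does not remove the FRAM as a single point of failure; it only makes bad data detectable.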

Enter a software recovery or safe mode of sorts

This is a more complicated solution, but it would allow us to recover from a software crash and resume the state machine from where it left off. It would require a lot of work and testing to ensure that we can recover from any state. We also need to ensure that the recovery system doesn't have a single point of failure: for example, what would we do if the barometric sensor(s) fail? Refer to RFC: Sensor redundancy.

Estimated costs and timelines

There should be no cost associated with this, as it is a software solution.

The timeline for solution 1 is a couple of days at best, while solution 2 may require a couple of weeks, as we haven't yet been able to configure the watchdog timer to trigger a reset on time.

Proof of Concept

Not Applicable

Testing and Robustness assurance

test, Test, TEST, T E S T !

References

Not Applicable

casey-SK commented 1 year ago

You need to be careful to ensure the correct state is entered. Gotta make sure you don't ignite the rocket motor at an unsafe time just because the software previously crashed. If the program crashes, I would probably not attempt to recover and put it in "super safe mode" where no code is allowed to execute, period.

The solution to this problem is to have good coverage and anticipate all errors.
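As an illustration of the ignition concern, a restore step could refuse to resume into any pre-launch state. The sketch below assumes, hypothetically, that ignition can only happen on the transition out of an armed state; the state names and `resumeStateAfterCrash` are illustrative and not taken from the project.

```cpp
#include <cstdint>

// Hypothetical state names, following the flight sequence described in this thread.
enum class FlightState : std::uint8_t {
    Unarmed = 0,
    Armed,
    PoweredFlight,
    UnpoweredFlight,
    BallisticDescent,
    ParachuteDeploy,
    LandSafe
};

// Decide which state to resume in after a crash, given the raw saved byte.
FlightState resumeStateAfterCrash(std::uint8_t savedByte) {
    if (savedByte > static_cast<std::uint8_t>(FlightState::LandSafe)) {
        // Corrupt or never-written byte: take the most conservative option.
        return FlightState::Unarmed;
    }
    const FlightState saved = static_cast<FlightState>(savedByte);
    switch (saved) {
        case FlightState::Unarmed:
        case FlightState::Armed:
            // Still on the pad: drop back to Unarmed so that ignition
            // requires a fresh arm command from the ground.
            return FlightState::Unarmed;
        default:
            // Already in flight: resume where we left off so that the
            // parachute deploy state can still be reached.
            return saved;
    }
}
```

The trade-off is that a corrupt byte during flight also maps to Unarmed, which would skip parachute deployment; whether that is acceptable is a design decision this sketch does not settle.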

frroossst commented 1 year ago

When the rocket first boots up, it enters an "unarmed stage" where it does literally nothing except wait for a radio signal from the ground telling it to arm itself. After it is armed, it can fire the rocket motors, and then it transitions through powered flight -> unpowered flight -> ballistic descent -> parachute deploy -> land safe.

As for the rest, I can get you up to speed tomorrow! :)
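The sequence described here can be sketched as a simple forward-only state machine. The names `FlightState`, `successor`, and `step` are hypothetical, and the real exit conditions (sensor thresholds and so on) are collapsed into a single `exitConditionMet` flag for illustration.

```cpp
#include <cstdint>

// Hypothetical state names, in the order given above.
enum class FlightState : std::uint8_t {
    Unarmed = 0,
    Armed,
    PoweredFlight,
    UnpoweredFlight,
    BallisticDescent,
    ParachuteDeploy,
    LandSafe
};

// The next state in the sequence; LandSafe is terminal.
FlightState successor(FlightState s) {
    switch (s) {
        case FlightState::Unarmed:          return FlightState::Armed;
        case FlightState::Armed:            return FlightState::PoweredFlight;
        case FlightState::PoweredFlight:    return FlightState::UnpoweredFlight;
        case FlightState::UnpoweredFlight:  return FlightState::BallisticDescent;
        case FlightState::BallisticDescent: return FlightState::ParachuteDeploy;
        case FlightState::ParachuteDeploy:  return FlightState::LandSafe;
        case FlightState::LandSafe:         return FlightState::LandSafe;
    }
    return FlightState::Unarmed;  // unreachable for valid enum values
}

// One tick of the state machine. Unarmed leaves only on the ground's arm
// signal; every other state leaves only when its own exit condition is met.
FlightState step(FlightState s, bool armSignalReceived, bool exitConditionMet) {
    if (s == FlightState::Unarmed) {
        return armSignalReceived ? FlightState::Armed : FlightState::Unarmed;
    }
    return exitConditionMet ? successor(s) : s;
}
```

Because transitions only move forward, a restart that lands in `Unarmed` mid-flight stays there unless the ground re-arms it, which is the failure this RFC is about.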

frroossst commented 1 year ago

Already implemented software_recovery_mode in 28928d774e22ca764ae01d6f633a490f8b51f9ee

snowy-shadow commented 1 year ago
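
#include <EEPROM.h>

uint8_t rocket_state;

void setup()
{
    // setup() runs again after a crash/reset, so restore the last saved state here
    rocket_state = EEPROM.read(0x00);
}

void loop()
{
    // stuff
}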

For each rocket state change, write the state to the EEPROM as an integer. Instead of a void recover(), just read the state from the EEPROM every time setup() runs, which happens whenever the board restarts after a crash.

snowy-shadow commented 1 year ago

https://www.pjrc.com/teensy/td_libs_EEPROM.html The Teensy 4.1 has 4284 bytes of EEPROM.