Open GrahamDrive opened 3 months ago
I did some research on this on my own, and I thought I'd document what I've gathered thus far.
The link Graham provided brings you to the On-Board-Computer repository, but there is also
They are using UART to communicate between subsystems
The ADCS repo actually contains a B-Dot detumbling algorithm and a pointing algorithm using a sun-sensor here and here.
Here are some points I've gathered about their On-Board-Computer (OBC):
- `mass_storage` and is defined in the ECSS Services repo.
- The `cubeMX/` subfolder contains two STM32 projects: `disco/` and `obc/`. `disco/` appears to be some sort of testing ground for various things. `obc/` appears to be the flight-ready code.
- `main.c` file. Every other file appears to be part of a third-party library.
- `obc_data` is a giant struct containing all the state of OBC. It's NOT defined in the OBC repo though. Instead it's defined in the ECSS repo here.

Similar to TSAT, UPSAT OBC is using task scheduling with RTOS to perform all of its functions. There are five separate threads/tasks:
- `UART_task` (function definition): This task listens for incoming messages on the UART bus. There are four different message queues: EPS, COMMS, ADCS, & DBG. These queues are checked in an infinite loop, and it appears as though OBC is just relaying the messages back to UART, possibly so that the subsystems can communicate with one another.
- `HK_task` (function definition): "HK" stands for Housekeeping (not the best naming scheme tbh), which is another service defined in the ECSS repo. At a high level, this task simply initialises the Housekeeping service with `hk_INIT`, defined in ECSS here, and then calls `hk_SCH`, defined here, in an infinite loop. At a glance, it's really hard to tell what HK is actually doing. More research may be needed to make sense of it.
- `IDLE_task` (function definition): Again, it's not clear what this one is doing. More research is required.
- `SU_SCH` (function definition): Same as above.
- `sche_se_sch` (function definition): Calls `cross_schedules` in an infinite loop, which is part of the `scheduling_service` service in ECSS. It's again not clear what it's doing, but it may have to do with executing "APIDs", whatever those are.

I'll probably be investigating their error handling practices next, so I'll post my findings when I do.
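To make the relaying behaviour described for `UART_task` a bit more concrete, here is a minimal sketch in plain C. It is not UPSat's code: the queue layout, the names (`queue_push`, `queue_pop`, `relay_once`, `record_send`) and the sizes are all hypothetical, and a real RTOS task would block on its queues rather than poll simple ring buffers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names and sizes throughout: this is NOT the UPSat API. */
#define QUEUE_DEPTH 4
#define MSG_LEN     16

typedef enum { SRC_EPS, SRC_COMMS, SRC_ADCS, SRC_DBG, SRC_COUNT } source_id;

typedef struct {
    char   msgs[QUEUE_DEPTH][MSG_LEN];
    size_t head;   /* index of the oldest message */
    size_t count;  /* number of queued messages */
} msg_queue;

static msg_queue queues[SRC_COUNT];
static char last_sent[MSG_LEN];

/* Append a message; returns false if the queue is full. */
bool queue_push(source_id src, const char *msg)
{
    msg_queue *q = &queues[src];
    if (q->count == QUEUE_DEPTH) {
        return false;
    }
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    strncpy(q->msgs[tail], msg, MSG_LEN - 1);
    q->msgs[tail][MSG_LEN - 1] = '\0';
    q->count++;
    return true;
}

/* Remove the oldest message into out; returns false if the queue is empty. */
bool queue_pop(source_id src, char out[MSG_LEN])
{
    msg_queue *q = &queues[src];
    if (q->count == 0) {
        return false;
    }
    memcpy(out, q->msgs[q->head], MSG_LEN);
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return true;
}

/* Stand-in for a UART transmit: just remembers the last message sent. */
void record_send(source_id src, const char *msg)
{
    (void)src;
    strncpy(last_sent, msg, MSG_LEN - 1);
    last_sent[MSG_LEN - 1] = '\0';
}

/* One pass of the relay loop: poll each queue once and forward at most one
 * message per source through the send callback.
 * Returns the number of messages forwarded. */
size_t relay_once(void (*send)(source_id src, const char *msg))
{
    size_t forwarded = 0;
    char buf[MSG_LEN];
    for (int src = 0; src < SRC_COUNT; src++) {
        if (queue_pop((source_id)src, buf)) {
            send((source_id)src, buf);
            forwarded++;
        }
    }
    return forwarded;
}
```

In a task, `relay_once` would sit inside the infinite loop; polling one message per source per pass keeps any single subsystem from starving the others.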
I did some research on their error handling practices. I mostly looked at the ADCS and ECSS repositories since it actually looks like the OBC repo doesn't contain any error handling at all!
Here's what I jotted down:
- `Error_Handler` function, which might make sense since it's not clear what the right corrective action might be if HAL throws an error. It does seem a little unfortunate that nothing is logged though.

      /**
       * @brief This function is executed in case of error occurrence.
       * @param None
       * @retval None
       */
      void Error_Handler(void)
      {
        /* USER CODE BEGIN Error_Handler */
        /* User can add his own implementation to report the HAL error return state */
        while(1)
        {
        }
        /* USER CODE END Error_Handler */
      }
- `assert_failed` function, which gets called by HAL when a parameter to a function is invalid. Again, no logging is performed for such occurrences.

      /**
       * @brief Reports the name of the source file and the source line number
       * where the assert_param error has occurred.
       * @param file: pointer to the source file name
       * @param line: assert_param error line source number
       * @retval None
       */
      void assert_failed(uint8_t* file, uint32_t line)
      {
        /* USER CODE BEGIN 6 */
        /* User can add his own implementation to report the file name and line number,
        ex: printf("Wrong parameters value: file %s on line %d\r\n", file, line) */
        /* USER CODE END 6 */
      }
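For comparison, here is a minimal sketch of what logging in these two hooks could look like. This is only an illustration and not something found in the UPSat repos: `fault_record`, `record_fault` and `get_last_fault` are hypothetical names, and on a real target the record would need to live somewhere that survives a reset (or be written to flash) to be of any use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fault record. On real hardware this would need to be kept
 * somewhere that survives a reset; here it is just a static variable. */
typedef struct {
    char     file[32];  /* source file name, truncated if too long */
    uint32_t line;      /* line of the failed check, 0 for a generic error */
    uint32_t count;     /* number of faults recorded since start-up */
} fault_record;

static fault_record last_fault;

/* Store where the fault happened so it can be reported later. */
void record_fault(const char *file, uint32_t line)
{
    strncpy(last_fault.file, file, sizeof last_fault.file - 1);
    last_fault.file[sizeof last_fault.file - 1] = '\0';
    last_fault.line = line;
    last_fault.count++;
}

/* Same signature as the generated hook, but it records the location
 * instead of doing nothing. */
void assert_failed(uint8_t *file, uint32_t line)
{
    record_fault((const char *)file, line);
}

/* Same shape as the generated handler: record first, then halt as before. */
void Error_Handler(void)
{
    record_fault("HAL", 0);
    while (1) {
        /* spin, e.g. until a watchdog resets the MCU, if one is enabled */
    }
}

/* Read-only access for whatever reports faults (telemetry, debug UART). */
const fault_record *get_last_fault(void)
{
    return &last_fault;
}
```

Even a record this small would tell you after the fact which check fired, which is more than the empty hooks above can offer.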
SAT_returnState
Defined here in ECSS.
`SAT_returnState` is an enum type for errors with 57 possible values in total (including `SATR_OK`). It's not the only enum for errors, but it appears to be the one that's used within the ECSS library.
adcs_error_status
Defined here in ADCS.
`adcs_error_status` is the enum type for ADCS errors. There are 10 possible values in total. The errors don't appear to be particularly specific.
error_handler
Defined here in ADCS.
This gets called in the program's main loop once for each cycle of the `TIM7` timer. This is accomplished by setting `ADCS_event_period_status` to `TIMED_EVENT_NOT_SERVICED` in the ISR for `TIM7`, and then checking on every iteration of the while loop whether it's set. Once it is set and the if statement is triggered, the flag gets reset inside that same block, shortly before `error_handler` is called.
    int main(void) {
        // ... init code ..
        while (1) {
            /* GPS update */
            error_status = update_gps(&gps_state);
            /* Control loop runs at 68ms, interrupt runs every ~1.2s, WDG at ~2.4s */
            if (ADCS_event_period_status == TIMED_EVENT_NOT_SERVICED) {
                // ... code ...
                /* Update flag */
                ADCS_event_period_status = TIMED_EVENT_SERVICED; // reset condition to false.
                /* Software error handler runs for actuator and sensor 230ms*/
                adcs_sysview_print();
                error_handler(error_status);
                error_status = ERROR_OK;
            }
        }
    }
The error handler function performs different operations based on the value of `error_status`.
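The flag handshake between the timer ISR and the main loop can be sketched in isolation like this. It's a simplified stand-in rather than the ADCS code: the flag values reuse the names from the snippet above, but the enum type name, its numeric values, and the functions `timer_isr`, `periodic_work`, `main_loop_iteration` and `get_handler_runs` are assumptions made for illustration.

```c
#include <stdint.h>

/* Enumerator names are taken from the snippet; the type name and the
 * numeric values are assumptions. */
typedef enum {
    TIMED_EVENT_SERVICED,
    TIMED_EVENT_NOT_SERVICED
} timed_event_status;

/* volatile: written in interrupt context, read in the main loop. */
static volatile timed_event_status event_status = TIMED_EVENT_SERVICED;
static uint32_t handler_runs;

/* Stand-in for the timer ISR: it only raises the flag. */
void timer_isr(void)
{
    event_status = TIMED_EVENT_NOT_SERVICED;
}

/* Stand-in for the periodic work (the error handling in the ADCS loop). */
static void periodic_work(void)
{
    handler_runs++;
}

/* One iteration of the main loop body.
 * Returns 1 if the periodic work ran, 0 otherwise. */
int main_loop_iteration(void)
{
    if (event_status == TIMED_EVENT_NOT_SERVICED) {
        /* Acknowledge first, so a tick that arrives while the work is
         * running is picked up on the next iteration. */
        event_status = TIMED_EVENT_SERVICED;
        periodic_work();
        return 1;
    }
    return 0;
}

uint32_t get_handler_runs(void)
{
    return handler_runs;
}
```

One consequence of using a single flag instead of a counter: if the timer fires twice before the main loop gets around to checking, the periodic work still runs only once.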
error_propagation
Defined here in ADCS.
This function takes in a parameter `current_error`, which is of type `adcs_error_status`.

- If `current_error` is `ERROR_OK` (i.e. no error is passed), then the function defaults to returning `error_status` (which is `ERROR_OK` by default at the start of the program).
- Otherwise, `current_error` is not `ERROR_OK` and that is returned instead.

This creates a behaviour where `error_status` acts as a second-priority error code in case the calling function was successful.

In the code sequence below, found in the `main` function, you'll see that if any one of these function calls fails, the rest will also report the same error code.
    error_status = init_mem(); // let's say this fails with ERROR_FLASH
    error_status = increment_boot_counter(); // then this will return ERROR_FLASH too (even if the operation was successful)
    error_status = init_obc_communication(adcs_boot_cnt); // same as above
Note that in the second line, if `increment_boot_counter` fails with something else, that error code will become the new secondary error code. This actually means errors can get overwritten and lost, and they won't be rectified when `error_handler` eventually gets called. This is a pretty bad way to deal with errors.
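Here is a small sketch of the propagation behaviour as I understand it from reading the code. It's a reconstruction from the description above, not a copy of the ADCS source: the numeric enum values, `ERROR_OTHER` and `fake_step` are hypothetical and only exist to drive the example.

```c
/* ERROR_OK and ERROR_FLASH are names from the ADCS code; their numeric
 * values and ERROR_OTHER are assumptions made for this sketch. */
typedef enum {
    ERROR_OK = 0,
    ERROR_FLASH,
    ERROR_OTHER
} adcs_error_status;

/* Global status, ERROR_OK at start-up. */
adcs_error_status error_status = ERROR_OK;

/* Return the new error if there is one, otherwise fall back to whatever
 * is already stored in error_status. */
adcs_error_status error_propagation(adcs_error_status current_error)
{
    if (current_error == ERROR_OK) {
        return error_status;
    }
    return current_error;
}

/* Hypothetical stand-in for calls such as init_mem(): the caller chooses
 * the step's own result, and the step reports it through
 * error_propagation(). */
adcs_error_status fake_step(adcs_error_status own_result)
{
    return error_propagation(own_result);
}
```

Calling `error_status = fake_step(ERROR_FLASH);` followed by `error_status = fake_step(ERROR_OK);` leaves `error_status` at `ERROR_FLASH` even though the second step succeeded, and a later `error_status = fake_step(ERROR_OTHER);` silently replaces it, which is the overwrite problem described above.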
There is actually a humorous bug with `increment_boot_counter`. There is a case where it returns `FLASH_ERROR` when it should return `ERROR_FLASH`. This is an example of why having a good naming scheme is important.

`FLASH_ERROR` is defined in `adcs_flash.h` and is NOT part of the `adcs_error_status` enum. See below:
    typedef enum {
        FLASH_ERROR = 0, FLASH_NORMAL
    } flash_status;
And of course the code still compiles because C allows you to mix enums even though that is completely dangerous to do.
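To show why this matters, here is a minimal sketch of the mix-up. `flash_status` is as quoted above; the values of `adcs_error_status` are an assumption (in particular that `ERROR_OK` is 0, which I haven't confirmed), and `buggy_increment` / `fixed_increment` are hypothetical stand-ins rather than the real `increment_boot_counter`.

```c
/* As quoted from adcs_flash.h. */
typedef enum {
    FLASH_ERROR = 0, FLASH_NORMAL
} flash_status;

/* Assumed layout: only the names come from the ADCS code, and
 * ERROR_OK == 0 is an assumption. */
typedef enum {
    ERROR_OK = 0,
    ERROR_FLASH
} adcs_error_status;

/* Hypothetical stand-in for the buggy path: the wrong enumerator is
 * returned, and C accepts it without complaint by default. */
adcs_error_status buggy_increment(int flash_write_failed)
{
    if (flash_write_failed) {
        return FLASH_ERROR;  /* value 0, i.e. the same as ERROR_OK here */
    }
    return ERROR_OK;
}

/* What was presumably intended. */
adcs_error_status fixed_increment(int flash_write_failed)
{
    return flash_write_failed ? ERROR_FLASH : ERROR_OK;
}
```

Under that assumption, `buggy_increment(1)` is indistinguishable from success, so the failure would never reach `error_handler`. GCC and Clang both offer a `-Wenum-conversion` warning that can catch this kind of mix-up.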
Honestly, the code in UPSAT is really bad. It's full of inconsistent naming schemes and mistakes. It's hard to believe this code made it to space.
I didn't realize there was so much information in their Git repository. The FAT system is particularly intriguing. I've been considering whether we should use one or not; I don't have experience with them in embedded systems, but I assume it would make our work significantly easier.
Thanks for this fantastic deep dive, Logan!
UPSAT appears to be storing a lot of different types of data (I believe I even saw some scripts being loaded), whereas TSAT just needs to store telemetry data. So I think it's acceptable if not desirable that we don't have one.
Reverse Engineer A Satellite Codebase
CDH member @whatdoes3plus1equalsto has found some open-source satellite projects whose codebases we could benefit from reverse engineering and analysing. There are four parts to this task that I will list here.

1. Reverse engineer the given satellite codebase: Take a deep dive into the satellite's codebase. Try to determine what type of code they have written, asking things like "How do they control peripherals?" and "Are they using a real-time operating system? If they are, what type of tasks do they have and how are they organized?" Also, take extra care to evaluate their error correction code, like CRC checks on their memory and other corruption-prevention measures.
2. Identify modules that could improve our design: Now that you have your footing in the codebase and how it operates, try to identify some useful parts that are missing from our codebase that we could implement in our satellite. Once you find some, bring them up with @GrahamDrive or @DaighB and we can discuss how they could be used.
3. Finally some coding!: Now that you know what you want to add to the satellite, you can try making a test project as a proof of concept. Go ahead and create a brand new project for your dev board and try to implement the feature you have decided on. Have fun with it, play around, and add your own ideas and flair.
4. Show off your work: Now that you have completed the code and tested it, you can show it off in a meeting!
University of Patras
This task concerns the UPSat CubeSat from the University of Patras in Greece. It is completely open source and a good fit because it uses an STM32 microprocessor; its codebase can be found here. UPSat also has its own Wikipedia page that could have a ton of valuable information; the link to it is here.
As always, if you have any questions, don't hesitate to ask!