PX4 Flight Stack Structure Proposal

Stifael commented 7 years ago

As in any large project, it is important to not only add new features and fix legacy code, but also to have a common goal in mind in terms of software architecture to make better judgments about future improvements. In addition, it is more than overdue to work on a software diagram of the PX4 project that represents the current and future architecture.

The link below shows a first a example of a software architecture that would simplify future enhancement greatly. It is to note that the diagram given below is a proposal that is not considered to be finished nor static, but rather the PX4 community is encouraged to actively participate in finding a common goal. Proposal: https://docs.google.com/drawings/d/1Uw1PmU3hqEfWR3et9KT7zWggDX9wqoAiCt5suiWdnlw/edit

Current: https://docs.google.com/drawings/d/1oJ6x8HaqiRiW7cl-WFoC24Gn1lOkpjIPyGak5cfryPw/edit?usp=sharing.

I shortly try to describe the difference to the current architecture:

free modules from module states. i.e. the position controller does not need to know about commander states nor navigation states.
keep the interfaces between modules slim. i.e. the position controller, attitude controller, rate controller and so on only need to have one input, and they all should not care about where the input comes from. the same is true for the navigator: its only purpose is to send out way-points. It should not care about being a fixedwing or mc.
all computation should be done in local frame. the navigator should convert all global setpoints first into local frame before doing any mission checks
introduce a new module Stick mapper. it basically deals with the stick inputs
introduce a new module Trajectory (or whatever name makes sense). This modules fuses different inputs together (from an obstacle avoidance module, navigator module, stick mapper module or from offboard) and has ONLY one set of setpoints as output. The setpoints can either be position/velocity/acceleration setpoints, attitude setpoints or rate setpoints.
rate controller should run faster than attitude controller

In addition to this diagram, each module itself should have a similar diagram. For instance, in the current proposal the Trajectory module takes over some of the work of the navigator and position controller. Consequently, the Trajectory module could possibly become large.

Open questions and suggestions:

1.) race condition between two modules which publish the same message:

This can occur for instance if the vehicle is in position control and the attitude controller generates rate controller setpoints, but then the user switches to rate control mode and the the setpoints are directly generated from the stick inputs. During the switch, it can happen that attitude controller module and the trajectory module (or possibly the stick_mapper) send the same setpoints Solutions:

uorb priority

2.) hybrid control

In altitude control the z-position is controlled by the position controller, but roll and pitch setpoints come from the sticks.

3.) separate uorb from modules

Is there a good way to keep the modules separated from uorb such that the interface could possibly be exchanged as well?

4.) trajectory module pipeline

Should geofence and obstacle avoidance be after flight-task (where trajectories are computed) or before flight-task such that no-fly-zones can be incorporated directly into the trajectory generation. The first approach requires some sort of repeated loop (not really desired)

Stifael commented 7 years ago

commander refactor: https://github.com/PX4/Firmware/issues/7055 stick_mapper PR: https://github.com/PX4/Firmware/pull/6439 Issue: https://github.com/PX4/Firmware/issues/2822 global to local: https://github.com/PX4/Firmware/tree/global_to_local vtol structure proposal: #4845

dagar commented 7 years ago

It's really great to see this written down!

We'll need to review OFFBOARD and the complexity of position_setpoint (https://github.com/PX4/Firmware/blob/master/msg/position_setpoint.msg).

mcharleb commented 7 years ago

This is a great step forward in making the code a bit more approachable for developers.

Stifael commented 7 years ago

as far as i know, the offboard needs some rework for sure. right now I think the triplets are exploited for offboard... We were not entirely sure where the offboard setpoints should be handled. this definitely needs farther discussion

mcharleb commented 7 years ago

@Stifael What do you think about platform support layering? Would it help to factor out Linux, QuRT and NuttX support into separate git repos? One of the concerns that I have heard about PX4 is the size of the codebase, but a lot of that is platform support and drivers. Using git submodules makes it difficult to use different collections of layers (NuttX, DriverFramework, etc). Perhaps something like repo would provide a cleaner way to aggregate layers for a particular platform build.

Stifael commented 7 years ago

@mcharleb: I am not sure If I understood your suggestion correctly: do you suggest to have everything related to a specific platform in one repo? Basically that would mean that DriverFramework would be part of the Posix repo? Anyway I think @bkueng or @julianoes can better answer that question

Stifael commented 7 years ago

Since this "issue" could become large, I will add suggestion/questions to the top comment.

bkueng commented 7 years ago

@mcharleb

Would it help to factor out Linux, QuRT and NuttX support into separate git repos? One of the concerns that I have heard about PX4 is the size of the codebase, but a lot of that is platform support and drivers

Factoring that out would not reduce the code base. And personally I find it a bit of a hassle coping with submodules - having more makes things worse. Especially if the changes affect multiple repos. repo might help. However I think the more important thing that people can easier cope with the code base is having a good directory structure with clear platform separation & API's (which we more or less have), avoid platform-specific ifdefs in general modules (@davids5 helps pushing that), and documentation. This includes some high-level overview graph like the google doc above, module documentation (#7145), and arch-specifics for each HW we support (which could also use some improvements, especially for Snapdragon, since it's the most exotic architecture).

dagar commented 7 years ago

@bkueng @mcharleb one minor thing on that note is if we confine NuttX to a single directory it doesn't even need to be cloned by default. The build system can handle that on demand as a dependency only if building a nuttx config.

I've personally experienced a lot of pain with our submodules (especially the submodules within submodules), but still find them to be the least bad option.

julianoes commented 7 years ago

The build system can handle that on demand as a dependency only if building a nuttx config.

I would like that but I'm not sure how to do it technically. Just script git cloning?

MaEtUgR commented 7 years ago

Thanks @Stifael, it was a very good idea to create this issue. Finally the graph I started and tried to get spread finds attention and can serve as Request For Comments 😃

davids5 commented 7 years ago

@dagar - whatever we do here for NuttX please ensure to keep the development mode and not auto update it. GIT_SUBMODULES_ARE_EVIL=true

davids5 commented 7 years ago

@dagar - I have had issues with some DF submodules. I have had to rm -fr directories that I do not ever do any work in. I then use my goto command gitsync

#!/bin/bash
git submodule sync --recursive && git submodule update --init --recursive

I also find this command a big help gitlast

#/bin/bash
git for-each-ref --count=30 --sort=-committerdate refs/heads/ --format='%(refname:short)'

jgoppert commented 7 years ago

I support @mcharleb 's suggestions. @bkueng This type of layout should actually reduce the number of submodules you have to deal with for a given build.

My dream repo structure:

PX4 Library

My concept of this library is basically like ECL but using uORB and defining entire PX4 modules, or just call it ECL if we want to stick with that name. The benefit would be that these modules would be reusable and easily interchangeable, only dependent on uORB for internal message passing and an OS interface

Submodules:
- mavlink
- math library
- ecl (or just incorporate the source into the PX4 library)
OS abstraction layer (think we already have this with PX4_...)
uORB Message definitions (internal messaging) -> prepare for possible switch to ROS2
MAVLink submodule (external messaging)
C++ modules that use those message definitions and OS abstraction layer
- Estimators
- Controllers
- Message Handling
Include intense code verification and inspection here, abstract the low level driver data when testing
External interface to APM: Currently ECL is also used by APM. The question would be how hard it would be used use this new type of library directly for APM. I think it would be possible since they would only need to support the OS API calls and uORB.
CMake Output: Ideally a binary library depending on the cmake toolchain provided

PX4 Gazebo

submodules or CMake External
- PX4 Library
gazebo models
gazebo plugins
CMake Output: plugin libraries for gazebo, just make flight control a plugin as well? might simplify multi-vehicle use-case, alternatively, stick to udp style and create px4 posix executable

PX4 OS Posix

submodules or CMake External
- PX4 Library
  - mavlink
  - math library
Posix drivers
CMake Output: posix executable

PX4 OS Nuttx

submodules or CMake External
- PX4 Library
  - mavlink
  - math library
- NuttX
CMake Output: nuttx image

Downsides of this

Each PX4 OS repository would have it's own commit version of the PX4 Library that it was using. This could be somewhat mitigated by have the CI for the core repo try to build a set of OS repos, and if they fail then report a failure, if they succeed, then automatically push the updated version. Or as @bkueng suggested use a tool like repo, that just uses a branch name instead of an actual commit, but this is hard for unit testing, since you have no guarantee that a commit that passed CI will still pass CI when someone updates a branch on a dependent repo a week later..

bkueng commented 7 years ago

@jgoppert ok if you put it that way it does look nice! But one more downside that I see: I often test things in SITL and then directly on an NuttX board. Switching repos in between would make that workflow much harder. Also what about mixed architectures like Snapdragon? It has 2 OS'es. I could also imagine an architecture that has NuttX & Linux, for example when we want to run some modules on the Linux side on the Aero. Cutting these into separate repos makes things more difficult. My take-away from this is that we need a good interface/abstraction towards the OS.

bkueng commented 7 years ago

I created another doc with the current state & changes in red: https://docs.google.com/drawings/d/1oJ6x8HaqiRiW7cl-WFoC24Gn1lOkpjIPyGak5cfryPw/edit?usp=sharing. This should help us see which changes are required and which ones are independent from others. Feel free to correct, modify, and/or extend.

We should also take FW, VTOL, Rover, Boats, Rockets, ... into account (There was some discussion about VTOL in #4845). @AndreasAntener @tumbili @sanderux

jgoppert commented 7 years ago

@ bkeung Some other take-aways for me:

PX4 should evolve to become a flight control library
There should be no driver code in the PX4 library
The PX4 library should provide a set of nodes all using a common pub/sub framework with message definitions

For pub/sub we can stick with uORB for now with a ROS adapter until ROS2 is ready. When ROS2 is ready, I think uORB should be gone. The PX4 OS drivers can all live in one repo or separate repos, whatever we find more convenient.

Stifael commented 7 years ago

@jgoppert Do you suggest a wrapper around uorb? How "common" are all the different pub/sub methods? Im asking because there was the discussion to incorporate priorities into uorb, or even fault detection for some of the messages. I guess that would then break the pub/sub methology?

jgoppert commented 7 years ago

Well uORB currently works in practice while ROS2 doesn't. Whatever we need extra to do our job we should make sure it makes it into ROS2 whille we still have input. I watched a talk on ros2 and they are definetly interested in supporting our use case. They are doing a lot of work on latency and real time. Currently they are very similar but ros 1 doesn't have priority and uorb doesn't have queue size etc. In my iekf branch I already wrote a ros api port using uorb as the backend, and there has been some other work on a middle layer as well, so definitely doable.

mcharleb commented 7 years ago

Where should architecture documentation go? I've started to create UML diagrams for VDev as requested on the last PX4 Dev call. The WIP diagrams are at https://github.com/mcharleb/px4_uml_diagrams. The most useful one would be https://github.com/mcharleb/px4_uml_diagrams/blob/master/vdev_callflow.png?raw=true

davids5 commented 7 years ago

@mcharleb - Thank you doing these. This is super helpful!

nicolaerosia commented 7 years ago

Does anyone have a doc presenting all PX4 threads ? I'm interested in pinning threads to specific CPUs to reduce cache-trashing, reduce jitter and improve latency.

bkueng commented 7 years ago

@nicolaerosia no we currently don't. This depends very much on the setup & configuration. You can grep for px4_task_spawn_cmd or look at the running threads (they are named accordingly).

Stifael commented 6 years ago

I want to reopen the discussion about Flight-stack (not middleware), because there are still many unresolved questions unresolved in terms of obstacle avoidance, geofence, failsafe and flighttask integration.

Mode/Task switch Flighttask current implementation checks for subscription failure and if present will not start the task. When switching to a task the old task is deleted, which leads to following concerns:
- What is the state of Flighttask if a task cannot be started and the old task is already deleted?
- Where does the switching occur for flighttasks? I think it should be done in the commander who is responsible for fail-safe and pref-light checks. This means that flight-task should always listen to commander and switch only if commander demands to do so. This makes the checks for subscription failure somewhat obsolete. I think it is important that flight-task does not become a new state-machine.
Constraints How should we deal with constraints? For instance: lets assume that altitude limitation is part of geofence. In the proposal-diagram geofence comes after navigator. If now navigator is in the state of RTL where the altitude is set to some altitude above the maximum altitude allowed in geofence, then the RTL altitude has to be updated somehow, otherwise the vehicle will get stuck. The same problem can occur with obstacle avoidance, where a navigation point will never be reached because of some obstacle that prevent vehicle from flying to the desired waypoint. Possible solution:
- create a uorb topic that contains all constraints
- move geofence/obstacle avoidance before navigator ...
Smoothing of setpoints Smoothing should be its own class that happens at the end of any flighttask. Of course only if the specific task requires smoothing.

hamishwillee commented 6 years ago

@mcharleb Did you ever get an answer to "Where should architecture documentation go?" (https://github.com/PX4/Firmware/issues/7318#issuecomment-306248103).

If not, then IMO we currently don't have a particularly good place for architectural docs. I suggest we add them to an "OK" home, and then find a better one later. The images could live in the devguide in /assets/architecture/. We should have some pages in /en/architecture that provide an overview/context for the images and then display them. In the first instance I'd just list them as children of https://dev.px4.io/en/concept/architecture.html

MaEtUgR commented 6 years ago

@Stifael

People reading this curretnly cannot find the code you are refering to because I didn't rebase and push the newest version of the flight task architecture to this repository yet.

What is the state of Flighttask if a task cannot be started and the old task is already deleted?

I suggest the vehicle stops and waits for further commands. If the vehicle is used autonomously or RC is lost or position lock is lost -> failsafe.

Where does the switching occur for flighttasks?

I suggest to not generally use the commander for all of that. It should still check for minimal requirements and do failsafe and hence also be able to override at any time but not have the responsibility of handling the switching and data and error cases of every task. That would not scale at all and probably make the architecture obsolete because the goal is to free the commander and position controller from handling all the different cases for every feature. The task should decide if he's capable of running and detect internal errors and the architecture around it should handle basic switching.

But this could only work if the smoothing itself was not dependent on the specific task operation itself. E.g. currently during manual positon control flight (where we have smoothing now) it depends on the stick input in a specific way. How about having a smoothing library that can easily be used in any task but let the task decide how it's in the end done?

Stifael commented 6 years ago

@MaEtUgR

People reading this curretnly cannot find the code you are refering to because I didn't rebase and push the newest version of the flight task architecture to this repository yet.

I know that, but the discussion started here and should also continue here. How, otherwise, can ppl contribute and work towards the same goal? For instance, what if there is also refactor in progress that does not agree with FlightTasks? One purpose of this thread was also to update the proposal, otherwise this thread has no use at all.

That would not scale at all and probably make the architecture obsolete because the goal is to free the commander and position controller from handling all the different cases for every feature.

I did not know that the goal of FlightTask is to take over responsibility from the commander. Originally I thought FlightTask takes over parts of the position controller and navigator only, basically everything that is related to position/velocity/acceleration/thrust control. One can make FlightTask bigger than it has to be. Error checking and doing the correct switching could also have been part of an interface before or within commander, as it was suggested in the architecture proposal as well. Otherwise there is no guarantee that it will blow up as well.

I suggest the vehicle stops and waits for further commands. If the vehicle is used autonomously or RC is lost or position lock is lost -> failsafe

Lets assume the vehicle is already in a subtask of position control mode such as orbit, and for whatever reason this task gets deleted. How can the user switch back to legacy/default position control (which is also a task)? Does the user have to demand it? And how? If a task cannot be started, what would be the feedback?

How about having a smoothing library that can easily be used in any task but let the task decide how it's in the end done

That's what I meant with having a smoothing class. It does not have to be library though.

MaEtUgR commented 6 years ago

Totally agree. Just wanted to point it out.
My goal is that the commander does not need to take additional responsibility than it already has specific for every task. I'm not saying it should take over the responsibility the commander has right now neither do something the navigator already does. The goal is only that it scales with more and more tasks which means you should be able to easily add your new task and have it self contained without needing to change the commander adding a switch for the task and all the data handling and checks. Handling basic flight modes like rate, attitude, position and failsafes and checking if the vehicle is in good shape stays responsibility of the commander. Determining what the vehicle does in position controlled flight like e.g. follow me and if it's actually possible because a target to follow is available is responsibility of the task.
Good question. Initially I thought the user needs to switch back by selecting the manual position flight mode on screen or with the mode switch but we can still define e.g. that if any position controlled task fails and the user has a remote control connected the fallback is to stop and enable the stick control. Likely a useful feature for ease of use.
agree

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

julianoes commented 5 years ago

I think the discussion will come up again.

MaEtUgR commented 5 years ago

@julianoes I think it's a very important topic to make sure we're all pulling on the same rope. Can we discuss some high-level possibilities in the next dev call?

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

MaEtUgR commented 4 years ago

We need to reflect on the topic and follow-up with adapted plans otherwise this issue is useless.

LorenzMeier commented 4 years ago

I'm closing this because this is just stale. We need to error on the side of having relevant issues open and actually following up and have not as much fear of loosing information.

Open source projects live and die around active contributors.

PX4 / PX4-Autopilot