Discussion

In relation to autowarefoundation/autoware_ai#396, we need to create the design overview of planning. According to @dejanpan , we can start this by defining inputs/outputs and logics in these 4 layers.

Global planning
Behaviour planning
Local planning
Control

Thanks to him for the suggestion. See, for example, https://github.com/CPFL/Autoware/issues/1677#issuecomment-435721786

Global planning => on-demand service, interface to a priori map and traffic information
Behaviour planning => open problem. DARPA Urban Challenge: rule-based, other companies: search-based methods, our favorite: robust POMDP
Local planning => some deterministic sampling-based methods (GMT) or a form of MPC, pre-recorded path, RRT
Control

Inputs to this module: track predictions map pose

I think the Motion layer in the present Autoware architecture (https://github.com/CPFL/Autoware/wiki/Overview) includes 3. Local planning and 4. Control. I created a draft design for these and posted it at https://github.com/CPFL/Autoware/issues/1677#issuecomment-438166365. [Image: component_diagram_motion]

@aohsato: @makokal volunteered to lead the redesign for this package. I will chime in after his initial draft.

Responding to @dejanpan 's request for comment:

I think the main quality of a good stack or architecture is that it should be easy to extend.

I think this is especially important as Autoware is a very ambitious open source project.

Generally, to achieve a stack that is easy to extend, the following is very helpful:

Separation of concerns: each logical entity (class, node etc.) does one thing:
- Easier to understand
Well defined interfaces

With respect to this, I think from a conceptual, robotics algorithms level, the standard planning stack of:

Global planning
Behavior planning
Motion planning
Control
Vehicle interface

Is a good one, as it logically separates planning into distinct algorithm domains, making it easy to reason about planning from an algorithm perspective.

How each of the five components is structured internally, whether more class-based or node-based, is a matter of debate. I personally favor a more monolithic, class-based means of separating concerns.

With respect to this, I would advocate that we define standard interfaces (e.g. topic names and types) to each component of the stack.

If we can achieve this, then we would be able to interchange algorithms at each level from various contributors.

The thinking then is that we could have a hierarchy of launch files, e.g.

# planning.launch.py
global_planner = 'global_planner.launch.py'
behavior_planner = 'behavior_planner.launch.py'
# .. etc..

Where then each particular launch file can e.g. dispatch to many nodes, as in @aohsato 's design, or a single node, if it is from another contributor who designs it that way.

That said, here is a first proposal for interface messages. Here, I am favoring standard messages where possible:

Global planning
- Input: Waypoint?
- Input: PoseWithCovarianceStamped
- Output: Waypoint?
Behavior planning
- Input: PoseWithCovarianceStamped
- Input: Waypoint?
- Output:
Motion planning
- Input:
- Input(Optional): PoseWithCovarianceStamped
- Output: JointTrajectory
Control
- Input: PoseWithCovarianceStamped
- Input: JointTrajectory
- Output: AccelStamped
Vehicle interface
- Input: AccelStamped
- Output: Direct commands to the drive-by-wire interface (e.g. voltage signal)

Remarks:

The vehicle interface takes AccelStamped as input because, in my experience, this is the layer where you typically map between an acceleration command and a voltage signal
JointTrajectory allows you to capture higher-derivative information, e.g. in the case of trajectory planning
PoseWithCovarianceStamped can be used instead of PoseStamped because uncertainty may be used in implementers' algorithms. IMO the memory burden and communication overhead of this message should be negligible
Global pose is an optional input for the Motion planner because in most senses it is a local planner, so a fixed position is not strictly necessary, depending on how you set up the problem
Input/output for the global planner depends on how locations are broadly represented in the global frame and in the map. For this we could maybe use something like Waypoint, which is extremely broad, but I think this, along with the interfaces for the behavior planner, is an open question
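To make the proposal above a bit more tangible, here is a minimal C++ sketch of the two lower layers as swappable callables with fixed input/output types. The structs are simplified, hypothetical stand-ins for the ROS messages named in the list (their fields do not match the real message definitions), `BehaviorOutput` is a placeholder for the behavior planner output that the proposal leaves open, and `run_cycle` is an illustrative helper rather than anything that exists in Autoware.

```cpp
#include <functional>
#include <vector>

// Hypothetical, simplified stand-ins for the message types in the proposal.
struct PoseWithCovarianceStamped { double x{0.0}; double y{0.0}; double yaw{0.0}; };
struct BehaviorOutput {};  // left open in the proposal, so deliberately empty here
struct TrajectoryPoint { double time_from_start{0.0}; double velocity{0.0}; double acceleration{0.0}; };
struct JointTrajectory { std::vector<TrajectoryPoint> points; };
struct AccelStamped { double linear_x{0.0}; };

// Each layer is a callable with a fixed signature, so implementations from
// different contributors can be exchanged without touching the other layers.
using MotionPlannerFn =
    std::function<JointTrajectory(const BehaviorOutput&, const PoseWithCovarianceStamped&)>;
using ControllerFn =
    std::function<AccelStamped(const PoseWithCovarianceStamped&, const JointTrajectory&)>;

// Runs one cycle through motion planning and control.
inline AccelStamped run_cycle(const MotionPlannerFn& plan, const ControllerFn& control,
                              const BehaviorOutput& behavior,
                              const PoseWithCovarianceStamped& pose) {
  const JointTrajectory trajectory = plan(behavior, pose);
  return control(pose, trajectory);
}
```

In a ROS setting the same contract would be expressed through topic names and message types rather than function signatures. The point of the sketch is only that the types at each boundary are fixed while the implementations are not.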

Just some suggestions and discussion points. I think on the motion/local planning and control level, those proposals are fairly concrete, as the interfaces here are pretty well understood. At the higher behavior and global planning levels, it might be challenging to design a good interface, since the solutions and problem settings become much broader.

Hi @cho3 @dejanpan @aohsato , I am late to this conversation, so I thought I would start by aligning on the task.

In my opinion, let's first separate the design by functionality before going into details on the interface types. This means hashing out the role of these modules first and also deciding which ones are processes vs libraries. A process communicating with a library does not have to go through ROS, which is sometimes very useful. Afterwards the interface details become more apparent.

Here is a first shot at this.

Processes

Mission planner
- Determine the sequence of lanes and lane segments for a concrete task, such as driving to a parking lot.
Behavior/Maneuver planner
- Assess the scene with the help of predictions to determine relevant constraints for motion planning, i.e. produce a motion plan specification.
- Also calls the motion planner and assesses the output using additional libraries, like a constraint checker, to measure adherence to, say, safety rules.
Controller/Tracker
- Track a given motion plan and continuously report status upwards.
- Vehicle interface should be abstracted away from this to allow for interchanging of controllers, e.g. versions of pure pursuit.

For these processes, commands flow from the top (mission -> behavior -> motion -> control), with the information update rate increasing at each level down. The controller runs fastest (100 Hz, 50 Hz). Status information is reported upwards. This ensures a single source of 'truth' to allow for consistent decisions.

Libraries

Motion planning
- Ego lane planner
- Lane change planner
- Parking planner
- etc, loadable and switchable via the behavior process.
Prediction (from perception)
Constraint checker
Scene model
- Representation of the map around Ego and convenience functions for manipulating it. This relies heavily on a proper abstraction of the underlying map format.

With this separation, ROS communication is only needed between the processes.

Additional note: parameter handling is crucial across the full hierarchy here, so I highly recommend using namespaces like /motion_planner/ego_lane/param_x
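As a rough illustration of the namespacing idea, the sketch below keys parameters by fully qualified names such as /motion_planner/ego_lane/param_x, so that two planners can each own a `param_x` without clashing. `ParameterStore` is a hypothetical in-memory helper written for this example. It is not a ROS API, and a real system would use the ROS parameter mechanisms instead.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical in-memory parameter store keyed by fully qualified names.
class ParameterStore {
public:
  // Builds "<namespace>/<name>", e.g. "/motion_planner/ego_lane/param_x".
  static std::string qualify(const std::string& ns, const std::string& name) {
    return ns + "/" + name;
  }

  void set(const std::string& ns, const std::string& name, double value) {
    values_[qualify(ns, name)] = value;
  }

  // Throws std::out_of_range if the parameter was never set.
  double get(const std::string& ns, const std::string& name) const {
    const auto it = values_.find(qualify(ns, name));
    if (it == values_.end()) {
      throw std::out_of_range("unknown parameter: " + qualify(ns, name));
    }
    return it->second;
  }

private:
  std::map<std::string, double> values_;
};
```

With this layout, `store.get("/motion_planner/ego_lane", "param_x")` and `store.get("/motion_planner/lane_change", "param_x")` resolve to independent values.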

@aohsato @cho3 @dejanpan comments on ^^?

@makokal :turkey: :turkey: :turkey: Your comment fell a little awkwardly before the short holidays :turkey: :turkey: :turkey:.

Regarding what is written, I broadly agree. I was under the false impression that some of these were a given, but it's best to be as clear as possible.

As far as general development practice goes:

All code should generally be as decoupled as possible to minimize complexity
This implies any ROS code should generally be a boilerplate wrapper around some class in a library
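A minimal sketch of what "ROS code as a boilerplate wrapper" could look like: the algorithm lives in a plain class with no ROS dependency, and the node-like wrapper only converts messages and forwards calls. All names here (`SpeedController`, `SpeedMsg`, `AccelMsg`, `SpeedControllerNode`) are hypothetical, the proportional control law is only a placeholder, and the publisher is modelled as a `std::function` so the example compiles without ROS.

```cpp
#include <functional>
#include <utility>

// Library code: testable on its own, no ROS headers required.
class SpeedController {
public:
  explicit SpeedController(double gain) : gain_(gain) {}

  // Placeholder proportional law, for illustration only.
  double compute_acceleration(double target_speed, double current_speed) const {
    return gain_ * (target_speed - current_speed);
  }

private:
  double gain_;
};

// Hypothetical message types standing in for real ROS messages.
struct SpeedMsg { double target{0.0}; double current{0.0}; };
struct AccelMsg { double value{0.0}; };

// Wrapper: converts messages and delegates. In a real node, `publish_` would
// be a ROS publisher and `on_speed` a subscription callback.
class SpeedControllerNode {
public:
  SpeedControllerNode(SpeedController controller,
                      std::function<void(const AccelMsg&)> publish)
  : controller_(std::move(controller)), publish_(std::move(publish)) {}

  void on_speed(const SpeedMsg& msg) const {
    AccelMsg out;
    out.value = controller_.compute_acceleration(msg.target, msg.current);
    publish_(out);
  }

private:
  SpeedController controller_;
  std::function<void(const AccelMsg&)> publish_;
};
```

Because the wrapper holds no logic of its own, the same `SpeedController` can be unit tested, embedded in another process, or wrapped by a different middleware.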

RE Mission/global planner:

I generally agree with how you set up the problem here
I'm not sure if you're intending this or not, but I have some concern that a "lane sequence" might be taken to be too binding depending on the lower level planning setup (e.g. use case I'm thinking here is driving on a long highway--your lane sequence is just one for 100 miles, then an exit)
For above, we might want to consider adding some label of "constraint" for some elements of the sequence that denote that a certain lane segment must be satisfied by the subplanners, but that's more an interface discussion
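One possible shape for such a "constraint" label, purely as a discussion aid: each element of the route carries a flag saying whether the lower-level planners must pass through it. `RouteElement`, its field names and `binding_segments` are hypothetical, and the 64-bit identifier is an assumption about how map entities might be referenced.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical route element: a lane segment plus a flag marking whether
// sub-planners are required to traverse it.
struct RouteElement {
  std::uint64_t lane_segment_id;
  bool binding;  // true: must be satisfied; false: advisory only
};

// Returns the ids of the segments that lower-level planners must honor.
inline std::vector<std::uint64_t> binding_segments(const std::vector<RouteElement>& route) {
  std::vector<std::uint64_t> ids;
  for (const RouteElement& element : route) {
    if (element.binding) {
      ids.push_back(element.lane_segment_id);
    }
  }
  return ids;
}
```

In the highway use case, the long stretch of lane segments would be advisory, leaving lane choice to the lower planners, and only the exit segment would be marked as binding.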

RE Behavior planner:

Generally agree with the output (i.e. setting up relevant constraints for motion planner as a result)
Again, depending on how the problem is set up, map information may/should be needed here
Constraint checker makes sense--we could probably shove the dependency on maps into the constraint checker and into prediction
If I'm understanding you correctly, you're aiming for the maneuver/motion planner to be embedded in the behavior planner--I don't strictly agree with this:
- My main concern is that in pre-specifying a certain architecture, you are implying a certain kind of solution, which I think is not strictly what we're going for with Autoware
- The way I understand it, this tight coupling lends itself more to a classical/deterministic planning paradigm, where it makes sense to have many or more calls to a maneuver/motion planner
- This would sort of preclude a more decoupled, probabilistic planning paradigm, where the requested maneuver can succeed or fail
- Of course, one thing we can say is that the behavior planner and maneuver planner are tightly coupled, so we make no prescriptions on how the two interact, only what the inputs to the two combined are, and what we expect the output of the two to be in aggregate
- Further, if we keep things as nice libraries (especially motion planners) with a consistent interface, then it should be trivial to break out motion planners into nodes for a more decoupled intermediate planning paradigm

RE Controller:

Agree, but I'm not entirely sure what status would be reported upward, tracking error?

RE libraries:

All sound good
For something like motion planning, again I would prescribe some kind of consistent interface to make life easier, and I'm perfectly fine with making them loadable and switchable as long as it doesn't preclude other modes of using these libraries (e.g. as standalone nodes, etc.)
Seems to me like scene model and constraint checker are at least lightly coupled--we would probably need to get a good understanding/list of constraints and behaviors and iterate back and forth (from a planning architecture perspective) to get a better scene model

I guess my main comments in short are:

We should go further and make everything libraries, and have ROS code only be boilerplate
I think we should be careful about over prescribing structure in the architecture so we don't block off potential solutions
We should maybe treat motion + behavior planning as a unit from the input/output perspective, with some recommendations on how they might interact
We need a planning interface (look at OMPL?)
Scene model is important--we should look at how other people do it too

@cho3 I concur very much.

RE- global/mission planner - Yes, lane sequence is not the right abstraction. I think waypoints (with some possible semantic meaning) are a good start.

RE-Behaviors:

I think that embedding the motion planner in a maneuver planner is just one of many possible architectural setups. I agree that interfaces for motion planning should be well defined to allow for both (use in bp, as well as standalone). Having motion planning as a library of planners encourages this separation.

RE- controllers

I envision status updates about the health of the node itself as well as any online metrics. Again, the interfaces are key here and the concrete implementations can be left to the user. The idea is to equip high-level modules with some capacity to intervene in recoverable cases instead of just resorting to, say, an emergency stop if things go bad.

I also agree about keeping things as ROS-agnostic as possible and only adding wrappers to tie the modules to ROS at the end. Scene modeling is the piece that is hard to find examples of, yet it can vastly affect the interfaces in motion planning (e.g. the representation of obstacles).

I haven't looked deeply in OMPL, but AFAIK it is mostly geared towards manipulation planning, so I am not sure how much of that interface fits AD.

Example MP interface I had in mind:

class MotionPlanner {
// ...
  virtual ReturnType plan(
     const WorkspaceType& ws, 
     const MPTask& task, 
     const MPConstraints& constraints, 
     ...) const = 0; 
//...
};

Then have a factory for registering and initializing motion planners.

...
  MPType create(const MPParams& params, ...);
...
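For illustration, a registration-based factory along these lines could be sketched as follows. This is an assumption about how the idea might be realized, not the intended design: `MotionPlannerFactory`, `register_planner` and `EgoLanePlanner` are hypothetical names, `MPParams` is given a single made-up field, and the `MotionPlanner` base class is reduced to a `name()` method so the example stays self-contained.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct MPParams { double max_speed{0.0}; };  // hypothetical contents

// Reduced base class; the real interface would expose plan(...).
class MotionPlanner {
public:
  virtual ~MotionPlanner() = default;
  virtual std::string name() const = 0;
};

// Maps a type string to a creator function, so planners can be loaded and
// switched by name (e.g. from the behavior process).
class MotionPlannerFactory {
public:
  using Creator = std::function<std::unique_ptr<MotionPlanner>(const MPParams&)>;

  void register_planner(const std::string& type, Creator creator) {
    creators_[type] = std::move(creator);
  }

  // Throws std::invalid_argument for types that were never registered.
  std::unique_ptr<MotionPlanner> create(const std::string& type, const MPParams& params) const {
    const auto it = creators_.find(type);
    if (it == creators_.end()) {
      throw std::invalid_argument("unknown planner type: " + type);
    }
    return it->second(params);
  }

private:
  std::map<std::string, Creator> creators_;
};

// Example planner that keeps its configuration as a member.
class EgoLanePlanner : public MotionPlanner {
public:
  explicit EgoLanePlanner(const MPParams& params) : params_(params) {}
  std::string name() const override { return "ego_lane"; }
  const MPParams& params() const { return params_; }

private:
  MPParams params_;
};
```

A contributor would register a creator once at start-up, and the calling code would then request planners by type string without knowing the concrete class.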

RE global planner:

I think how we handle this might depend on what paradigm is chosen for handling maps
Mainly I see 2-3 ways of interacting with maps:
1. Shared database (e.g. each relevant node can access/query the same map database)
2. Decoupled (e.g. only top level node, such as global planner, has access to map, and only relevant info is passed via messages down)
3. Server (similar to 1, but map is a ROS server, rather than some indeterminate object in shared memory)
1 probably wouldn't work out too well in a multi-machine context (of course you could have one instance per machine)
The upshot of 1/3 is that it simplifies the messages passed around in the planning stack, though you're pushing the coupling away to the map abstraction
If we pursue a 1/3 style approach (where each relevant node has direct access to the map DB), then I would suggest something like Uint64MultiArray for a first pass. The main reason is that the assumption that each map entity should be uniquely identifiable by 64 bits seems fairly reasonable, and plus this is a built-in type which makes life a little bit easier as a first pass
If we shoot for something like 2, then I have no idea. I think for that we would have to come up with a couple variations of map-based prediction, localization, and behavior planning to get a sense for what would be needed there

RE-controllers:

Do you have any concrete, high level behaviors you're envisioning here? Or even in the case of recovery behavior for each component of the planning stack? I ask this because I think it's impossible to design a reasonable interface without some concrete use-cases, even if we push the problem of implementation to others

RE-interface:

OMPL has a very reasonable API and separation of concerns for motion planning that at least bears some studying
MoveIt! IMO is more ROS-y in terms of its paradigms, and less useful for general motion planning
Some other things we would need to be aware of and think about in designing an interface are different kinds of planners and planning paradigms, e.g.
- Trajectory planning (combined lateral and longitudinal) vs path + velocity planning decomposition (e.g. what Autoware currently does)
- Optimization-based methods vs sampling-based methods vs hybrid (n-stage) methods
- etc...
I personally take more of an optimization-based view of motion planning (and everything else for that matter), so from this view, just some random things to consider for an interface/problem definition:
- objective function (e.g. progress, smoothness, weights etc..)
- state space/motion model (e.g. dubins bicycle, first order wheels etc.., what level of derivative to control..)
- time horizon
- discretization level (in time or space domain)
- initial conditions
- state space constraints (e.g. 0 <= x <= 10)
- obstacle constraints
- waypoint constraints
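To make the list above concrete, a problem definition for an optimization-based planner might be bundled roughly like this. It is only a sketch under stated assumptions: all type and field names are hypothetical, the discretization is assumed to be in the time domain, and obstacle and waypoint constraints are omitted because their representation depends on the scene model.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Bounds { double lower; double upper; };

// Hypothetical objective weights (progress vs. smoothness).
struct ObjectiveWeights { double progress{1.0}; double smoothness{1.0}; };

// Hypothetical problem definition handed to a planner.
struct PlanningProblem {
  ObjectiveWeights weights;
  double time_horizon_s{0.0};          // time horizon
  double time_step_s{0.0};             // discretization level (time domain)
  std::vector<double> initial_state;   // initial conditions
  std::vector<Bounds> state_bounds;    // state space constraints, one per dimension
};

// Number of discretization steps implied by horizon and step size.
inline std::size_t num_steps(const PlanningProblem& problem) {
  if (problem.time_step_s <= 0.0 || problem.time_horizon_s <= 0.0) {
    return 0;
  }
  return static_cast<std::size_t>(std::llround(problem.time_horizon_s / problem.time_step_s));
}

// Checks a state against the box constraints (e.g. 0 <= x <= 10).
inline bool satisfies_bounds(const PlanningProblem& problem, const std::vector<double>& state) {
  if (state.size() != problem.state_bounds.size()) {
    return false;
  }
  for (std::size_t i = 0; i < state.size(); ++i) {
    if (state[i] < problem.state_bounds[i].lower || state[i] > problem.state_bounds[i].upper) {
      return false;
    }
  }
  return true;
}
```

The motion model and the order of derivative being controlled are left out here. They would likely be template or type parameters of the planner rather than fields of the problem.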

RE-example-interface

If I'm understanding your interface correctly, why would we need workspace to be a parameter rather than keeping the workspace as a set of class variables? Or do you mean workspace to mean something like the broad solution domain of the problem (e.g. 0 <= x <= 10)?

RE global planner:

* I think how we handle this might depend on what paradigm is chosen for handling maps

* Mainly I see 2-3 ways of interacting with maps:

  1. Shared database (e.g. each relevant node can access/query the same map database)
  2. Decoupled (e.g. only top level node, such as global planner, has access to map, and only relevant info is passed via messages down)
  3. Server (similar to 1, but map is a ROS server, rather than some indeterminate object in shared memory)

* 1 probably wouldn't work out too well in a multi-machine context (of course you could have one instance per machine)

* The upshot of 1/3 is that it simplifies the messages passed around in the planning stack, though you're pushing the coupling away to the map abstraction

* If we pursue a 1/3 style approach (where each relevant node has direct access to the map DB), then I would suggest something like [Uint64MultiArray](http://docs.ros.org/jade/api/std_msgs/html/msg/UInt64MultiArray.html) for a first pass. The main reason is that the assumption that each map entity should be uniquely identifiable by 64 bits seems fairly reasonable, and plus this is a built-in type which makes life a little bit easier as a first pass

* If we shoot for something like 2, then I have no idea. I think for that we would have to come up with a couple variations of map-based prediction, localization, and behavior planning to get a sense for what would be needed there

I agree with the options. I just want to add that the full map, or even a large portion of it, is not needed by most of these components. Only routing and lane-level reasoning need access to large portions of the map, and they mostly use topological information plus some geometric information. I think these can be represented by some object built from a map (which could live in a server, database, etc.), and that object is updated rarely. More intermediate components like motion planners, the behavior planner and controllers do not really need access to a map, but rather just a definition of the workspace. These could again be objects built from either the map portion used in routing or a combination. These workspace objects will need to be constructed every time a planning 'call' is made, so it is important that the process is efficient. For optimization-based planners, for instance, these workspace objects could then represent continuous spaces. This also decouples planning from how localization interacts with maps: the routing and planning modules then only consume the output of a localization module (as some 'State' object, again an interface).

RE-controllers:

* Do you have any concrete, high level behaviors you're envisioning here? Or even in the case of recovery behavior for each component of the planning stack? I ask this because I think it's impossible to design a reasonable interface without some concrete use-cases, even if we push the problem of implementation to others

Mainly state of the controller process, as in health and online tracking errors. The idea is to provide a channel for a user to propagate such diagnostics upwards and do something with them if need be or just log for debug purposes.
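A small sketch of what such an upward status channel could carry, and how a higher-level module might use it to attempt recovery before falling back to an emergency stop. Everything here is hypothetical: the `ControllerStatus` fields, the `Reaction` options and the threshold-based policy are made up for illustration and are not part of any agreed interface.

```cpp
#include <cmath>

enum class Health { Ok, Degraded, Failed };

// Hypothetical status message reported upward by the controller process.
struct ControllerStatus {
  Health health;
  double lateral_error_m;     // online tracking error
  double heading_error_rad;   // carried for logging, unused by the policy below
};

enum class Reaction { Continue, Replan, EmergencyStop };

// Illustrative policy for a higher-level module: try a recoverable reaction
// (replanning) before resorting to an emergency stop.
inline Reaction react(const ControllerStatus& status, double max_lateral_error_m) {
  if (status.health == Health::Failed) {
    return Reaction::EmergencyStop;
  }
  if (status.health == Health::Degraded ||
      std::abs(status.lateral_error_m) > max_lateral_error_m) {
    return Reaction::Replan;
  }
  return Reaction::Continue;
}
```

Even if no module reacts to the status at first, logging the same struct gives the debugging trail described above.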

RE-interface:

* [OMPL](http://ompl.kavrakilab.org/api_overview.html) has a very reasonable API and separation of concerns for motion planning that at least bears some studying

Took a look. I like their models for spaces, objective functions and all. I think we can borrow a lot from these. What I didn't see (maybe I didn't look hard enough) was arc-length-parameterized spaces, i.e. Frenet coordinates, which are prevalent in AV.
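For readers less familiar with the term, the sketch below shows one simple way to compute arc-length (Frenet-style) coordinates: project a point onto a piecewise-linear reference path and report the arc length `s` of the closest point and the signed lateral offset `d`. This is a simplification for illustration. The sign convention (positive `d` to the left of the travel direction) and the polyline reference are assumptions, and production code would typically use a smooth reference curve.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x; double y; };
struct FrenetPoint { double s; double d; };

// Projects p onto the polyline `path` and returns (s, d) for the closest point.
// Returns (0, 0) if the path has fewer than two distinct points.
inline FrenetPoint to_frenet(const std::vector<Point>& path, const Point& p) {
  FrenetPoint best{0.0, 0.0};
  double best_dist2 = std::numeric_limits<double>::infinity();
  double s_start = 0.0;  // arc length at the start of the current segment
  for (std::size_t i = 0; i + 1 < path.size(); ++i) {
    const double dx = path[i + 1].x - path[i].x;
    const double dy = path[i + 1].y - path[i].y;
    const double len2 = dx * dx + dy * dy;
    if (len2 == 0.0) {
      continue;  // skip degenerate segments
    }
    const double len = std::sqrt(len2);
    // Normalized projection parameter, clamped to the segment.
    double t = ((p.x - path[i].x) * dx + (p.y - path[i].y) * dy) / len2;
    t = std::max(0.0, std::min(1.0, t));
    const double ex = p.x - (path[i].x + t * dx);
    const double ey = p.y - (path[i].y + t * dy);
    const double dist2 = ex * ex + ey * ey;
    if (dist2 < best_dist2) {
      best_dist2 = dist2;
      const double cross = dx * ey - dy * ex;  // > 0 when p is left of the segment
      best.s = s_start + t * len;
      best.d = (cross >= 0.0 ? 1.0 : -1.0) * std::sqrt(dist2);
    }
    s_start += len;
  }
  return best;
}
```

For a path that runs 10 m east and then 10 m north, a point 2 m to the left of the first segment at x = 3 maps to roughly s = 3, d = 2.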

* [MoveIt!](http://moveit.ros.org/documentation/concepts/) IMO is more ROS-y in terms of its paradigms, and less useful for general motion planning

* Some other things we would need to be aware of and think about in designing an interface are different kinds of planners and planning paradigms, e.g.

  * Trajectory planning (combined lateral and longitudinal) vs path + velocity planning decomposition (e.g. what Autoware currently does)
  * Optimization-based methods vs sampling-based methods vs hybrid (n-stage) methods
  * etc...

* I personally take more of an optimization-based view of motion planning (and everything else for that matter), so from this view, just some random things to consider for an interface/problem definition:

I like the optimization approach very much too. I think we can have some basic interfaces for all of these and have a reference implementation of an optimization based planner from which other users can derive and add more.

  * objective function (e.g. progress, smoothness, weights etc..)
  * state space/motion model (e.g. dubins bicycle, first order wheels etc.., what level of derivative to control..)
  * time horizon
  * discretization level (in time or space domain)
  * initial conditions
  * state space constraints (e.g. 0 <= x <= 10)
  * obstacle constraints
  * waypoint constraints

RE-example-interface

* If I'm understanding your interface correctly, why would we need workspace to be a parameter rather than keeping the workspace as a set of class variables? Or do you mean workspace to mean something like the broad solution domain of the problem (e.g. 0 <= x <= 10)?

Partially answered above, but the idea is to have planners separate run-time changeable things from configuration things. Params are set at initialization and kept as member variables (with hooks for updates at some arbitrary rate), while the problem/task description, which entails the workspace, constraints, etc., is set on each call. This also makes it possible to log these and recover the 'exact world' the planner used to solve the task.
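A minimal sketch of this split, assuming made-up types: configuration (`MPParams`) is stored in the planner at construction with an update hook, while the workspace and task are passed on every call and recorded so the exact inputs can be replayed later. `IntervalPlanner`, `PlanRecord` and `plan_and_record` are hypothetical, and the one-dimensional feasibility check merely stands in for real planning.

```cpp
#include <vector>

struct MPParams { double max_speed_mps; };          // configuration, set at init
struct Workspace { double x_min; double x_max; };   // per-call description of the world
struct MPTask { double goal_x; };                   // per-call task

// Everything needed to reproduce one planning call.
struct PlanRecord {
  MPParams params;
  Workspace workspace;
  MPTask task;
  bool feasible;
};

class IntervalPlanner {
public:
  explicit IntervalPlanner(const MPParams& params) : params_(params) {}

  // Hook for parameter updates at some arbitrary rate.
  void update_params(const MPParams& params) { params_ = params; }
  const MPParams& params() const { return params_; }

  // Stand-in for planning: the task is feasible if the goal lies in the workspace.
  bool plan(const Workspace& ws, const MPTask& task) const {
    return task.goal_x >= ws.x_min && task.goal_x <= ws.x_max;
  }

private:
  MPParams params_;
};

// Calls the planner and logs the exact inputs it saw together with the outcome.
inline bool plan_and_record(const IntervalPlanner& planner, const Workspace& ws,
                            const MPTask& task, std::vector<PlanRecord>& log) {
  const bool feasible = planner.plan(ws, task);
  log.push_back(PlanRecord{planner.params(), ws, task, feasible});
  return feasible;
}
```

Keeping `plan` const and passing the world in as arguments means a logged `PlanRecord` is sufficient to re-run the call offline.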

Finally, I think the discussion so far has been very fruitful, how should we go forward now? UML style design?

@makokal Glad this discussion was helpful.

I guess for next steps we need to consider what the purpose of this discussion to hash out an architecture was. IMO the point of hashing out an architecture is so that we can establish standards to:

Make the code quality better
Make it easier for developers to develop algorithms that can interact with other developers' algorithms

In light of that I guess some kind of document would be best. IMO it should probably have:

General development guidelines (e.g. ROS code should only be boilerplate, turn on all warnings, not doing XYZ will get your MR rejected, etc. most open source projects have developer guidelines IIRC)
High level description of architecture (e.g. simple block diagrams) + high level descriptions of what each component does (high level UML here?)
More detailed descriptions of the architecture and components (class-level UML here?):
- the more concrete expected behavior of each component
- the API's for commonly defined libraries (i.e. motion planning)
- the interface (i.e. message definitions)
- some sample use cases and expected behavior in each case

@makokal Glad this discussion was helpful.

I guess for next steps we need to consider what the purpose of this discussion to hash out an architecture was. IMO the point of hashing out an architecture is so that we can establish standards to:
1. Make the code quality better

2. Make it easier for developers to develop algorithms that can interact with other developers' algorithms
Very well articulated.

In light of that I guess some kind of document would be best. IMO it should probably have:
1. General development guidelines (e.g. ROS code should only be boilerplate, turn on all warnings, not doing XYZ will get your MR rejected, etc. most open source projects have developer guidelines IIRC)
Should we do this on this repo, maybe via a series of Markdown docs or elsewhere?
2. High level description of architecture (e.g. simple block diagrams) + high level descriptions of what each component does (high level UML here?)

3. More detailed descriptions of the architecture and components (class-level UML here?):

   * the more concrete expected behavior of each component
   * the API's for commonly defined libraries (i.e. motion planning)
   * the interface (i.e. message definitions)
   * some sample use cases and expected behavior in each case

I think we should get this started right away so that any lingering issues/decisions can be surfaced sooner.

@makokal

Should we do [development guidelines] on this repo, maybe via a series of Markdown docs or elsewhere?

I think standard practice is that we have a (series?) of markdown documents in the repository wiki (e.g. Developer/Contributor Guidelines etc.). We could also have a section of the readme in the planning section of the repo, e.g.:

# Example file structure
AutowareAuto/
| src/
| | planning/
| | | README.md # <-- here
| | perception/
| | # ...

If the larger project also takes up some of these guidelines, we could also have the highlights be in some kind of message when a user is trying to open a MR (I don't know if this is possible).

I think we should get this started right away so that any lingering issues/decisions can be surfaced sooner.

Sounds like a plan. Did you want to take a first stab at it? If not, I should hopefully have some time to sit down and string some words together within a few days.

Dear @makokal , @cho3 this is a great discussion. I am looking forward to seeing your design document.

Let me share with you some thoughts:

Planning has a very wide range of implementations and standards, so fixing the design on what you prefer or like (for example, an optimization-based trajectory planner) is not the way to go. You should think more generally.
You should not include vehicle_control/path_following in the planning design. It is a different problem.
Planning modules should support simulation, planning based on HD maps, and planning based on occupancy grids (free-space planners).

- Kindly check OpenPlanner, which is part of Autoware:

For the design and scientific basis, see the paper: https://www.fujipress.jp/jrm/rb/robot002900040668/
See also the tutorial and illustration videos on YouTube: https://youtu.be/FKM8v79X3_s

Initially, the OpenPlanner design separated Global Planning (way_planner) and Local Planning (dp_planner). You can still find these nodes, tested in simulation mode.
Secondly, last year we had a discussion with the Autoware team about dividing dp_planner into several nodes (a similar proposal to what @dejanpan suggested). Check the pull request: https://github.com/CPFL/Autoware/pull/1400

There is one library, op_planner, which contains the planning functions and logic. The nodes are only concerned with message passing, parameters, synchronization and visualization (same as suggested), with the addition that I would like to separate out the visualization in the future.

The only enforcement for others to implement their own sub modules is the nodes interface definition: OpenPlanner.Nodes.pdf

autowarefoundation / autoware_ai

[Discussion] Planning Architecture #419
