The delay is not problem. But scene is hard to use the framework.

uulm-mrm / ros2_def

Deterministic Execution Framework for ROS 2

https://uulm-mrm.github.io/ros2_def

Apache License 2.0

The delay is not problem. But scene is hard to use the framework. #8

Open fengmao31 opened 4 months ago

fengmao31 commented 4 months ago

The delay is not important for industry: engineers can place the orchestrator in a lower layer of the middleware, which avoids serialization. Alternatively, one can avoid serialization by receiving raw messages, or use a serialization-free RPC method such as capnp.

First, I don't know which scenarios this framework is suited for. On the one hand, some robotics frameworks provide message fusion: only the arrival of the main message activates the main callback and sends a message. When other messages arrive, they are stored and updated, or they run their own callbacks. In this situation, the last callback is guaranteed to get the information from the other messages. On the other hand, can you give examples from autonomous driving or robotics for the "Inputs From Parallel Processing Chains" and "Multiple Publishers on the Same Topic" situations?

Second, there is something in your paper that I cannot understand: when a timer triggers a node in your system, how do you make sure that the same input produces the same output?

ottojo commented 4 months ago

Hi! Thanks for your interest in this and your questions.

Your first example is exactly one of the main situations this framework addresses: Some "fusion" node which receives multiple asynchronous inputs, and behaves differently depending on the order of the incoming messages.
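To make the order dependence concrete, here is a minimal sketch in plain Python (no ROS involved). The `FusionNode` class, its callback names, and the message values are hypothetical and only illustrate the pattern described above; they are not part of ros2_def.

```python
class FusionNode:
    """Toy fusion node: publishes on the main input, using the latest side input."""

    def __init__(self):
        self.latest_side = None  # most recently stored side-input value
        self.outputs = []        # stands in for messages published by the node

    def on_side(self, value):
        # Side input: only stored, no output is produced.
        self.latest_side = value

    def on_main(self, value):
        # Main input: triggers the output, fused with whatever side value is stored.
        self.outputs.append((value, self.latest_side))


def run(order):
    """Deliver (topic, value) pairs to a fresh node in the given order."""
    node = FusionNode()
    for topic, value in order:
        getattr(node, f"on_{topic}")(value)
    return node.outputs


# Same two messages, two different arrival orders.
side_first = run([("side", "tf@t1"), ("main", "cloud@t1")])
main_first = run([("main", "cloud@t1"), ("side", "tf@t1")])

print(side_first)  # [('cloud@t1', 'tf@t1')]
print(main_first)  # [('cloud@t1', None)]
```

The inputs are identical in both runs; only the callback order differs, and so does the published result.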

I'm not sure what you mean about "automatic drive or robotics", this framework specifically uses ROS which is widely used in automated driving and robotics, and this is also the context in which it was created. "Parallel processing chains" describe the software modules preceding such a "fusion" component as in the previous example.

"Multiple Publishers on the Same Topic" is a situation which, in ROS specifically, presents some difficulty since it's hard to differentiate between the senders of messages on the topic. A use-case might be a sensor-fusion node which can receive compatible inputs from a number of sources, which may not all be known in advance (or change dynamically).

Regarding timers: I'm not sure what exactly you refer to, feel free to link a specific section. Generally, timers are not that different from topic inputs, just that they are triggered by an incoming message on the /clock topic instead of a message containing some input data. The node itself must ensure that the timer callback behavior is deterministic, the orchestrator only ensures that the execution order of the timer callbacks is consistent.
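The idea that a timer is just another callback driven by clock messages can be sketched in plain Python. This is a simulation under stated assumptions, not rclpy or ros2_def code: `SimTimer`, `replay`, and the event list are hypothetical names invented for this illustration.

```python
class SimTimer:
    """Timer that only fires when a simulated clock message advances time."""

    def __init__(self, period_ns, callback):
        self.period_ns = period_ns
        self.callback = callback
        self.next_fire_ns = period_ns

    def on_clock(self, now_ns):
        # Fire once for every period boundary passed by this clock message.
        while now_ns >= self.next_fire_ns:
            self.callback(self.next_fire_ns)
            self.next_fire_ns += self.period_ns


def replay(events):
    """Replay a fixed sequence of clock and data events; return the callback log."""
    log = []
    timer = SimTimer(100, lambda t: log.append(("timer", t)))
    for kind, payload in events:
        if kind == "clock":
            timer.on_clock(payload)
        else:
            log.append(("data", payload))
    return log


events = [("clock", 50), ("data", "a"), ("clock", 100), ("data", "b"), ("clock", 250)]
print(replay(events))
# [('data', 'a'), ('timer', 100), ('data', 'b'), ('timer', 200)]
```

Because the timer reacts to clock messages rather than wall time, replaying the same event sequence always yields the same interleaving of timer and data callbacks. Whether the timer callback itself is deterministic remains the node's responsibility.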

I hope this helps!

fengmao31 commented 4 months ago

Message fusion can handle multiple input messages with different frequencies. However, when aiming for the latest transformation (tf) with high frequency, ensuring the execution order of incoming messages becomes challenging in your framework.

I would like more examples of "Parallel processing chains" and "Multiple Publishers on the Same Topic" in autonomous driving or robotics scenarios. It is rare to encounter a situation where multiple nodes publish to one topic. Regarding "Parallel processing chains", I can give you an example of a real scenario that confuses me. Perception, due to point cloud processing, is a module that heavily consumes CPU resources, and its latency fluctuates with the CPU load. Prediction, due to its interaction with models, also has fluctuating latency, which depends on the GPU load and the performance of the interactive interface. It is hard for me to ensure the execution order.

Thank you for your answer regarding the timer. I understand your design: timers are triggered by an incoming message on the /clock topic instead of a message containing some input data.

ottojo commented 4 months ago

In your diagram, the parallel chains would be "perception -> prediction -> planning" and "perception -> planning". For each input point cloud, the order of both callbacks at the planning node is not deterministic. This framework ensures that this becomes deterministic.
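A minimal sketch of this situation in plain Python follows. It is not the ros2_def implementation; the fixed `RELEASE_ORDER`, the function names, and the latency numbers are assumptions chosen for illustration. The latencies stand for the time until each chain's message reaches the planning node.

```python
# Fixed order in which the planning callbacks are released for each point cloud.
# Which input comes first is an arbitrary choice here; it only has to be fixed.
RELEASE_ORDER = ["perception", "prediction"]


def arrival_order(latency_ms):
    """Order in which the planning node sees its inputs without orchestration."""
    return sorted(latency_ms, key=latency_ms.get)


def orchestrated_order(arrived):
    """Buffer arriving inputs and release callbacks strictly in RELEASE_ORDER."""
    buffered = set()
    pending = list(RELEASE_ORDER)
    released = []
    for source in arrived:
        buffered.add(source)
        # Release every callback whose turn has come and whose input is present.
        while pending and pending[0] in buffered:
            released.append(pending.pop(0))
    return released


# Two runs on the same point cloud with different (hypothetical) latencies.
run_a = arrival_order({"perception": 15.0, "prediction": 45.0})
run_b = arrival_order({"perception": 40.0, "prediction": 25.0})

print(run_a, run_b)
# ['perception', 'prediction'] ['prediction', 'perception']
print(orchestrated_order(run_a), orchestrated_order(run_b))
# ['perception', 'prediction'] ['perception', 'prediction']
```

Without buffering, the callback order at the planning node follows the latencies. With buffering, both runs produce the same order.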

Parallel processing chains may also occur for example if multiple different perception modules process the pointcloud, with the output of both leading to the planning module (you can imagine for example one module which explicitly detects other vehicles, and one module which builds a grid-map for an alternative way of environment perception).

The "Multiple Publishers on the Same Topic" is admittedly less common, and I don't recall if it was even present in the example topology I examined in my thesis. I do think @authaldo may have an example of multiple sensors publishing grid-map type outputs, which all end up on the same topic for a fusion node to combine. But if that scenario doesn't occur in your application, that would not be unusual at all, and not a problem.

fengmao31 commented 4 months ago

Of course, your framework brings the advantage of determinism. However, when I consider how to apply your method to my diagram in an actual autonomous car, I run into dynamic delays, both in the perception module (due to CPU load and the complexity of point cloud processing) and in the prediction module (which waits for results from GPU computation).

Multiple Publishers on the Same Topic can be used to send a trigger signal that calls the recording program to record special scenarios such as intersections or exceptions. I have added an example.

ottojo commented 4 months ago

If you are referring to the additional delays caused by waiting in order to ensure a deterministic execution order: yes, that's a known limitation of this approach. It is also why it's mainly targeted at simulation, testing and evaluation of ROS stacks, and mostly not used during runtime on the actual robot. "Dynamic delays due to CPU load" are just what happens, and are the cause of the indeterminism addressed here. The delay will still be dynamic, but the resulting callback order will not depend on it.
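The latency cost of a fixed callback order can be illustrated with a small calculation in plain Python. This is a simplified model, not how ros2_def measures or schedules anything: the function names and arrival times are hypothetical, and callback execution time is ignored.

```python
def release_times(arrival_ms, order):
    """Release callbacks in a fixed order; each one waits for all predecessors."""
    released = {}
    t = 0.0
    for name in order:
        # A callback cannot run before its input arrives,
        # nor before the callbacks that precede it in the order.
        t = max(t, arrival_ms[name])
        released[name] = t
    return released


def added_wait(arrival_ms, order):
    """Extra waiting per callback compared to running it on arrival."""
    released = release_times(arrival_ms, order)
    return {name: released[name] - arrival_ms[name] for name in order}


order = ["perception", "prediction"]

# Arrival already matches the fixed order: no extra waiting.
print(added_wait({"perception": 15.0, "prediction": 45.0}, order))
# {'perception': 0.0, 'prediction': 0.0}

# Prediction arrives early but must wait for perception.
print(added_wait({"perception": 40.0, "prediction": 25.0}, order))
# {'perception': 0.0, 'prediction': 15.0}
```

In this model the arrival times still vary from run to run, and the extra waiting varies with them, but the order in which the callbacks run does not.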

Processing latency will always be increased with this framework, due to the nature of how it works. If latency is more important than determinism, don't use it.

In theory, when playing back data from a ROS bag, throughput could even be increased compared to without the framework by playing back at maximum speed without dropping messages, but the speed increase is not guaranteed of course, and will depend on the topology of the ROS node graph.

If you are looking for a method for improving consistency in the execution time of your ROS callbacks, this will not help you. Perhaps you should look in the direction of RT Linux (PREEMPT_RT) and more general realtime computing resources if you're concerned with timing guarantees. This framework's only goal is ensuring a deterministic callback order, which requires sacrifices in latency.

fengmao31 commented 4 months ago

"Dynamic delays due to CPU load" are just what happens, and are the cause of the indeterminism addressed here. The delay will still be dynamic, but the resulting callback order will not depend on it.

If we ensure a deterministic execution order in this context, it could lead to fluctuations in the frame rate of the autonomous driving data link, which would be highly risky.

fengmao31 commented 4 months ago

I still think your work is very good. Maybe we cannot apply your theory to global control, but it can be used to ensure a deterministic execution order locally. This could make the execution of callbacks more efficient. Your theory might also be meaningful for real-time scenarios.

ottojo commented 4 months ago

I'm looking forward to your research on increasing the efficiency of callback execution in ROS, and your application of my ideas for real-time use. If you want to increase performance of callback ordering here, I will gladly review your pull requests, or read your paper.

That said, anything that relaxes guarantees of global deterministic behavior is unlikely to be of interest within this framework. "A little bit of determinism" is not worth a lot: either the system is entirely deterministic, or it is nondeterministic...