faultfactory opened this issue 3 years ago
but the struggle here is dropped messages at the ROS layer.
It is the first time that such an issue has been reported. At which frequency are your messages coming in?
Do you see that loss as well, if you try it with socketcan_bridge?
All of the traffic I need to have a complete message is making it to the interface, which I can verify via candump (can-utils, not the candump in this repo).
Did you try it with candump in this repo?
where we are using can::SocketCanInterface [...] This does capture nearly everything and runs significantly more reliably, but it does consume significantly more resources.
Performance-wise there should not be much of a difference between the threaded and the non-threaded version. The first just runs the latter in a background thread. In both versions there is one additional background thread for Boost ASIO anyway.
End result is we're collecting a specific number of CAN messages, converting them to engineering units from hex/binary, and assembling a ROS message from them. In many (nearly all) cases the receipt order and the completeness of the full package matter.
I believe that this might be the root cause of your troubles.
Please make sure that you never block any of the listener callbacks (especially, if messages are coming in fast).
socketcan_interface does not do any input buffering at all.
However, the stack can read one message from the bus while the previous message is being processed.
This helps to deal with sporadic blocks.
You could use BufferedReader to decouple your user code properly.
It is the first time that such an issue has been reported. At which frequency are your messages coming in?
It's all automotive CAN, so it's 500 kbps and the messaging frequency varies, and the hardware does its own buffering, I believe, at the USB interface. I don't know the exact timing, but if I look at one second of the log I'm getting 2600 messages on one bus, but I'm taking action on a specific subset of that via the FilteredFrameListener. I actually am getting decent performance with the SocketCanInterface and driver.run() approach. The compute load just seems high. Only occasional missed frames.
Do you see that loss as well, if you try it with socketcan_bridge?
socketcan_bridge does not appear to have that issue, which is why I used it as my framework initially.
Did you try it with candump in this repo?
Yes, that does work, which is why I then took the driver.run() approach from it with the SocketCanInterface instead of the ThreadedSocketCanInterface.
Performance-wise there should not be much of a difference between the threaded and the non-threaded version. The first just runs the latter in a background thread. In both versions there is one additional background thread for Boost ASIO anyway.
Interesting, I thought that Boost ASIO would have been the culprit. Sounds like it's not the case.
End result is we're collecting a specific number of CAN messages, converting them to engineering units from hex/binary, and assembling a ROS message from them. In many (nearly all) cases the receipt order and the completeness of the full package matter.
I believe that this might be the root cause of your troubles. Please make sure that you never block any of the listener callbacks (especially, if messages are coming in fast).
socketcan_interface does not do any input buffering at all. However, the stack can read one message from the bus while the previous message is being processed. This helps to deal with sporadic blocks.
Our callbacks are quite small. When profiling the code, the conversion library functions made up 10% of the load, while the Boost ASIO and interface stack took up the remaining majority. I suppose the publish could be the culprit since we are not using ros::spin(), but I have not looked at that. We are using the Vector DBC library from Toby Lorenz to do our conversions, but the library is very efficient. I would be surprised if this was the issue.
You could use BufferedReader to decouple your user code properly.
This looks like exactly what I need.
Do you know if this buffered reader would respect the CAN message order (FIFO)?
Do you have any examples of this being used that I could take a look at? Would I be using it in place of the SocketCanInterface or alongside it? Would it matter if I used the threaded interface or not?
Our callbacks are quite small.
This does not mean that they cannot block ;) But the issue is more the grouping of the CAN frames.
Do you know if this buffered reader would respect the CAN message order (FIFO)?
Yes. Actually, this is guaranteed for all callbacks.
Do you have any examples of this being used that I could take a look at?
Would I be using it in place of the SocketCanInterface or alongside it? Would it matter if I used the threaded interface or not?
This is independent of the actual CAN interface. It replaces one FrameListener.
but I'm taking action on a specific subset of that via the FilteredFrameListener.
FilteredFrameListener is not optimized for performance! Every FilteredFrameListener will process all messages from the bus, pass the matches and then just drop the rest. How many are you using?
It might be faster if you just write your own filter in one callback (or using a BufferedReader). Just run the can::Frame through your can::FrameFilter instances and take the corresponding action.
Or just create a FrameListener for every single frame. You could use the same callback for each group. The FrameDispatcher uses a std::unordered_map, which will be way faster than the FilteredFrameListener.
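Sketched roughly, it could look like the snippet below. I am writing this from memory, so the MsgHeader type, the two-argument createMsgListener overload and the delegate style are assumptions; please verify them against interface.h of your installed release (older versions use a FrameDelegate instead of std::function).

```cpp
#include <socketcan_interface/threading.h>
#include <socketcan_interface/socketcan.h>
#include <vector>

// Hypothetical per-ID handlers: each one only ever sees "its" frames,
// because the FrameDispatcher looks the ID up in its unordered_map.
void handleObjectPart1(const can::Frame &f) { /* decode part 1 of the object list */ }
void handleObjectPart2(const can::Frame &f) { /* decode part 2 of the object list */ }

int main()
{
  can::ThreadedSocketCANInterface driver;
  driver.init("can0", false);  // device name and loopback flag are placeholders

  // NOTE: MsgHeader and the header-based createMsgListener overload are
  // written from memory -- check interface.h before relying on them.
  std::vector<can::FrameListenerConstSharedPtr> listeners;
  listeners.push_back(driver.createMsgListener(can::MsgHeader(0x123), handleObjectPart1));
  listeners.push_back(driver.createMsgListener(can::MsgHeader(0x124), handleObjectPart2));

  // Keep the listener handles alive for as long as the callbacks should fire.
  while (true) { /* rest of the node */ }
}
```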
How many are you using?
Sorry, this question isn't clear. What are you referring to?
Or just create a FrameListener for every single frame. You could use the same callback for each group. The FrameDispatcher uses a std::unordered_map, which will be way faster than the FilteredFrameListener.
Would the key in this case be the CAN ID?
I will do some experimentation today and see if there's an improvement. Thanks for the help. If you need to close the issue, feel free, but it would be good to continue documenting for others if that's okay with you.
Sorry, this question isn't clear. What are you referring to?
How many instances of FilteredFrameListener are you using in your code?
Would the key in this case be the CAN ID?
Yes. It is the CAN ID including the high bits for RTR and extended CAN IDs.
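To illustrate what "including the high bits" means, here is a minimal, standalone sketch using the standard SocketCAN flag bits from linux/can.h. It only shows the idea of keeping standard, extended and RTR frames with the same numeric ID apart; the library's internal key may pack the bits differently.

```cpp
#include <linux/can.h>   // CAN_EFF_FLAG, CAN_RTR_FLAG, CAN_EFF_MASK, CAN_SFF_MASK
#include <cstdint>
#include <cstdio>

// Compose a dispatch key that distinguishes 0x123 (standard), 0x123 (extended)
// and 0x123 (RTR) by setting the SocketCAN flag bits above the 29-bit ID.
uint32_t make_key(uint32_t id, bool extended, bool rtr)
{
    uint32_t key = id & (extended ? CAN_EFF_MASK : CAN_SFF_MASK);
    if (extended) key |= CAN_EFF_FLAG;
    if (rtr)      key |= CAN_RTR_FLAG;
    return key;
}

int main()
{
    std::printf("0x%08x\n", make_key(0x123, false, false));  // 0x00000123
    std::printf("0x%08x\n", make_key(0x123, true,  false));  // 0x80000123
    std::printf("0x%08x\n", make_key(0x123, false, true));   // 0x40000123
    return 0;
}
```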
How many instances of FilteredFrameListener are you using in your code?
Only one. All the CAN traffic flows through that.
Only one. All the CAN traffic flows through that.
Then the performance should be okay.
Edited: Putting more than one question in.
I'm new to multi-process code and there's a lot of templating here. I thought I was decent at C++, but I'm being humbled by this library. I might need a little more spoon-feeding.
I looked over the BufferedReader approach. I don't see any callback assignment like the other FrameListener classes, so if I'm understanding correctly, the call to readUntil sits in your while-loop condition and writes to a pointer you supply until the bool returns false. So I would on occasion need to manually trigger this while loop to consume the buffered messages with a ros::Timer event? Does this make sense?
I've looked at the multiple FrameListener approaches, but I think that may end up being more complicated, as not all of the devices I need to ingest data from fall into the format I listed. It varies a bit. For each one of these I make a single callback that takes the can::Frame and does a switch/case on the ID. Not the best approach, but it's close enough to working correctly that the decoupling alone might be enough to keep me working correctly and allow me to have a uniform format for these nodes that I have to create. New content comes in and I have to change them, so the simpler the better.
I see this being used a lot but can't figure out its purpose. What does the class_loader do for this package?
I'm finding that any solution I attempt using the ThreadedSocketCanInterface performs significantly worse than the non-threaded version. I don't know how I would be able to use the non-threaded version with the BufferedReader.
My next step is to examine the build scheme for differences between the base install and mine.
I looked over the BufferedReader approach. I don't see any callback assignment like the other FrameListener classes, so if I'm understanding correctly, the call to readUntil sits in your while-loop condition and writes to a pointer you supply until the bool returns false.
Exactly.
So I would on occasion need to manually trigger this while loop to consume the buffered messages with a ros::Timer event? Does this make sense?
I would not use a timer, but just read all messages in a loop and process them. Actually, you might not even need BufferedReader in this case.
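For completeness, a minimal sketch of what I mean is below. The init()/listen()/readUntil() signatures are written from memory of the socketcan_interface headers, so treat them as assumptions and double-check reader.h; the device name and the 10 ms deadline are placeholders.

```cpp
#include <socketcan_interface/threading.h>
#include <socketcan_interface/socketcan.h>
#include <socketcan_interface/reader.h>
#include <boost/chrono.hpp>
#include <memory>

int main()
{
  // The background thread of the threaded interface only pushes frames into
  // the reader's queue, so it never blocks on your processing.
  auto driver = std::make_shared<can::ThreadedSocketCANInterface>();
  driver->init("can0", false);

  can::BufferedReader reader;
  reader.listen(driver);   // buffers every incoming frame, FIFO order preserved

  can::Frame frame;
  while (true)             // your node's main loop
  {
    const auto deadline = boost::chrono::high_resolution_clock::now()
                          + boost::chrono::milliseconds(10);
    // Drain everything that arrived since the last pass; readUntil() fills
    // `frame` and returns false once the buffer is empty and the deadline hit.
    while (reader.readUntil(&frame, deadline))
    {
      // decode `frame`, update the partially assembled ROS message,
      // publish once the group is complete
    }
  }
}
```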
I'm finding that any solution I attempt using the ThreadedSocketCanInterface performs significantly worse than the non-threaded version.
Again, there is not much of a difference, except for one additional thread.
I still believe that your user code does something which might not work well with that many threads (or the way they are used in ros_canopen).
Of course, there could be a bug in ThreadedInterface, but I doubt that for now.
It has been used for years in socketcan_bridge, and a similar technique is used in the other canopen packages.
I see this being used a lot but can't figure out its purpose. What does the class_loader do for this package?
It is part of the (optional) plugin system: http://wiki.ros.org/class_loader
I've looked at the multiple FrameListener approaches, but I think that may end up being more complicated, as not all of the devices I need to ingest data from fall into the format I listed. It varies a bit. For each one of these I make a single callback that takes the can::Frame and does a switch/case on the ID. Not the best approach, but it's close enough to working correctly that the decoupling alone might be enough to keep me working correctly and allow me to have a uniform format for these nodes that I have to create. New content comes in and I have to change them, so the simpler the better.
Since you have to do the switch/case anyway, I would suggest using only a single callback. The current listener interface supports single-ID matches only, and if you need other selection criteria (ranges, masks) you have to implement your own solution.
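If it helps, a hedged sketch of that single-callback pattern is below. The createMsgListener call with a std::function-style delegate is an assumption (older releases use a FrameDelegate type), the CAN IDs are placeholders, and std_msgs/Float32 just stands in for your assembled message.

```cpp
#include <socketcan_interface/threading.h>
#include <socketcan_interface/socketcan.h>
#include <ros/ros.h>
#include <std_msgs/Float32.h>  // placeholder for the real assembled message type
#include <functional>

class GroupAssembler
{
public:
  explicit GroupAssembler(ros::NodeHandle &nh)
    : pub_(nh.advertise<std_msgs::Float32>("vehicle_state", 10)) {}

  // Single callback for all traffic: keep it short and non-blocking.
  void handleFrame(const can::Frame &frame)
  {
    switch (frame.id)
    {
      case 0x123:  // hypothetical "part 1" of the group
        // decode bytes into the partially assembled message
        break;
      case 0x124:  // hypothetical "part 2", completes the group
        // decode, then publish the assembled message
        pub_.publish(msg_);
        break;
      default:
        break;     // everything else is dropped here, cheaply
    }
  }

private:
  ros::Publisher pub_;
  std_msgs::Float32 msg_;  // stand-in for the real message being assembled
};

int main(int argc, char **argv)
{
  ros::init(argc, argv, "can_group_assembler");
  ros::NodeHandle nh;
  GroupAssembler assembler(nh);

  can::ThreadedSocketCANInterface driver;
  driver.init("can0", false);

  // NOTE: delegate style written from memory -- check interface.h.
  can::FrameListenerConstSharedPtr listener = driver.createMsgListener(
      std::bind(&GroupAssembler::handleFrame, &assembler, std::placeholders::_1));

  ros::spin();
  return 0;
}
```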
Hello,
Ubuntu 18.04, ROS Melodic, installed via apt; verified we're up to date as of writing.
I am currently using the socketcan_interface to re-assemble object lists and other data groupings from a vehicle CAN network to pass to ROS as a prototyping platform for autonomous vehicle work. Some of this comes from vehicle CAN networks, some of this comes directly from ADAS sensors, but the struggle here is dropped messages at the ROS layer.
As an example, let's say the format is as follows with each message being unique:
They get assembled into an object list with all the parts and published to ROS.
Or we're taking various vehicle state signals and packaging them into a single rosmsg and publishing them on a topic. End result is we're collecting a specific number of CAN messages, converting them to engineering units from hex/binary, and assembling a ROS message from them. In many (nearly all) cases the receipt order and the completeness of the full package matter.
As a template for building these nodes, we started with the socketcan_to_topic node (and the corresponding .cpp), which uses the can::ThreadedSocketCANInterface. When running this, we would often see CAN messages being dropped.
In order to get this running in a more reliable manner, I had to convert to using the candump.cpp file as a template, where we are using can::SocketCanInterface (which I assume is a non-threaded version) and the driver.run() command, which forces the program to be shut down via Ctrl+C. This does capture nearly everything and runs significantly more reliably, but it does consume significantly more resources. I have looked high and low for people with the same issue, but I'm not seeing it. I've done some tests with this and I've found the following:
If anyone has an idea how I can reduce the dropped messages coming into this node, it would be very helpful. I don't believe any of our data conversion code is particularly high-consumption, so I would be very surprised if this was it. With the driver.run() method we do have an issue with throughput if we have more processes running on that same piece of hardware. As a result, I think our next step is to move the CAN content off of the main PC and onto some other smaller machines running ROS, but the throughput just doesn't seem high enough that this should be necessary.