ros-drivers / ros2_ouster_drivers

ROS2 Drivers for the Ouster OS-0, OS-1, and OS-2 Lidars
https://ouster.com/
Apache License 2.0
134 stars 79 forks source link

Adding Libtins support #74

Closed MattSYoung closed 3 years ago

MattSYoung commented 3 years ago

Hello

I'd like to add a feature to the driver that allows it to use the open-source Libtins library to read packets on an ethernet device. This is primarily so that I can use the driver to parse packets being replayed from a pcap file using tcpreplay. The changes I would like to propose making are:

The question I have is: I have seen in other threads that there is a desire to maintain as much compatibility as possible between this driver, and the official ROS1 equivalent: ouster-lidar/ouster_example. A recent PR (#239) adds some pcap parsing functions. Should I try and use these? Or am I free to add libtins support as I see fit?

As a note: the pcap functions introduces in PR (#239) use Tins to read directly from a pcap file, whereas I want to read from an ethernet device (and have tcpreplay play the file through the same device).

SteveMacenski commented 3 years ago

All changes to the client must be made to the ouster_example first if you want to change it to use something different than what they use. The changes in ouster_example are often made without warning, so we need to keep total dependency on them so that every time there are changes we're not incurring large amounts of maintenance time to get back in line with it after our custom changes.

If you subclass the sensor and connect via this library without the use of ouster client at all, then I suppose that works too (or use parts of it as required). But we just don't want any physical change to those files that we have to track and cherry pick over when its updated.

If these other changes are in compliance with that client code, than I'd be OK to have a PR with this if it also included some readme instructions for how to use it (e.g. basic record / replay example commands)

MattSYoung commented 3 years ago

But we just don't want any physical change to those files that we have to track and cherry pick over when its updated.

Quick point of clarification - so the any changes to the ouster_example/ouster_client folder must be made in the original repo first, that makes sense. Does this also extend to any files or folders outside of this one?

If you subclass the sensor and connect via this library without the use of ouster client at all, then I suppose that works too (or use parts of it as required)

Creating a new child class of sensor::SensorInterface seems to be the design intent of this driver, so that you can swap in and out different versions of sensor::Sensor whilst maintaining the same interface with OusterDriver. So I believe that is the approach I should follow to add this functionality.

My intention is that sensor::SensorTins will not call the client at all. But because some function calls between OusterDriver and sensor::SensorInterface use the information from the client, I will need to fudge this somewhat using metadata read from file.

If these other changes are in compliance with that client code, than I'd be OK to have a PR with this...

This is why I ask about ouster_example/ouster_pcap. I don't know if you also want to maintain compatibility with this folder. If so, I will have to use the Tins functions added there.

SteveMacenski commented 3 years ago

Does this also extend to any files or folders outside of this one?

No, just things in the client/ directory of this project -- those are all copy-pastes of the ouster example client code. Basically, anything that ouster uses for core communication / parsing / etc we want to be exactly the same so there's no difference in the gathering of information from ROS1 to ROS2.

My intention is that sensor::SensorTins will not call the client at all. But because some function calls between OusterDriver and sensor::SensorInterface use the information from the client

That make sense, I'd support that. I do wonder if it doesn't make sense for another piece of software (non-ROS) to do this for replay, but I won't argue with more features :wink:. Especially ones that make my over-designing useful.

MattSYoung commented 3 years ago

No, just things in the client/ directory of this project

Perfect! That makes things straight-forward.

I do wonder if it doesn't make sense for another piece of software (non-ROS) to do this for replay, but I won't argue with more features

This is kind of what I'm aiming for. The way I intend SensorTins to be used is that the user can capture raw ouster packets in a pcap file, and then later replay that pcap using tcpreplay. My new sensor interface then listens to the ethernet port the packets are being replayed through, rather than expecting a real LiDAR to be present and failing when it doesn't find one.

Funnily enough, the reason I want to add this feature is precisely so I don't have the same problem of trying to maintain compatibility between my version and an upstream version of this driver.

MattSYoung commented 3 years ago

By the way, it looks like the ouster_examples repo was updated over the weekend for FW version 2.1 (PR 259). Would it be preferrable to bring these changes across in a separate PR, merged before the one I am proposing?

SteveMacenski commented 3 years ago

A separate PR, yes, thanks!

Totally understand. ROS2 bag is mostly ready, but for 3D lidars of high volume and large data its still a little flaky at times, it can be beneficial to use another more robust method and use this driver as the "gateway" into the ros ecosystem

MattSYoung commented 3 years ago

@SteveMacenski can you tell me what set of changes to the ouster_example repo was last ported to this one? It looks like there's at least two PRs worth of changes that haven't been ported. For example, I don't see the changes made in ouster_example/#239 in the client/ folder here.

SteveMacenski commented 3 years ago

None since the 2.0 conversion

MattSYoung commented 3 years ago

Right. I'll create a new issue ticket with a list of things to port, since some additions, like a sensor_config struct to types.h seem to already have an equivalent in the ros2_ouster driver (I'm thinking of the ros2_ouster::Metadata class).

MattSYoung commented 3 years ago

@SteveMacenski, I've finished the core changes to the ros2_ouster driver that I need to in order to implement a Tins-based driver, which you can find in this branch: https://github.com/MattSYoung/ros2_ouster_drivers/tree/ros2-libtins.

However I have yet to finish and clean up the implementation of the driver types. I've read the "Additional Lidar Units" section of the readme, and I think I understand most of your intent. But when it comes to choosing between different driver types, do you envision this to happen by choosing between options in the main file? Or by having one set of main and launch files for each driver type?

SteveMacenski commented 3 years ago

Different main / launch files could do. Also a parameterization might do, but I think different main / launch files would be appropriate.

MattSYoung commented 3 years ago

I'm happy to have separate main and launch files. But what is best practice for the config files? It would be less confusing to have one parameter file if possible.

SteveMacenski commented 3 years ago

that would be fine if none of the parameters change between them to use the same one for both

MattSYoung commented 3 years ago

Hmmm... and what about a parameter that us used for one but not the other? At the moment, the tins driver needs to know the ethernet device, but the default driver does not. I'm currently using the OusterDriver class for both Sensor and SensorTins, so I've had to add a get_parameter call for the ethernet device to OusterDriver::onConfigure even though this parameter is never used when the driver is run with Sensor. Is this acceptable?

SteveMacenski commented 3 years ago

Does the driver not also get the IP address of the device? I believe it does. Why not use that field? Also, I thought the purpose of this was to replay data, why are you needing to connect to a device at all?

This should be done in the Sensor classes implementation, not the larger driver itself if it is specific to the tins method

MattSYoung commented 3 years ago

Does the driver not also get the IP address of the device? I believe it does. Why not use that field?

By "ethernet device" I mean the ethernet interface - e.g. "eth0" or "eno1", I'm not referring to the IPv4 address. To my current knowledge, the Tins sniffer needs the ethernet interface name, and you can't initialize it with just the IPv4 address (see sniffer.h in case I missed something). I also don't know if it's possible to obtain the ethernet interface name knowing only it's IPv4 address. If either of those things are possible, please let me know! It would be nice to remove that parameter altogether.

I thought the purpose of this was to replay data, why are you needing to connect to a device at all?

Correct, the way SensorTins works is that it creates a Tins sniffer object, which sits on the ethernet interface you are playing live or replayed data through, and grabs packets that match the packet size, IPv4 destination, and port number. See the function I use for this - sniffOnePacket for more clarification. I've also added a section to the bottom of the readme explaining how the Tins driver is intended to be used.

This should be done in the Sensor classes implementation, not the larger driver itself if it is specific to the tins method

I think I've done that correctly...

I think this is what you meant in the "Additional Lidar Units" section of the readme, but please correct me if I've misunderstood it.

SteveMacenski commented 3 years ago

Got it, that makes sense. Yes, the parameter should be gotten in the SensorTins class, right now the sensor class doesn't appear to get a handle on the node but you could update the configuration function to also include a node shared pointer to get those params.

MattSYoung commented 3 years ago

Do you mean somehow pass a node handle through the OusterDriver class into the SensorTins class? So that SensorTins can ask for parameters that Sensor doesn't care about?

SteveMacenski commented 3 years ago

exactly. on on_configure you could give it the shared pointer of the node shared_from_this() and it can use then the node to get any parameters it likes that are specific to that specific sensor modality

MattSYoung commented 3 years ago

That would be very handy... do you happen to have any code examples of that working? I've tried similar things in the past, but almost always run into issues with ROS2 node executors complaining that "Node has already been added to an executor."

SteveMacenski commented 3 years ago

Are you trying to spin the now? You should be able to node->get_parameter() without an issue. Its only a problem if you're trying to spin it.

https://github.com/ros-planning/navigation2/blob/main/nav2_waypoint_follower/plugins/wait_at_waypoint.cpp#L34-L46

In this case, we used a weak pointer, but as long as you don't hold onto the shared pointer and you just use it to get parameters and then let the node go out of scope (e.g. do not store it as a class member) then you're good. We have full control here so that's OK. In the pluginlib context we can't control what users do, so we make sure we're safer with a weak pointer instead

MattSYoung commented 3 years ago

Are you trying to spin the now? You should be able to node->get_parameter() without an issue. Its only a problem if you're trying to spin it.

Not directly, but I've run into issues in the past where code I call is indirectly spinning or otherwise engaging with the node handle. Most of it seems to happen as soon as the ROS2 TfListener or TfBuffer class gets involved, so I hope we won't get that problem here.

on on_configure you could give it the shared pointer of the node shared_from_this() and it can use then the node to get any parameters it likes that are specific to that specific sensor modality

Ok, so I've looked into this and I think that for the most part this change would just require adding a const rclcpp_lifecycle::LifecycleNode::SharedPtr node, parameter to the configure function for the Sensor / SensorTins / SensorInterface classes. Then it's a matter of shuffling around parameter declaration and getting.

My concern with shifting a parameter getter to SensorTins::configure is that I would still have to declare it in the OusterDriver constructor, and that this would increase the number of places ROS params are declared or retrieved, increasing code complexity.

SteveMacenski commented 3 years ago

Yeah that means sense to me, no you can both declare and get the parameter in the SensorTins, there's no need to declare it in any place in particular. Inside of your sensor implementation is fine.

MattSYoung commented 3 years ago

no you can both declare and get the parameter in the SensorTins, there's no need to declare it in any place in particular.

I missed this earlier, but the reset function calls configure and you can't re-declare a parameter. So without introducing a flag or other toggle they have to be declared in a function that only runs once. I will leave parameter declaration for now in the constructor of OusterDriver.

I've moved the parameter getting to the sensor::onConfigure implementation as requested. You can see the changes in this commit: commit/39b9375.... Let me know what you think.

I also still have one bug to find and squash, so there are a few changes still in the works.

SteveMacenski commented 3 years ago

If you use has_parameter to check if its been declared yet, then you can only declare it if it wasn't prior, that way you can call configure again. This is a common workflow for lifecycle nodes. In fact in Nav2 we have a utility for it declare_if_not_declared() which does a

if (!has_parameter(XYZ)) {
  declare_parameter(XYZ);
}
get_parameter(XYZ);
MattSYoung commented 3 years ago

That sounds like quite a useful utility function, and would be worth using it here. Do you have a link to that exact function? It doesn't seem to come up my search result.

SteveMacenski commented 3 years ago

https://github.com/ros-planning/navigation2/blob/main/nav2_util/include/nav2_util/node_utils.hpp#L93-L125

MattSYoung commented 3 years ago

Thanks! Ah, the name of the function was slightly different and that's why I couldn't find it.

MattSYoung commented 3 years ago

I've shifted the sensor parameter setting/getting completely to the SensorX::onConfigure functions. I did this by addding your utility function basically as-is to a new utility header file - ros2_utils.hpp as a new utility file seemed like the best fit.

I've also discovered that the bug I've been chasing is related to (if not completely caused by) the throughput of the ethernet device. TL;DR if you have a standard 100 Mbit/s ethernet connection like I do, then it appears that tcpreplay can't regurgitate the LiDAR packets fast enough to hit the target cloud publication rate of 10 Hz. I'm using an OS0-128 for reference, which generates ~124 Mbit/s in 1024x10 mode and ~248 Mbit/s in 2048x10 mode.

While I search for an elegant solution to this problem, would you have an issue with me expanding the scope of this issue, and adding some more debugging code to the OusterDriver class? I'd like to add code to check the incoming packet rate against what is required for the configured mode, and warn the user if the packet rate is not fast enough.

SteveMacenski commented 3 years ago

given that you adapted that from Nav2, you should probably keep the copyright headers from Nav2.

I think it would be fine to add some more logging, but we should probably do that in another PR so that we keep these issues separate.

MattSYoung commented 3 years ago

given that you adapted that from Nav2, you should probably keep the copyright headers from Nav2.

Fair, fixed.

I think it would be fine to add some more logging, but we should probably do that in another PR so that we keep these issues separate.

Also fair, I guess because this ethernet throughput issue is going to affect every version of the driver it should be a separate PR. I'll start a PR for the current work then (edit: created PR #77) and create a new issue for the throughput problem.