ros-perception / laser_filters

Assorted filters designed to operate on 2D planar laser scanners, which use the sensor_msgs/LaserScan type.
BSD 3-Clause "New" or "Revised" License
174 stars 205 forks source link

Feature: ros2 port for all filters #189

Closed berend-kupers closed 3 months ago

berend-kupers commented 10 months ago

Hello,

I have added ROS2 support for the following filters, which did have support in noetic-devel:

Key changes

Benefits:

Functionality Status:

Note: I want to mention two issues that I've noticed in the already ported laser filters, which would be best to fix in the filters package.

  1. In box_filter and speckle_filter a (lifecycle-) node is created to create a tf listener (box_filter) and to create a on_set_parameters_callback (speckle_filter). When you start a filter chain and rename the node via launch files, it overrides the name for all node instances. So you get the error:

Publisher already registered for provided node name.

  1. Dynamically setting a parameter doesn´t seem to work, even though the speckles_filter does have an on_set_parameters_callback implemented. If I use ros2 param set ..., I get the error:

Setting parameter failed: parameter 'filter2.params.filter_type' cannot be set because it is read-only

I think these parameters are set to read-only in the filters package.

Please let me know if you have any feedback. I would be happy to make any necessary changes.

Kind regards,

Berend Kupers Lowpad

bjsowa commented 4 months ago

What's the status of this PR? @jonbinney Could you please merge this? I'm interested in using the polygon filter but I'd rather avoid copying the whole implementation to my project.

jonbinney commented 4 months ago

@bjsowa sorry i haven't had time to review and merge this! Are able to try out this branch and give some feedback?

bjsowa commented 4 months ago

I tested the polygon_filter and it seems to work after implementing the fixes from my review.

However, I found a general flaw in how some of the filters are implemented. Only the logging and parameter interfaces are passed from the main node to the filters. Thus, any filter that needs to add some Subscriptions or Services, is implemented as LifecycleNode. The problem is, the filter nodes are not spun by any executor, thus the subscription or services callbacks are not called.

The result of it is that, for example, the base_footprint_exclude subscription, as well as, parameter and lifecycle related services in laser_scan_polygon_filter do not work.

IMO, all filters should be implemented as a Node (perhaps rclcpp::Node instead of LifecycleNode), either spawning their own spinning thread like the tf2 transform listener or created as the main node's sub-node

berend-kupers commented 4 months ago

@bjsowa I implemented your comments.

As for your comments about the lifecycle node. I tried to stay close to the implementation of the other filters, e.g. box filter and indeed noticed that for example the parameter reconfigure doesn´t work. I thought it would be out of scope for this PR, but agree it needs to be fixed.

In our code, I solved it by forking the filters package and passing a pointer to the node instead of just the param/logging interfaces. That works for us, but would introduce a breaking change. I can try to get one of your suggestions working, but as I said, I think that should be a separate PR

bjsowa commented 4 months ago

As for your comments about the lifecycle node. I tried to stay close to the implementation of the other filters, e.g. box filter and indeed noticed that for example the parameter reconfigure doesn´t work. I thought it would be out of scope for this PR, but agree it needs to be fixed.

Agree, it can be addressed in future PRs. @jonbinney I have no more feedback.

jonbinney commented 4 months ago

Thanks for taking a look @bjsowa !

I've got some time to review this PR this week. It looks like tests are failing on some EOL ROS2 distros, so i've updated the CI here: https://github.com/ros-perception/laser_filters/pull/196

jonbinney commented 4 months ago

@berend-kupers could you rebase onto current head? Then the tests will run on the updated list of distros. Thanks!

jonbinney commented 4 months ago

(or merge ros2 branch into this one, whichever you prefer)

berend-kupers commented 4 months ago

@jonbinney Done

jonbinney commented 4 months ago

Looks like the speckle filter test is failing on CI, and I have the same issue when I run locally. It looks like an error during rclcpp shutdown - do you know why this is? Maybe because of the multiple nodes with the same name you mentioned in your Note 1 in the PR description?

jonbinney commented 3 months ago

I've been looking into what changes would be needed to the filters library to avoid needing to create extra nodes. I think a few things:

Add protected member function to FilterBase to get the NodeLoggingInterface. This would enable logging in the filter. (this should be an easy fix)

Either of the following changes for parameters:

  1. Add protected member function in FilterBase for filters get the NodeParametersInterface
  2. Add ability in FilterBase for filters to mark parameters as writeable and add protected member function in FilterBase to add parameter setting callbacks. The first option is easier to implement, but is a bit weird because I don't think there is a way to create a NodeParametersInterface thatalways applies a given prefix, and so the filters would each see the parameters for all the other filters. Their callback functions would also be called anytime the parameters are changed for any filter.

Add a transform listener somewhere... and give filters access to it.

OR we could say forget all of this careful passing of only the interfaces needed by the filters, and provide them a getter() for the entire Node object. This seems quite ugly, although i guess it isn't as ugly as using a singleton node in ROS1.

OR we could go crazy and make each filter its own Component which can all then be loaded dynamically or something.....

In any case, I'd like to get most of this PR merged before any of that stuff happens. These filters are quite useful for people, even without some of their features. @berend-kupers what do you think about removing the dynamic reconfiguring of these filters for now, as well as removing the ability to have two different frames in BoxFilter? Then we can review and merge this PR and keep on with the longer discussion of how to add the extra functionality to the filters package?

jonbinney commented 3 months ago

Still some tests failing - i should have time to debug them later this week to see what is going on.

jonbinney commented 3 months ago

ok i think i know why the tests are failing intermittently. In test_speckle_filter.test.py the subscribers are created, then the publishers, then the raw laser scan is published, then we wait for the filtered scan. But in the background, asynchronous network stuff is happening to actually hook up the subscribers. So even though the subscribers are created "first" sometimes they don't actually connect until after the filtered laser scan has already come back. In that case we wait forever.

I added

        rate = node.create_rate(1)
        rate.sleep()

after node.start_subscribers() and before node.publish_laser_scan(), and after that i was able to run the test a dozen times with no failures. This is too hacky though..... ideas on a better fix?

berend-kupers commented 3 months ago

Shouldn't making the publisher transient local do the trick?

Otherwise, I could either

jonbinney commented 3 months ago

By "transient local" do you mean make it a local variable? I'm not sure how that would fix this. I like the idea of publishing the scan at a fixed rate though, that sounds simple and robust to failures.

berend-kupers commented 3 months ago

I meant setting the quality of service setting durability to "transient local" for the publisher. Then the publisher should persist samples for "late-joining" subscribers.

But I'm fine with publishing the scan at a fixed rate as well

jonbinney commented 3 months ago

I meant setting the quality of service setting durability to "transient local" for the publisher. Then the publisher should persist samples for "late-joining" subscribers.

Ah, perfect! I'm not familiar with the ROS2 quality of service settings yet, so I just learned something :-)

jonbinney commented 3 months ago

Interesting - a couple tests still failing but I'm having a harder time reproducing it locally. Debugging now.....

jonbinney commented 3 months ago

Looks like you changed the QoS on the publisher created in the python test node - that's not the one that's causing the problem. We're missing the return message published by the filter chain and received by the python test node. I tried changing the QoS in the subscriber in the python node, but it failed (something about the underlying middleware or the options used in the filter chain).

The other fix you suggested (publishing at a fixed rate until we get a reply) does work. I've implemented that and sent a PR to your PR branch - could you review and merge that? :-)

If you're curious, here is the command I used to run the tests repeatedly to reproduce the intermittent failure:

colcon test --retest-until-fail=100 --packages-select laser_filters --event-handlers=console_cohesion+ --ctest-args --output-on-failure; colcon test-result --all
jonbinney commented 3 months ago

Awesome, tests pass! I'll do a code review tomorrow to make sure nothing major is wrong. Since these filters weren't in ros2 at all before, I don't plan on nit-picking about small changes - those can be fixed in smaller PRs later.

jonbinney commented 3 months ago

Thanks @berend-kupers, and also @bjsowa for reviewing. Sorry it took me so long to get to!