v-kiniv / rws

WebSocket gateway for ROS2 topics and services
Apache License 2.0
10 stars 7 forks source link

Socket throttle_rate support #42

Closed MoffKalast closed 1 month ago

MoffKalast commented 1 month ago

So I've noticed that rws doesn't support the throttle_rate param from roslibjs, which can be kind of key when trying to send over large high rate topics like pointclouds and laserscans without clogging websockets over slow wifi, plus it should result in some minor CPU usage reduction in some cases. I think this is the last feature that's missing in rws compared to rosbridge.

I'm not all that well versed in cpp, but I've done this exploratory implementation to see what it would take to get it set up. It does seem to work from what I've tested it, but it likely needs some adjustments (or should be implemented some other way entirely), test coverage and a check to see what happpens in foxglove protocol mode.

v-kiniv commented 1 month ago

Looks good to me, I tested throttle feature with Vizanti and the basic functionality in Foxglove(there's no throttle option), the throttle works and nothing seems to be broken. Regarding tests, I set up CTest with VSCode(CTest CLI is horrible) in Ubuntu VM, before I only had working setup on a Mac, but now my ros2 is broken there. While running the tests on Linux I noticed that one of the tests failed with a ~20-30% chance. I fixed the issue in https://github.com/v-kiniv/rws/pull/43. Outside of the tests the issue never occurred though. I also fixed CMake linter tests and compiler warnings, so you may want to pull the changes before writing tests for the throttle feature. If you don't plan to write tests, we can merge it now and I'll add them later.

MoffKalast commented 1 month ago

Hmm well I've attempted to add a test that sends multiple messages and then expects that to be throttled to one, but ctest seems to silently hang when it gets to the subscription_callback part and I can't figure out why. Anyhow I'm not sure if it's all that crucial that this specific thing is tested, I just wanted to get a bit familiar with the setup. If you have any quick fixes for that then feel free, otherwise it's fine by me to remove it and merge.

v-kiniv commented 1 month ago

I just wanted to get a bit familiar with the setup

Ok, so I'll be more verbose in explaining what needs to be changed and why.

The test hangs because you send an empty serialised message(basically invalid) to a ClientHandler, which tries to decode it and crash. Empty serialised message works ok in Connector tests, because there the messages are not processed in any way, pointers just passed around, and that's what the Connector tests check, if pointers are valid and match, it doesn't care about the content of the message.

Now what needs to be changed:

I've tested the changes on Jazzy+Vizanti and Foxglove. Please test and let me know before I merge this to Jazzy and backport to other versions.

MoffKalast commented 1 month ago

Re-tested the changes, seems to work well :+1:

The test hangs because you send an empty serialised message(basically invalid) to a ClientHandler, which tries to decode it and crash.

Ahh alright, I would've thought that the test runner might propagate any crashes up, but I guess that doesn't really happen, haha.

Move the throttle logic to Connector, because it is responsible for routing messages to clients

Yeah agreed, the more overhead we can remove from parsing messages that won't be sent anyway the better.

In theory we could just skip even receiving those entirely by unsubscribing for the duration of the throttle if it's larger than some notable fraction of a second and save on CPU usage, but I doubt it'll make much of a difference unless it's like a 20hz 15k point laserscan throttled to 500ms and similar, it would be tricky to implement reliably, there is some overhead in doing it.... and it might trigger multicasts on each subscribe with some DDS setups which would be rather counterproductive if this was all to reduce network load.

Replace std::chrono with rclcpp::Time and rclcpp::Duration

Well the reason I went with chrono there is the invariance to the ROS clock, since unless it's a simulated network the sockets will always run in realtime, and when reviewing bags at say 20x realtime rate or running Gazebo at 5x speed if the throttling is dependent on that it'll cause issues. Though as you say, the invariance becomes a problem for running tests.

Though I think unless the bag is specifically set up to publish --clock and RWS is launched later it should still work ok. I've briefly tested with a sped up bag now and it seems to throttle it alright, but I don't have any Gazebo setups on Jazzy yet (nor have I got it installed on a machine that could run it faster than realtime lol), so we can merge it for now and see if it's actually a problem once we can test for it.

waiting/sleeping in tests is a bad idea

Yeah I attempted doing that briefly which also crashed in the background or ctest timed out or something :stuck_out_tongue:

v-kiniv commented 1 month ago

so we can merge it for now and see if it's actually a problem once we can test for it.

Yeah, let's merge and test, if it'll cause problems, we can replace ROS time.

Thanks for the contribution!