Closed Michael-Equi closed 4 years ago
@Michael-Equi can you tell me what Lidar you are using and what driver?
I am facing the same issue with my Slamtec RPLidar A3 uising their ros driver.
The output on the /scan topic shows:
header:
seq: 2543
stamp:
secs: 1597007158
nsecs: 82151873
frame_id: "rplidar"
angle_min: -3.12413907051
angle_max: 3.14159274101
angle_increment: 0.00435422640294
time_increment: 5.05294083268e-05
scan_time: 0.0727118179202
range_min: 0.15000000596
range_max: 25.0
Per the calculations in the Slam Toolbox Karto.h it should be sending 1439 readings per cycle:
GetMaximumAngle(): 3.14159
GetMinimumAngle(): -3.12414
(GetMaximumAngle() - GetMinimumAngle()): 6.26573
GetAngularResolution(): 0.00435423
(GetMaximumAngle() - GetMinimumAngle()) / GetAngularResolution(): 1439
GetIs360Laser(): 1
residual: 0
math::Round((GetMaximumAngle() - GetMinimumAngle()) / GetAngularResolution()) + residual: 1439
but the /scan topic always has 1440 readings on it, not 1439.
This didn't happen before when the code to detect a 360 degree lidar was not setting residual
to 0, but I guess that was a bug.
I'm digging into the RPLidar driver now to see why the published angle increment, maximum angle, and minimum angle does not also result in a number of readings that Slam Toolbox expects.
To recap
Per the published data from the Lidar driver:
3.14159274101 - -3.12413907051 = 6.265731812
Total angular area of coverage, but:
1440 * 0.00435422640294 = 6.27008602
while if it published the number of readings that Slam Toolbox expects, it would agree with the published scan area:
1439 * 0.00435422640294 = 6.265731794
A simple patch to the RPLidar driver could hack this out, but I'm taking my time to understand it to see if there are other problems with it rather than just hacking this. Meanwhile, just using the old commit of Slam Toolbox is a temporary workaround.
@chrisl8 I'm currently just using the old commit. But yes I get that problem both with the lidar in simulation as described above and with the RPLidar S1 using this ROS2 port of the driver.
@Michael-Equi I would be interested to know if you have the same results with this ROS2 port: https://github.com/ManuelMeraz-Forks/rplidar_ros
He has made some changes to the "angle compensation" code, and I wonder if it fixes this or not.
I have not been able to test it yet, as I do not have a ROS2 environment set up. I'm still using Melodic and Noetic.
I think as we learned, the rp lidar drivers are problematic. Any action items here?
@SteveMacenski Could you point me to the ROS or whatever documentation explains what the "residual" reading is and why "360" lidar units should not have them?
As it stands, if the RPLIDAR was treated just like the Hokuyo, I think it would work fine, but it is treated differently, and I'm having trouble understanding why.
Understanding how and why it is treated differently will help me to follow up with driver update PRs for the RPLIDAR driver.
I am going to have to update code somewhere to make the RPLIDAR and Slam Toolbox happy together. My guess is that Slamtec will ignore my PR, but the ROS2 driver maintainers (who are considering, but not yet decided on, taking over that driver from Slamtec for the ROS2 package) will likely accept a PR if I can explain it properly.
Thank you!
None, that was a user submitted patch for supporting 360 lidars. See the PR.
I think as we learned, the rp lidar drivers are problematic. Any action items here?
I disagree. What we learned was:
There was a suggestion that the sweep direction of the RPLIDAR driver output was wrong, but it is correct.
None, that was a user submitted patch for supporting 360 lidars. See the PR.
The RPLIDAR angle increment calculation is the same as that used by the Hokuyo, but the Hokuyo is not caught by this code,
instead 1
is added to the expected number of readings from the Hokuyo, and any other lidar other than the RPLIDAR and presumably the Velodyne.
This special handling of "360" lidar units was introduced in response to Issue #141
However, it initially did not affect the RPLIDAR output. Later PR #261 was made to cause it to apply to the RPLIDAR as well.
My suggestion is to revert this commit.
It apparently worked for one user, but it breaks Slam Toolbox entirely for at least two of us.
If exposing some parameter to turn this on/off would be preferable, I can make a PR for that instead.
but the Hokuyo is not caught by this code
If its the exact same calculation in both drivers, then there should be no issue with either or the same issue with both. Why are they different?
My suggestion is to revert this commit.
I'm not going to revert a commit to break one sensor in favor of another. We should find the issue w.r.t. the RP-lidar users whether its here or there and resolve that so that both work. If anything, I prefer support for the Hokuyo over the RP-lidar as a professional unit.
This is honestly a bit frustrating:
Remember, I asked:
Could you point me to the ROS or whatever documentation explains what the "residual" reading is and why "360" lidar units should not have them?
and you responded:
None, that was a user submitted patch for supporting 360 lidars. See the PR.
The code literally adds 1
to the reading count for some lidar input and does not add 1
for others, entirely based on the field of view (FOV) of the lidar, and there is no justification given for this distinction at all, other than somebody once with a Velodyne said it didn't work for him:
In the Velodyne issue it seemed like you were faulting the Velodyne lidar driver:
I mean regardless of any logic around this package, that package, any package, we can agree that 2*pi / 0.007 != 897, which means the velodyne drivers are wrong. Flooring a partial measurement (0.59!! not even 0.01 or something negligible) is poor engineering and clearly at minimum a partial cause of this issue, as well as messing up its own internal operations.
However, in the end, you added a change to accommodate it anyway.
That change, once it finally hit the RPLIDAR, has broken it.
I'm not sure why the RPLIDAR now has the burden of proof to not be treated differently rather than the Velodyne.
I'm not asking for special treatment for the RPLIDAR, rather, I'm asking for it to be treated the same as every other lidar, instead of falling into this special case originally set up for the Velodyne.
If anything, I prefer support for the Hokuyo over the RP-lidar as a professional unit.
YES! That is what I am suggesting: Just treat the RPLIDAR exactly the same as you do the Hokuyo instead of making it a special case that is treated differently because of its field of view (FOV).
I have no problem finding a solution that will work for everyone, but you have to understand my point of view here. I've been working with SICK, Hokuyo, and other professional lidars with this software package for years without any of these issues. All the sudden these RP-lidars show up and I'm inundated with users saying it doesn't work and we dig through and the more we dig the more issues we find the RP-lidar driver stack. Not following REP-103, frames not following REP-105, weird parameters no one seems to understand (angle compensation), etc. When no one else says they have problems, its really easy to point at RP-lidar and say "that's the problem". Especially when users say they do some weird hacks to make it follow these REPs and then it magically works.
I wouldn't disagree that there's a potential that when I blame the RP-lidar / associated drivers, there might be some false positives where there really is an issue here and I attribute to the RP-lidar because historically its had so many issues causing really subtle defects. I don't believe I've ever come back and said that I wouldn't merge in a solution for some of these issues as long as they don't break anything else. I've probably just said that its not something I'm prepared as a maintainer to take a serious effort to debug myself at this time.
Even the fact that I regularly respond within hours to comments and get PRs merged is abnormal for most maintainers, so please give me some slack. It's not my job necessarily to fix your issues, but I really do spend more time than I have to in helping make it happen and giving you my professional expertise with breadcrumbs when I can't sit down and dive into it myself. I want you to succeed and I'm trying to help you do so in the way that I can. I have about 50 open source packages in ROS and ROS2 that I am the only or primary maintainer on, if we count distributions separately, its well over 250 version of things I have to be aware of at all times. That is a massive time burden and I have to pick and choose my battles. And I also need to keep a day job going on top of that which requires developing new things, not just maintaining existing ones.
So please, its us against the problem, don't make it confrontational. I want RP-lidar users to succeed as much as anyone else. However, we've found some weird things that they're not fixing in their drivers and until that's resolved, issues that sound like "huh, that's weird why hasn't anyone else reported this yet" get some additional scrutiny from me & information gathering (which if you look above, is exactly what I'm doing here, trying to understand the scope of the problem). If it doesn't sound like a driver issue or multiple vendors of people report it, then that really gets my attention because then its clearly a bug which its on me to help resolve.
other than somebody once with a Velodyne said it didn't work for him
The velodyne drivers were made by people I know and deeply respect. Not only that, its gotten a really nice face-lift to make it functionally real-time and being used in Autoware for autonomous driving. If there's a single ROS driver in the world I trust to be accurate, its that one. So you can forgive me for getting a report that the velodyne was giving users problems as a 'word of god' case. Especially with a user with a really rational and reasonable solution.
I do treat 360 as different because they genuinely are. If you have a 270 degree lidar at 1 degree resolution, you clearly have 270 measurements, right? Wrong. Its 271 because you have the 0 and 270 inclusive angles. That's why the original library has a +1
for all laser scanners that the Velodyne user very succinctly found. If you have a 360 lidar, do you have both a 0 degree and 360th degree scan point? No, because that would be scan point 1 for the next rotation, so you don't have a +1
. That makes intuitive sense from just a geometry standpoint and also fixed one particular vendor of 3D lidars. I also tested this on my Ouster 3D lidar and it worked smashingly so that gave me 2 professional 360 lidar datapoints that this was the correct move.
Merging that PR wasn't random or poorly thought out. What would be though is reverting something that is working for a majority of users because a single manufacturer might be having some issues with properly defining their laser scan messages, which would definitely cause this exact error. I don't know (yet) if that is the reason we see this error, but its something worth exploring at length before I go and beg / borrow / steal a bunch of 360 lidars to do some testing myself.
Back on topic...
1439 * 0.00435422640294 = 6.265731794
You yourself agree that the correct value is 1439 and your driver is producing 1440, assuming that the minimum, maximum, and angle increment can be trusted. There's 2 ways of them approaching this:
This library messes up the calculation, which you agree is not the case.
Both would have you chase down two different ways, but I think ultimately what you may find is that the RP lidar is returning the 360 point as a mirror of the 0 point or something similar so there's an extra value that shouldn't be there. Or I suppose that the angle increment is bogus. Both would actually have non-trivial outcomes in the map & could potentially have some issues if the optimizer doesn't like the fact that there's some really small orientation perpetration.
https://github.com/BytesRobotics/rplidar_ros/blob/ros2/src/rplidar_node.cpp#L325 for right here, the total number could easily just be wrong because you're hardcoding it assuming the laser does what you think it does rather than checking for the number of returns. I'm not sure how that might manifest in terms of the last value if that's not valid. Pull from random memory? 0? same as index 0? I'd be curious if you even noticed a difference if you changed the MAX_SAMPLE_COUNT to 1439.
@Michael-Equi for simulation, you can control the number of samples and ranges and such - https://github.com/ros-planning/navigation2/blob/master/nav2_bringup/bringup/worlds/waffle.model#L182-L207. That's an example of how we do this for the nav2_system_tests. Are you only seeing this on your simulation attempts or also in your hardware?
For your sim, I see there's supposed to be 667 samples, where did the 359 and 360 numbers come from? Is that from an old attempt and the configuration you sent vs that error are from different configuration attempts? For the numbers you list 6.283185307/0.009424778
(where 0.009 is 0.54 deg in radians) = 666.66 samples. You should make sure that these are more exact. E.g. 667 would be 0.5397301
rather than 0.54. Small variations in radians can make a big difference with small ranges.
@SteveMacenski I apologise for my tone of frustration. There is no excuse for that.
I had a little whiplash when several issues were sorted out, everything was working smoothly, and then that one PR completely broke support for my hardware.
I am extremely grateful for the work you have and do put into this. Extremely! I could go on about how wonderful Slam Toolbox is and how amazing your support of it is all day. I've told several people that I think ROS and open source robotics are about to experience an explosion in usability and popularity due largley to exactly your work here.
You have provided a wealth of information and clarification in your responses though, for which I am grateful. I will reply on the topic at hand after I've had time to digest all of this, but I didn't want to leave any doubt that I'm deeply grateful for the work you put into this, and that I am aware that you respond to issues possibly faster than anyone on GitHub!
I read through all of this a few times, and here is my thought process, and suggestions. Please correct any assumptions I've made about ROS and the expected lidar output format.
I hope you will forgive me for not thinking your felt the Velodyne was a "'word of god' case" when your actual words to them included "the velodyne drivers are wrong" and " is poor engineering".
It helps to know the Velodyne driver is considered a standard, as I have been looking for standards to reference.
I will take this as my answer to why the Velodyne case is considered a standard setting issue.
This next quote is exactly what I was looking for originally, I'm not sure why I wasn't able to ask the right question before to get this answer. I often fail to ask the correct question. This is what I was asking for, thank you:
I do treat 360 as different because they genuinely are. If you have a 270 degree lidar at 1 degree resolution, you clearly have 270 measurements, right? Wrong. Its 271 because you have the 0 and 270 inclusive angles. That's why the original library has a +1 for all laser scanners that the Velodyne user very succinctly found. If you have a 360 lidar, do you have both a 0 degree and 360th degree scan point? No, because that would be scan point 1 for the next rotation, so you don't have a +1. That makes intuitive sense from just a geometry standpoint and also fixed one particular vendor of 3D lidars. I also tested this on my Ouster 3D lidar and it worked smashingly so that gave me 2 professional 360 lidar datapoints that this was the correct move.
The size_t count = 360 * 8;
line in the RPLIDAR driver is the developer initializing the array for the readings with an "absolute maximum" value. That count is never actually used.
It may not be the best C++ code, but it doesn't affect the results coming out of the driver.
The RPLIDAR hardware does not return the same number of values on every rotation. It also returns an angle for each measurement it returns, because they are not guaranteed to be equally spaced, or at the exact same point each time. The spinning motor may be at a different point on each rotation, so it returns the exact angle for each measurement.
None of this is what ROS wants though.
This is all reasonable, but just not how the RPLIDAR mechanically functions.
That is where the poorly explained "angle compensation" comes in. The angle compensate code takes all of the inputs, and spreads them evenly across the known FOV of the unit using an imaginary target "resolution", based on the RPLIDAR hardware's reported operating resolution.
Some refactors of the driver get the exact first and last angle on every update, and use those instead of a "fixed" start and stop point, however
that causes the angle_min and angle_max to change on every update, which I don't think most mapping tools even look at after the first packet. Correct me if I am wrong on this.
However, the difference between the hard coded min/max and the reported min/max form the hardware is fractions of a degree. Too small to matter, and in my testing, this didn't change the mapping at all.
In fact, I played with all sorts of things and it is hard to find a change that affects the output in a meaningful way.
At the end of the day, the output just needs to comply with what ROS expects, which it does as far as I can tell, aside from this one thing now that other mapping tools weren't checking for.
So, my feelings based on testing and reading the RPLIDAR driver and some refactoring of it:
@SteveMacenski I think you hit on the right answer with
I'd be curious if you even noticed a difference if you changed the MAX_SAMPLE_COUNT to 1439.
However, the count is part of a calculation based on the particular RPLIDAR's reported resolution at the time (it is different for different models and even for different modes for the same model), and I hate to mess with that code that works.
So I suggest going the other way around and telling RPLIDAR to use all 1440 (or whatever it is at the time, because different RPLIDAR models and modes result in a different count) readings when calculating the angle_increment:
https://github.com/chrisl8/rplidar_ros/commit/f21079fea8eca8946b5b4ae72f50b8d9f1ac46a2
Replace:
scan_msg.angle_increment = (scan_msg.angle_max - scan_msg.angle_min) / (double)(node_count-1);
with
scan_msg.angle_increment = (scan_msg.angle_max - scan_msg.angle_min) / (double)(node_count);
I don't even know why the RPLIDAR developer subtracted 1 from the count to begin with.
My experimentation shows that this works fine. As Steve suggested, the difference isn't actually noticeable. If somebody can demonstrate that a given frame from the RPLIDAR, published this way, results in failed mapping, I would be happy to analyze the .bag output and suggest a better solution, but I think this is good enough.
My justification for presenting this PR to the RPLIDAR driver maintainers will be Steve's confidence in the Velodyne driver and his explanation of why a 360 lidar should report differently than a non-360 lidar.
Sorry for being so long winded. I posted my thought process so that you can critique it if you see flaws, but do include details with any critiques.
I don't even know why the RPLIDAR developer subtracted 1 from the count to begin with.
Yeah, me either. Thanks for cleaning it up.
I'm pretty sure most mapping programs do not like to get a different angle_min, angle_max, angle_increment, and reading count on every update to the /scan topic, which is understandable.
More than that, I'm not sure that some will even function correctly with that kind of treatment.
However, the difference between the hard coded min/max and the reported min/max form the hardware is fractions of a degree. Too small to matter, and in my testing, this didn't change the mapping at all.
Who knows, this, or other derivatives of the data not being consistent each rotation could be the cause of things like in #281. That's why I want someone to find a reproducable way of causing this so that we can test with non-360 lidars or more consistent lidars (like from simulators) if we can even get it to happen.
https://github.com/Slamtec/rplidar_ros/pulls The RPlidar maintainers don't look active, I'm not sure I'd expect this to be merged anytime soon. They've never merged a pull request. You may want to try emailing them.
It may be worth forking a version of this onto ros-drivers and have that supersede this driver, but I'm not going to maintain it so someone else(s) would have to volunteer.
@Michael-Equi Does https://github.com/allenh1/rplidar_ros/pull/18 solve this issue for you? It does for me, but I didn't open the issue. :)
I think @allenh1 has taken "ownership" of the RPLIDAR driver for Foxy. I'm not sure if anybody is doing the same for previous ROS releases. The person who opened this issue is using Foxy, and I think they are using the RPLIDAR driver hosted by @allenh1
@SteveMacenski I agree, this opens up further concerns, but they might best be tracked on their own issues if we have a working workaround for this one?
I don't understand the question.
Sorry, never mind.
This issue is solved for me by the patches listed above.
Further, I have updated my comment on the end of the pinned RPLIDAR issue to reflect this solution: https://github.com/SteveMacenski/slam_toolbox/issues/198#issuecomment-668678121
Required Info:
Steps to reproduce issue
Using the simulated 360 degree scanner with the following urdf, run the online async slam toolbox node.
Expected behavior
A map should begin being built as soon as the node starts and should not emit the error
error [async_slam_toolbox_node-1] LaserRangeScan contains 360 range readings, expected 359
.Actual behavior
Mapping does not start correctly and the aforementioned error is repeatedly printed.
Additional information
I have narrowed it down to the
is_360_lidar
variable and the following lines https://github.com/SteveMacenski/slam_toolbox/blob/foxy-devel/src/laser_utils.cpp#L108-L111. Before commit 6dcab997bb531d695fca401ee3f17b2298ef1f46 the variableis_360_lidar
was set to false which seems to prevent this issue. After the commit, the sensor was correctly identified as a 360 lidar but that causes the above behavior to occur and therefore break the SLAM system.