Localization Module and TF fail due to System Requirements?

marcusvinicius178 commented 3 years ago

Hi LG team. Good morning!

We recently discussed the system requirements for running AD stack modules properly: https://github.com/lgsvl/simulator/issues/1352

At the moment I am having an issue with the Perception module of the Autoware.AI AD stack, as I described here:

https://answers.ros.org/question/377659/occupangygrid-map-issue-ray_ground_filter-fails-to-generate-costmap-on-autowareai/

https://github.com/lgsvl/simulator/issues/1401

This issue prevents me from moving on to agent recognition and from testing the obstacle avoidance algorithms.

While debugging these issues, I noticed that the transform between the velodyne frame and base_link is delayed.

I suspect that this error leads to the wrong generation of the costmap/occupancy_grid in Autoware.AI.

For this reason I tried the Apollo 5.0 AD stack. To my surprise, I got a similar error:


"RAW IMU Message Delay. IMU message is 483 cycle 4.83649 sec behind current time."

This error starts after launching the Localization module (with the TF module already running). But now the error is with the IMU, instead of the Velodyne as in Autoware.AI.
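As a rough illustration of the arithmetic in that message: the two numbers are consistent with each other if one "cycle" is 0.01 s, i.e. a 100 Hz IMU rate. That rate is my assumption, inferred only from the message text; the function name `delay_in_cycles` is hypothetical and not part of Apollo.

```python
def delay_in_cycles(delay_sec, rate_hz=100.0):
    """Return how many whole sensor cycles fit into a delay.

    rate_hz=100.0 is an assumption inferred from the warning text
    (483 cycles for about 4.83 s), not a value read from Apollo.
    """
    return int(delay_sec * rate_hz)


delay_sec = 4.83649  # delay reported in the warning
print(delay_in_cycles(delay_sec))
```

So the warning seems to say that the newest IMU sample is almost five seconds older than the clock the Localization module compares it against.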

Do you believe these TF errors are related to the system requirements we discussed previously? I am not sure, because my machine was able to simulate autonomous driving from point A to point B in Autoware.AI and also in Autoware.Auto with all the basic modules running (Map/Sensing/Localization/Detection/Mission_Planning/Motion_Planning). However, I confess, the terminal often showed warning messages that the TF was experiencing issues...

If the cause of this issue is GPU/computer power, etc., could you please explain this error and how it relates to the system requirements? And why are the TF and Localization modules the ones mainly affected? I do not understand what happens in the processing underneath.

Is there a way to get around this error in code? I mean working on TF time as this link suggests: http://wiki.ros.org/tf/Tutorials/tf%20and%20Time%20%28C%2B%2B%29 Or would I be wasting my time with this effort?

Thanks in advance.

EricBoiseLGSVL commented 3 years ago

Have you tried the clock sensor with Apollo 6.0 yet? @hadiTab, doesn't Autoware support the clock sensor as well?

marcusvinicius178 commented 3 years ago

Hi @EricBoiseLGSVL, thanks for the answer.

I believe the clock sensor is enabled for Autoware, since I turned it on in a terminal with the command: $ rosparam set /use_sim_time true However, using it causes a lot of issues in the Localization module and makes localization fail. I am not sure whether this happens only in Autoware.AI or in other AD stacks as well...

I have not tried Apollo 6.0 yet. My attempt was with Apollo 5.0. I think it may be a good idea to upgrade to 6.0. I hope the clock sensor works for Apollo and solves this TF/Localization issue.
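For context, here is a minimal sketch of why the choice of clock matters for delay warnings. It is plain Python, not ROS code. All numbers, the 0.1 s tolerance, and the names `message_age` and `is_stale` are hypothetical, and both clocks are assumed to start at zero for simplicity.

```python
def message_age(now_sec, stamp_sec):
    """Age of a message as seen by a node whose clock reads now_sec."""
    return now_sec - stamp_sec


def is_stale(now_sec, stamp_sec, tolerance_sec=0.1):
    """True if the message is older than the (hypothetical) tolerance."""
    return message_age(now_sec, stamp_sec) > tolerance_sec


# Hypothetical scenario: the simulator runs at half real-time speed
# on an overloaded machine.
real_time_factor = 0.5
wall_now = 10.0                         # seconds of wall-clock time elapsed
sim_now = wall_now * real_time_factor   # simulated time elapsed: 5.0 s
stamp = sim_now                         # message stamped with simulated time

print(is_stale(wall_now, stamp))  # node on wall time: message looks 5 s old
print(is_stale(sim_now, stamp))   # node on simulated time: message is fresh
```

If something like this is what happens, a slow simulation would only produce delay warnings in nodes that compare simulated stamps against wall time, which is why the setting has to be consistent across all nodes.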

But in the case of Autoware.AI and this issue: https://github.com/lgsvl/simulator/issues/1401

I believe the error is probably in ray_ground_filter: either in the launch file ray_ground_filter.launch, or in the node source ray_ground_filter.cpp and/or the header ray_ground_filter.h.

Debugging with the RQT tool, I can see that this node receives data from the config/ray_ground_filter topic but fails to process it, and therefore fails to publish on the topic /points_non_ground. That is the topic responsible for saying: "Ok Costmap generator, there is a car over the ground, it is an obstacle! Please mark the cell of the occupancy grid map as busy, and assign 1 to it!"

What I have not figured out yet is whether the delay in the velodyne to base_link transform is what keeps the ray_ground_filter node from working properly, or whether the code has a bug.

And if the TF delay is the problem, I am convinced that, at least for Autoware.AI, it cannot be fixed with the clock sensor. I also wonder whether this is an issue of system requirements. (I think it is not, because I tested this on my powerful notebook and on my relative's desktop, each on its own via TeamViewer, and the TF stays delayed. I am going to test both working together tomorrow to see whether the TF behaviour changes.) If not, I believe the way to fix it is to try some trick from the TF tutorials, or to go line by line through the Autoware core perception module, look for the error, and fix it.
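To make that idea concrete, here is a minimal sketch of how non-ground points could be turned into busy cells. This is not Autoware.AI's costmap generator code. The function name `mark_obstacles`, the grid size, and the resolution are hypothetical, and the points are assumed to be already expressed in the grid's own frame.

```python
import math


def mark_obstacles(points, width, height, resolution):
    """Build a 2D occupancy grid from (x, y) non-ground points.

    Cells are 0 (free) by default and set to 1 when at least one point
    falls inside them. Points outside the grid are ignored.
    """
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        col = math.floor(x / resolution)
        row = math.floor(y / resolution)
        if 0 <= col < width and 0 <= row < height:
            grid[row][col] = 1
    return grid


# Hypothetical input: one point inside a 4 x 4 grid of 0.5 m cells,
# one point far outside it.
grid = mark_obstacles([(0.6, 0.2), (5.0, 5.0)], width=4, height=4, resolution=0.5)
for row in grid:
    print(row)
```

If no non-ground points arrive at all, every cell stays at 0, which matches the empty costmap I am seeing.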

A complete debug video of the TF error in Autoware.AI is here: https://youtu.be/QiV0g0kE2Z8

If you can think of other possibilities or solutions that would save effort, please let me know :)

EricBoiseLGSVL commented 3 years ago

This is a great bug to find, please post anything you turn up. I'll ping @hadiTab so we can look into it as well. Thanks for the issue.

marcusvinicius178 commented 3 years ago

No worries. Thank you very much again for the support and attention. I need to fix this anyway so that I don't f*** up my master's :( hehehe. I am considering hiring a ROS/C++ expert to help me fix this code (if it is the problem, because I am good at Python but limited in C++): https://github.com/Autoware-AI/core_perception/blob/master/points_preprocessor/nodes/ray_ground_filter/ray_ground_filter.cpp

Would you have a name to suggest? Any help or ROS/LG trick is very welcome and can be sent by email to: marcusvini178@gmail.com Thanks in advance!

ABOUT APOLLO:

@EricBoiseLGSVL, regarding Apollo now... I tested Apollo on my notebook. I almost completed the self-driving maneuver as Steve did here: https://www.youtube.com/watch?v=Ucr0aM334_k&t=2266s

However, in my case the car did not drive by itself and got stuck: https://youtu.be/xUSJKKMeUBU

I believe this is because of system requirements, considering that I repeated the same steps as Steve Lemke. As for the IMU warning, it seems not to matter, since Steve also got this warning in his tutorial video on YouTube. Anyway, I will try running both the Autoware.AI and Apollo AD stacks this week on 2 connected machines. Afterwards I will let you know whether this solves the issues.

marcusvinicius178 commented 3 years ago

Hi @EricBoiseLGSVL, after renting a very powerful machine on paperspace.com and running the simulation as a distributed system (desktop + notebook), the TF stopped failing, as did the localization module, and therefore the occupancy_grid map was SOMETIMES correctly generated by the Autoware.AI perception module.

Yes, the perception module from Autoware.AI only works with a lot of GPU resources allocated to it. However, it is also not robust: it often does not work properly and the Lidar data does not get updated. From my point of view this could be considered a bug. That is why I am using the Apollo.Auto project now. Its perception module never fails; it is much better!

You can see these module failures in the video below: https://www.youtube.com/watch?v=9gbq60B6Yb0&list=PL_d_YiRmUA3nnTRmF8QJKyPoEQhrdvpP2&index=20

And check the complete problem description here, if interested:

https://answers.ros.org/question/377659/occupangygrid-map-issue-ray_ground_filter-fails-to-generate-costmap-on-autowareai/

https://answers.ros.org/question/377308/autowareai-perception-failure-agent-not-detected-in-rviz/


EricBoiseLGSVL commented 3 years ago

Yes, Autoware.ai has some issues. Great work with the videos describing the problem.


Localization Module and TF fail due to System Requirements? #1412