reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in the ROS Gazebo simulator. Using a Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.

Changing the robot to a center-articulated four-wheeled robot #117

Closed: zonared closed this issue 2 months ago

zonared commented 4 months ago

Hi there Reinis, thank you so much for writing such a complete deep learning ROS system. It was pretty fun to watch the pioneer3dx robot learn how to navigate, then come back a day later and find it happily reaching its goal, so cool.

So, being inspired by that, I thought I would try to load my robot into the sim, which for the most part I have (see below), but I need to tweak a few things in your DRL code and I'm not sure quite how. Firstly, my robot interprets angular velocity as a steering angle (like the TEB local planner's cmd_angle_instead_rotvel option); will this affect things much? Also, steering is limited to ±0.5 rad. Finally, my robot can drive backwards, and I can't quite find the right place to allow this.

How would all these changes affect the actor and critic weights and so on? I also note that min_laser feeds into the reward, but my robot is a rectangle, so a radius value might not work that well, maybe? Again, thanks heaps.

[Screenshot attachment: artmule_sim]

reiniscimurs commented 4 months ago

Hi,

  1. You should be able to use a steering angle in principle. The TD3 code does not change; you only need to change how the command is executed in the env file (see the first sketch after this list). The TD3 architecture is not tied to how the action is executed and does not know what the action values actually mean.
  2. The actions are capped to ranges here: https://github.com/reiniscimurs/DRL-robot-navigation/blob/main/TD3/train_velodyne_td3.py#L344-L345
  3. For driving backwards you will probably need a different sensor FOV. The current implementation is hardcoded to a 180-degree view in the env file, but you can update this as you want.
  4. For the actor/critic weights, the short answer is: I don't know. I have not trained such robots and do not have insights there. Feel free to try out different parameters; at least in theory it should all work.
  5. You could try updating the min_laser-based reward function, but if you are still detecting collisions from a single min_laser value, I think it would be unnecessary. What this reward enforces is for the robot to keep a larger distance from obstacles, and even in your case that can quite easily be approximated by a circle (see the second sketch below).
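
For points 1-3, here is a minimal sketch of how the command execution could be adapted for a center-articulated robot. It assumes the env publishes a geometry_msgs/Twist through a ROS publisher (as velodyne_env.py does with self.vel_pub); the names publish_action, MAX_STEER, MAX_SPEED, and allow_reverse are hypothetical and added only for illustration, they are not part of the repo.

```python
import numpy as np
from geometry_msgs.msg import Twist

# Hypothetical constants for illustration; not defined in the repo.
MAX_STEER = 0.5   # steering limit in radians (the +/-0.5 rad articulation limit)
MAX_SPEED = 1.0   # top linear speed in m/s

def publish_action(vel_pub, action, allow_reverse=True):
    """Map a TD3 action in [-1, 1]^2 to a command for a center-articulated robot."""
    if allow_reverse:
        # Keep the full [-1, 1] range so negative values drive backwards.
        linear = float(action[0]) * MAX_SPEED
    else:
        # Behaviour used for the differential-drive robot: rescale [-1, 1] -> [0, 1]
        # so the robot only drives forward.
        linear = (float(action[0]) + 1) / 2 * MAX_SPEED

    # Interpret the second action as a steering angle, clamped to the articulation limit.
    steer = float(np.clip(action[1] * MAX_STEER, -MAX_STEER, MAX_STEER))

    vel_cmd = Twist()
    vel_cmd.linear.x = linear
    # With cmd_angle_instead_rotvel-style control, angular.z carries the steering angle.
    vel_cmd.angular.z = steer
    vel_pub.publish(vel_cmd)
```

Since the actor ends in a tanh layer, both action dimensions already come out in [-1, 1]; only this env-side mapping changes, and nothing in the TD3 networks themselves needs touching.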
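
And for point 5, the distance-keeping term in question looks roughly like the following (paraphrased from get_reward in the env file; treat the exact coefficients as an approximation rather than a verbatim copy):

```python
def get_reward(target, collision, action, min_laser):
    # Sparse terminal rewards for reaching the goal or colliding.
    if target:
        return 100.0
    if collision:
        return -100.0
    # r3 penalizes min_laser readings below 1 m, which is what nudges the robot
    # to keep clearance from obstacles; it only uses the single closest laser
    # reading, so the robot's footprint shape does not enter this term.
    r3 = lambda x: 1 - x if x < 1 else 0.0
    return action[0] / 2 - abs(action[1]) / 2 - r3(min_laser) / 2
```

If the rectangular footprint matters anywhere, it is in the collision threshold applied to min_laser, not in this shaping term.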