reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

Different ways of obtaining laser_state in def step and def reset? #4

Closed AgentEXPL closed 2 years ago

AgentEXPL commented 2 years ago

In `def step`, the following code obtains `laser_state`. The final `laser_state` seems to depend only on `self.velodyne_data`, which is updated by the `velodyne_callback` function of the subscriber to `/velodyne_points`. The line `laser_state = np.array(data.ranges[:])` seems to be useless here, since it is immediately overwritten.

    data = None
    while data is None:
        try:
            data = rospy.wait_for_message('/r1/front_laser/scan', LaserScan, timeout=0.5)
        except:
            pass

    laser_state = np.array(data.ranges[:])
    v_state = []
    v_state[:] = self.velodyne_data[:]
    laser_state = [v_state]
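For reference, a rough sketch of what a `velodyne_callback` of this kind presumably computes (the actual implementation is not shown in this thread, so the function name, gap count, and max range here are assumptions): each point of the cloud is mapped to an angular gap, and only the minimum distance per gap is kept.

```python
import numpy as np

def bin_point_cloud(points, num_gaps=20, max_range=10.0):
    """Hypothetical sketch: reduce a list of (x, y) points to a fixed-size
    state by keeping the minimum distance inside each angular gap."""
    state = np.ones(num_gaps) * max_range
    for x, y in points:
        angle = np.arctan2(y, x)                         # -pi .. pi
        dist = np.hypot(x, y)
        gap = int((angle + np.pi) / (2 * np.pi) * num_gaps)
        gap = min(gap, num_gaps - 1)                     # clamp angle == pi
        state[gap] = min(state[gap], dist)
    return state
```

Gaps with no points keep the default `max_range` value, which mimics how an out-of-range laser return is treated.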

However, in `def reset`, the following code obtains `laser_state` differently: by waiting for a message on `/r1/front_laser/scan` and binning its ranges.

    data = None
    while data is None:
        try:
            data = rospy.wait_for_message('/r1/front_laser/scan', LaserScan, timeout=0.5)
        except:
            pass
    laser_state = np.array(data.ranges[:])
    laser_state[laser_state == inf] = 10
    laser_state = binning(0, laser_state, 20)
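The `binning` helper itself is not shown in this snippet. A plausible sketch, assuming it splits the scan into `quantity` equal bins and keeps the minimum range in each (the name matches the call above, but the body here is a guess):

```python
import numpy as np

def binning(lower_bound, data, quantity):
    # Hypothetical reconstruction: split the scan into `quantity` equal
    # bins and keep the minimum range in each bin. `lower_bound` is kept
    # only to match the call signature above.
    width = len(data) // quantity
    bins = [min(data[i * width:(i + 1) * width]) for i in range(quantity)]
    return np.array([bins])
```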

I'm new to ROS, and I do not understand why two different ways of obtaining `laser_state` are needed in `def reset` and `def step`. It would be of great help if more explanation could be provided.

reiniscimurs commented 2 years ago

The short answer is that the 3D Velodyne environmental representation was built on top of the 2D laser data implementation, so the 2D laser code is still present in the program.

But you will notice that in the "step" function the next line, `done, col, min_laser = self.calculate_observation(data)`, actually uses the 2D laser data to detect collisions. The 3D Velodyne data is used as the state representation later on. I left the 2D laser code in so that, with a little bit of coding, you could discard the Velodyne data and use only the 2D lidar data.
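A minimal sketch of the collision check described here (the threshold value and the exact return signature are assumptions, not taken from the repository):

```python
import numpy as np

COLLISION_DIST = 0.35  # assumed threshold, not from this thread

def calculate_observation(ranges, collision_dist=COLLISION_DIST):
    """Hypothetical sketch: flag a collision when the closest 2D laser
    return is below the threshold."""
    laser = np.array(ranges, dtype=float)
    laser[laser == np.inf] = 10.0       # treat out-of-range as far away
    min_laser = laser.min()
    collided = bool(min_laser < collision_dist)
    done = collided                      # episode ends on collision
    return done, collided, min_laser
```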

For the "reset" function, that's just me being a little lazy. After resetting the robot (after a collision, reaching the goal, or running out of steps), the neural network needs a state as input to calculate the first action, so a laser scan is required. Technically, you could scan the environment with the 3D Velodyne laser and output that in the "reset" function. But the 2D laser scan provides very similar data right after a reset, since the laser data is binned and only the minimal laser value in each bin is used. So I just left the 2D laser scan as the state representation for the first state after a reset, since it was already there from an older version of the code. This has no real impact on the training process.

AgentEXPL commented 2 years ago

Thanks, that's great. As someone new to ROS, I'm happy to see these examples, which teach me how to use ROS.