reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.

critic network #7

Closed by AgentEXPL 2 years ago

AgentEXPL commented 2 years ago

https://github.com/reiniscimurs/DRL-robot-navigation/blob/dea37acfc65f702f7fa792787e09602416cf85d4/TD3/velodyne_td3.py#L76

The code linked above assembles the features from the action and the features from the state. Is this a standard operation in the actor-critic framework? Could you explain the motivation in more detail? Why not use the average of self.layer_2_s(s1) and self.layer_2_a(a)?

reiniscimurs commented 2 years ago

Take a look at the explanation here. This proved to work better for me than other methods.
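
For context, here is a minimal sketch of the state-action fusion pattern being discussed (layer names and sizes are illustrative and not the repository's exact code): the critic projects the intermediate state features and the raw action through separate linear layers and sums the two projections before the next activation, rather than concatenating or averaging them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Critic(nn.Module):
    """Illustrative critic head that fuses state and action by summing
    two separate linear projections (sketch, not the repo's exact code)."""

    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.layer_1 = nn.Linear(state_dim, 800)      # state encoder
        self.layer_2_s = nn.Linear(800, 600)          # projects state features
        self.layer_2_a = nn.Linear(action_dim, 600)   # projects the raw action
        self.layer_3 = nn.Linear(600, 1)              # Q-value output

    def forward(self, s, a):
        s1 = F.relu(self.layer_1(s))
        # Sum of the two projections: state and action both contribute to
        # every unit of the 600-dim hidden layer, and the gradient w.r.t.
        # the action flows back through layer_2_a.
        q = F.relu(self.layer_2_s(s1) + self.layer_2_a(a))
        return self.layer_3(q)
```

Note that averaging the two projections instead of summing them would only rescale the result by 0.5, a factor the learned weights of layer_2_s and layer_2_a can absorb, so the two variants differ only in initialization scale rather than in expressive power.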