ROBOTIS-GIT / turtlebot3_machine_learning

Apache License 2.0

Question: Go a little further #20

Closed ColinJou closed 4 years ago

ColinJou commented 5 years ago

Hello everyone, I have just finished the tutorial on training the TurtleBot3 in Gazebo, specifically on world 3, "Moving Obstacle". Now I would like to know whether it is possible to test the result of this training easily enough, perhaps by creating a new world with new obstacle routes and loading the last .h5 file obtained to test its abilities. I should point out that I unfortunately don't know much about machine learning or the use of all these programs. :/ Thank you in advance for your help!

kijongGil commented 5 years ago

Yes, if you have the .h5 file, you can apply it to a new world with new obstacles (see the evaluation sketch below). Unfortunately, I can't explain machine learning or the use of all these programs to you here. If you want to create a new obstacle, please refer to our Gazebo environment.

Thanks, Gilbert.
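
For reference, a minimal evaluation sketch is below. It is not an official node of this repository; the import path, node name, and checkpoint filename are assumptions, and it assumes the .h5 file was saved with model.save() during training:

```python
#!/usr/bin/env python
# Minimal evaluation sketch, NOT an official node: load a trained model
# and drive the robot greedily in the Gazebo environment. The import
# path, node name, and checkpoint filename below are assumptions.
import numpy as np
import rospy
from keras.models import load_model
from src.turtlebot3_dqn.environment_stage_3 import Env  # assumed path

rospy.init_node('turtlebot3_dqn_test')

action_size = 5                         # the 5 discrete steering actions
env = Env(action_size)
model = load_model('stage_3_ep600.h5')  # hypothetical checkpoint name

state = env.reset()
done = False
while not done and not rospy.is_shutdown():
    # pick the action with the highest predicted Q-value (no exploration)
    q_values = model.predict(np.asarray(state).reshape(1, -1))
    action = int(np.argmax(q_values[0]))
    state, reward, done = env.step(action)
```

The key difference from training is that the action is always the argmax of the predicted Q-values, with no epsilon-greedy exploration.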

ColinJou commented 5 years ago

Thank you for your answer! I have tried some improvements to the reward formula, but none of them have been effective, unfortunately (I am still in the process of learning machine learning). Now I would like to try adding a sixth action, for example a "stop" that the robot could perform instead of turning or going straight ahead. Unfortunately, I couldn't find the part of the code that manages this, except for the "action state", which only manages the graphical part. Is it possible to add a sixth action?

kijongGil commented 5 years ago

Hi @ColinJou, yes, you can add a sixth action.

First, change action_size to 6:
https://github.com/ROBOTIS-GIT/turtlebot3_machine_learning/blob/017741602c4356827eb09ca5caa2c84dc01d74fa/turtlebot3_dqn/nodes/turtlebot3_dqn_stage_1#L152

If you understand the reward formula in this code, you will see that the angular velocity and the reward are determined by the value of the action: if the action value is 0, vel_cmd.angular.z is 1.5, and if the action value is 4, vel_cmd.angular.z is -1.5.
https://github.com/ROBOTIS-GIT/turtlebot3_machine_learning/blob/017741602c4356827eb09ca5caa2c84dc01d74fa/turtlebot3_dqn/src/turtlebot3_dqn/environment_stage_1.py#L122

If you want a 'stop' action, you have to set vel_cmd.linear.x = 0 when the action value is the stop action. You also have to add a 'stop' reward:
https://github.com/ROBOTIS-GIT/turtlebot3_machine_learning/blob/017741602c4356827eb09ca5caa2c84dc01d74fa/turtlebot3_dqn/src/turtlebot3_dqn/environment_stage_1.py#L92
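
For illustration, here is a minimal sketch of that action-to-velocity mapping with a stop action added. The function name action_to_twist and the choice of index 5 for 'stop' are assumptions for the example, not the repository's code:

```python
# Minimal sketch (not the repository's exact code): map the 6 discrete
# actions to a Twist, following the pattern of step() in
# environment_stage_1.py. Treating index 5 as 'stop' is an assumption.
from geometry_msgs.msg import Twist

STOP_ACTION = 5          # hypothetical index for the new 'stop' action
MAX_ANGULAR_VEL = 1.5    # matches max_angular_vel in the original step()

def action_to_twist(action):
    vel_cmd = Twist()
    if action == STOP_ACTION:
        # new behavior: stand still instead of turning or driving forward
        vel_cmd.linear.x = 0.0
        vel_cmd.angular.z = 0.0
    else:
        # original 5-action mapping: 0 -> +1.5 rad/s, 2 -> 0.0, 4 -> -1.5 rad/s
        vel_cmd.linear.x = 0.15
        vel_cmd.angular.z = ((5 - 1) / 2.0 - action) * MAX_ANGULAR_VEL * 0.5
    return vel_cmd
```

As described above, setReward() in environment_stage_1.py would also need a reward case for the new action index, so that stopping is neither always punished nor free.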

ColinJou commented 5 years ago

Thank you very much for your help! I finally managed to get the addition of a sixth action to compile (the results are not yet up to my expectations ^^). I still have two more questions due to my lack of knowledge:

  1. What does the second graph displayed during training represent?
  2. How is the "state" used by the network defined in this code?

kijongGil commented 5 years ago
  1. The second graph is the Q-value. If you want to know the meaning of the Q-value, I recommend reading the papers or searching Google. Simply put, the higher the Q-value, the better the expected result (see the sketch after this list).
  2. This state is just a custom state. As you may know, in the Atari games the state is an image. I created this code by referring to https://github.com/floodsung/DQN-Atari-Tensorflow
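
To make both points concrete, here is a small self-contained sketch; the state layout and the stand-in model are illustrative assumptions, not the repository's exact code:

```python
# Illustrative sketch (names are assumptions, not the repository's code):
# a "custom state" is just a numeric vector, and the plotted Q-value is
# the network's estimate of future reward for the best action in a state.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Hypothetical custom state: 24 downsampled laser ranges plus heading and
# distance to the goal, similar in spirit to the state in turtlebot3_dqn.
scan_ranges = np.random.uniform(0.2, 3.5, size=24)   # stand-in sensor data
heading, goal_distance = 0.1, 1.8
state = np.concatenate([scan_ranges, [heading, goal_distance]])

# Untrained stand-in for the DQN, just to make the example self-contained.
q_model = Sequential([Dense(64, activation='relu', input_shape=(26,)),
                      Dense(6)])                      # 6 actions

q_values = q_model.predict(state.reshape(1, -1))[0]
print('max Q:', q_values.max())      # the quantity plotted during training
print('greedy action:', q_values.argmax())
```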
JaehyunShim commented 4 years ago

There hasn't been any reply from the questioner, so I am closing this issue.