ROBOTIS-GIT / turtlebot3_machine_learning

Apache License 2.0

Question: about training batch setting #23

Closed nanli42 closed 4 years ago

nanli42 commented 5 years ago

Hello,

Firstly thanks for your excellent example of DQN on TurtleBot3!

But it seems that there are some problems with the training batch setup in the trainModel() function from turtlebot3_machine_learning/turtlebot3_dqn/nodes/turtlebot3_dqn_stage_*:

X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)

Y_sample = q_value.copy()
Y_sample[0][actions] = next_q_value
Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)

if dones:
    X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
    Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)

If a transition is done, it means the robot reached the goal, hit an obstacle, or timed out, so it seems useless to save the next state. The rewards are already distinguished by done or not inside the getQvalue() function. So I think it is better to remove the if dones: ... block. Otherwise, a consequence could be: when we fit the model in

self.model.fit(X_batch, Y_batch, batch_size=self.batch_size, epochs=1, verbose=0)

the number of rows in X_batch and Y_batch can exceed batch_size.
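The overflow is easy to reproduce in isolation. Below is a minimal sketch (with made-up sizes and random values, not the repo's actual network or replay buffer) of the batch-building loop quoted above: one (X, Y) row is appended per sampled transition, and a second row is appended whenever dones is True, so the arrays grow past batch_size.

```python
import numpy as np

# Hypothetical sizes for illustration only.
state_size, action_size, batch_size = 4, 5, 8

X_batch = np.empty((0, state_size), dtype=np.float64)
Y_batch = np.empty((0, action_size), dtype=np.float64)

rng = np.random.default_rng(0)
for _ in range(batch_size):              # one pass per sampled transition
    states = rng.random(state_size)
    next_states = rng.random(state_size)
    rewards, dones = 1.0, True           # worst case: every transition is terminal

    # Normal path: one row per transition.
    X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)
    Y_batch = np.append(Y_batch, rng.random((1, action_size)), axis=0)

    # The questioned branch: a second row per terminal transition.
    if dones:
        X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
        Y_batch = np.append(Y_batch, np.array([[rewards] * action_size]), axis=0)

print(len(X_batch), batch_size)  # 16 rows even though batch_size is 8
```

With every transition terminal, the arrays end up with 2 * batch_size rows, which is what makes the later model.fit(..., batch_size=self.batch_size, ...) call see more samples than the sampled mini-batch was supposed to contain.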

Maybe I am wrong; if so, please let me know! Thanks in advance! :) Nan

kijongGil commented 5 years ago

Hi @nanli42, thanks for your contribution. I agree with your opinion, but I'm busy with other projects at the moment. I will test it and let you know. Thank you for your interest.

Gilbert.

JaehyunShim commented 4 years ago

@nanli42

Thank you for your comments. We will take them into consideration when we release the next update (ROS 2 Dashing Diademata).

Thank you very much, Ryan