ROBOTIS-GIT / turtlebot3_machine_learning

Apache License 2.0

Question: about training batch setting #23

Closed nanli42 closed 4 years ago

nanli42 commented 5 years ago

Hello,

Firstly thanks for your excellent example of DQN on TurtleBot3!

But it seems that there are some problems with the training batch setup in the trainModel() function from turtlebot3_machine_learning/turtlebot3_dqn/nodes/turtlebot3_dqn_stage_*:

X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)

Y_sample = q_value.copy()
Y_sample[0][actions] = next_q_value
Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)

if dones:
    X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
    Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)

If a transition is done, it means the robot reached the goal, hit an obstacle, or timed out, so it seems useless to save the next state. The rewards are already distinguished by done or not inside the getQvalue() function. So I think it is better to remove the if dones: ... block. Otherwise, a consequence could be: when we fit the model in

self.model.fit(X_batch, Y_batch, batch_size=self.batch_size, epochs=1, verbose=0)

the number of rows in X_batch and Y_batch can exceed batch_size.
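The overflow is easy to reproduce in isolation. Below is a minimal sketch (with made-up sizes and random values, not the repo's actual network or replay buffer) of the batch-building loop quoted above: one (X, Y) row is appended per sampled transition, and a second row is appended whenever dones is True, so the arrays grow past batch_size.

```python
import numpy as np

# Hypothetical sizes for illustration only.
state_size, action_size, batch_size = 4, 5, 8

X_batch = np.empty((0, state_size), dtype=np.float64)
Y_batch = np.empty((0, action_size), dtype=np.float64)

rng = np.random.default_rng(0)
for _ in range(batch_size):              # one pass per sampled transition
    states = rng.random(state_size)
    next_states = rng.random(state_size)
    rewards, dones = 1.0, True           # worst case: every transition is terminal

    # Normal path: one row per transition.
    X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)
    Y_batch = np.append(Y_batch, rng.random((1, action_size)), axis=0)

    # The questioned branch: a second row per terminal transition.
    if dones:
        X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
        Y_batch = np.append(Y_batch, np.array([[rewards] * action_size]), axis=0)

print(len(X_batch), batch_size)  # 16 rows even though batch_size is 8
```

With every transition terminal, the arrays end up with 2 * batch_size rows, which is what makes the later model.fit(..., batch_size=self.batch_size, ...) call see more samples than the sampled mini-batch was supposed to contain.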

Maybe I am wrong; if so, please let me know! Thanks in advance! :) Nan

kijongGil commented 5 years ago

Hi @nanli42, thanks for your contribution. I agree with your opinion, but I'm busy with other projects at the moment. I will test it and let you know. Thank you for your interest.

Gilbert.

JaehyunShim commented 4 years ago

@nanli42

Thank you for your comments. We will take them into consideration when we release the next update (ROS 2 Dashing Diademata).

Thank you very much, Ryan