From reinforcement-learning/2-cartpole/1-dqn/cartpole_dqn.py/train_model
def train_model(self):
    if len(self.memory) < self.train_start:
        return
    batch_size = min(self.batch_size, len(self.memory))
    mini_batch = random.sample(self.memory, batch_size)
    update_input = np.zeros((batch_size, self.state_size))
    update_target = np.zeros((batch_size, self.state_size))
    action, reward, done = [], [], []
    for i in range(self.batch_size):
        update_input[i] = mini_batch[i][0]
        action.append(mini_batch[i][1])
        reward.append(mini_batch[i][2])
        update_target[i] = mini_batch[i][3]
        done.append(mini_batch[i][4])
    target = self.model.predict(update_input)
    target_val = self.target_model.predict(update_target)
    for i in range(self.batch_size):
        # Q Learning: get maximum Q value at s' from target model
        if done[i]:
            target[i][action[i]] = reward[i]
        else:
            target[i][action[i]] = reward[i] + self.discount_factor * (
                np.amax(target_val[i]))
    # and do the model fit!
    self.model.fit(update_input, target, batch_size=self.batch_size,
                   epochs=1, verbose=0)
In this part of the code, why do you keep using self.batch_size in the two loops and in model.fit after taking the minimum of self.batch_size and the length of memory? Wouldn't the local batch_size be better there?
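To make the concern concrete, here is a minimal sketch of the same target-building logic using the local batch_size everywhere. It is not the repository's code: build_targets, zeros_q, and the toy memory are hypothetical names made up for illustration, and the two predict arguments stand in for self.model.predict and self.target_model.predict. With only 3 stored transitions and a configured batch size of 64, looping over range(64) would raise an IndexError, while looping over the clamped value works.

```python
import random
from collections import deque

import numpy as np


def build_targets(memory, configured_batch_size, state_size,
                  predict, predict_target, discount_factor=0.99):
    """Build (inputs, targets) for one DQN update (hypothetical helper)."""
    # Clamp once, then use the clamped value everywhere below.
    batch_size = min(configured_batch_size, len(memory))
    mini_batch = random.sample(list(memory), batch_size)

    update_input = np.zeros((batch_size, state_size))
    update_target = np.zeros((batch_size, state_size))
    action, reward, done = [], [], []

    for i in range(batch_size):  # local batch_size, not the configured one
        update_input[i] = mini_batch[i][0]
        action.append(mini_batch[i][1])
        reward.append(mini_batch[i][2])
        update_target[i] = mini_batch[i][3]
        done.append(mini_batch[i][4])

    target = predict(update_input)
    target_val = predict_target(update_target)

    for i in range(batch_size):
        if done[i]:
            target[i][action[i]] = reward[i]
        else:
            target[i][action[i]] = (
                reward[i] + discount_factor * np.amax(target_val[i]))
    return update_input, target


def zeros_q(states):
    # Hypothetical stand-in for model.predict: all-zero Q-values, 2 actions.
    return np.zeros((len(states), 2))


# Toy replay memory with fewer entries (3) than the configured batch size (64).
memory = deque(maxlen=2000)
for k in range(3):
    state = np.full(4, float(k))
    memory.append((state, k % 2, 1.0, state + 1.0, k == 2))

inputs, targets = build_targets(memory, 64, 4, zeros_q, zeros_q)
print(inputs.shape, targets.shape)  # (3, 4) (3, 2)
```

Note that if train_start is at least as large as self.batch_size, the early return guarantees len(self.memory) >= self.batch_size, so min(...) always equals self.batch_size and both spellings behave the same. The mismatch only matters when train_start is set below the batch size.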