inarikami / keras-rl2

Reinforcement learning with TensorFlow 2 and Keras
MIT License

Does anyone know why the DQN agent adds an extra dimension to the array? #7

Open TheTrash opened 4 years ago

TheTrash commented 4 years ago

So this is my setup:

keras==2.3.1 keras-rl2 tensorflow-gpu==2.0.0-beta1 numpy==1.16.4

And when I fit the DQN agent, my model goes completely crazy.

# Build the NN (imports added for completeness; keras-rl2 expects a
# tensorflow.keras model)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D, Dense, Flatten
from tensorflow.keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import BoltzmannGumbelQPolicy

model = Sequential()
model.add(Convolution2D(32, (8, 8), strides=(4, 4), input_shape=input_shape, activation='relu'))
model.add(Convolution2D(64, (4, 4), strides=(2, 2), activation='relu'))
model.add(Convolution2D(64, (2, 2), strides=(1, 1), activation='relu'))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(nb_actions, activation='softmax'))
print(model.summary())

policy = BoltzmannGumbelQPolicy()
memory = SequentialMemory(limit=50000, window_length=1)
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, train_interval=1,
               nb_steps_warmup=5000, target_model_update=10000, policy=policy)

dqn.compile(Adam(lr=1e-4), metrics=['mae'])

This is the error:

ValueError: Error when checking input: expected conv2d_input to have 4 dimensions, but got array with shape (1, 1, 240, 256, 3) 

But it starts here:

/usr/local/lib/python3.6/dist-packages/rl/core.py in fit(self, env, nb_steps, action_repetition, callbacks, verbose, visualize, nb_max_start_steps, start_step_policy, log_interval, nb_max_episode_steps)

at line 169 of core.py

I'm quite new to this library; maybe I'm missing something.
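
For illustration, the failure is easy to reproduce outside the agent: a Conv2D network built for 4-D input (batch, height, width, channels) rejects the 5-D batch shown in the traceback. This is a minimal sketch, not code from the thread.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D

# Model expects 4-D batches of (240, 256, 3) frames.
model = Sequential()
model.add(Convolution2D(32, (8, 8), strides=(4, 4),
                        input_shape=(240, 256, 3), activation='relu'))

# The agent hands it (batch, window, H, W, C) instead.
bad_batch = np.zeros((1, 1, 240, 256, 3))
model.predict(bad_batch)  # raises the ValueError quoted above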

mmansky-3 commented 4 years ago

Related to #11; it exhibits the same behaviour of adding a dimension. As a quick fix, add a squeeze layer via keras.backend.squeeze, as in the sketch below.
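
A minimal sketch of that quick fix (assuming the spurious axis is the window axis at position 1): declare the window in input_shape, then squeeze it away before the convolutions.

from tensorflow.keras import backend as K
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D, Lambda

model = Sequential()
# Accept (batch, 1, 240, 256, 3) and drop the window axis, so the
# convolutions see (batch, 240, 256, 3).
model.add(Lambda(lambda x: K.squeeze(x, axis=1),
                 input_shape=(1, 240, 256, 3)))
model.add(Convolution2D(32, (8, 8), strides=(4, 4), activation='relu'))
# ... rest of the network unchanged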

TheTrash commented 4 years ago

I appreciate your response and your explanation. I've now apparently solved my issue by searching the keras-rl (not 2) repo for solutions in various issues.

And here is my (for now) solution: #229. I hope it can be helpful.

TheTrash commented 4 years ago

For the next people who hit this issue, this is how I've changed the method in dqn.py (and I suppose the same change applies in the other agents of this library):

# in dqn.py
def process_state_batch(self, batch):
    # batch = numpy.array(batch)
    # print(batch)
    if self.processor is None:
        # return numpy.squeeze(batch, axis=1)
        return batch
    return self.processor.process_state_batch(batch)

I suppose the culprit is the numpy.array call, which adds the extra dimension for some reason. But, as you can see, I'm not using a processor, so I'm not sure whether the same fix is valid for the processor's method.
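
A small demonstration of that suspicion (illustration only): each state in the batch is itself a window of observations, so numpy.array stacks both nesting levels into leading axes.

import numpy as np

obs = np.zeros((240, 256, 3))  # one raw frame from the env
state = [obs]                  # window_length = 1 -> one-frame window
batch = [state]                # batch containing a single state
print(np.array(batch).shape)   # (1, 1, 240, 256, 3), the failing shape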

vachanda commented 4 years ago

After some searching, I found the issue in the function compute_q_values. In that function, the state is being forcefully wrapped in another list:

    def compute_q_values(self, state):
        q_values = self.compute_batch_q_values([state]).flatten()
        assert q_values.shape == (self.nb_actions,)
        return q_values

After removing the explicit list conversion, it works fine for me.
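
For reference, a sketch of the method after that change (assuming, as in this thread, that state already carries the axes the model expects):

    def compute_q_values(self, state):
        # Pass the state through unchanged instead of wrapping it in a new
        # list, so no extra leading axis gets stacked onto the batch.
        q_values = self.compute_batch_q_values(state).flatten()
        assert q_values.shape == (self.nb_actions,)
        return q_values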