exalearn / EXARL

Scalable Framework for Reinforcement Learning

`Input 0 of layer "sequential" is incompatible with the layer` using DQN LSTM #260

Open aik7 opened 1 year ago

aik7 commented 1 year ago

Error message

Traceback (most recent call last):
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "exarl/driver/__main__.py", line 39, in <module>
    exa_learner.run()
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/base/learner_base.py", line 113, in run
    self.workflow.run(self)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/utils/profile.py", line 131, in wrapper_profile
    ret = func(*args, **kwargs)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 767, in run
    self.actor(exalearner, nepisodes)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/utils/introspect.py", line 311, in wrapper
    result = func(*args, **kwargs)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 669, in actor
    action, policy_type = exalearner.agent.action(self.current_state)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/agents/agent_vault/dqn.py", line 105, in action
    q_values = self._forward(observation)
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/agents/models/tf_model.py", line 212, in __call__  *
        ret = model(input, kwargs)
    File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/keras/engine/input_spec.py", line 264, in assert_input_compatibility
        raise ValueError(f'Input {input_index} of layer "{layer_name}" is '

    ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 1, 4), found shape=(1, 4)
aik7 commented 1 year ago

We need to change the input dimension.

It is not pretty, but the easiest fix is to reshape the observation whenever the model is an LSTM, i.e. observation = np.reshape(observation, (observation.shape[0], 1, observation.shape[1])), so a (1, 4) observation becomes (1, 1, 4) and matches the expected (batch, timesteps, features) shape from the error above.
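
A minimal sketch of that workaround (the helper name and the is_lstm flag are illustrative, not what dqn.py actually uses):

import numpy as np

def prepare_observation(observation, is_lstm):
    # Ensure a batch axis, then add a timestep axis for LSTM models so the
    # input matches the expected (batch, timesteps, features) shape.
    observation = np.asarray(observation, dtype=np.float32)
    if observation.ndim == 1:                        # e.g. raw CartPole state (4,)
        observation = observation[np.newaxis, :]     # -> (1, 4)
    if is_lstm and observation.ndim == 2:
        observation = observation[:, np.newaxis, :]  # (1, 4) -> (1, 1, 4)
    return observation

With this, a CartPole-v1 observation of shape (1, 4) would be fed to the model as (1, 1, 4), matching the expected shape (None, 1, 4) reported above.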

aik7 commented 1 year ago

@Jodasue, do we have to use the DQN_v2 class for LSTM in EXARL/exarl/agents/agent_vault/dqn.py?

aik7 commented 1 year ago

If I use --agent DQN-v2 with LSTM, I get another error.

$ python exarl/driver/ --env CartPole-v1 --n_episodes 1000 --n_steps 500 --learner_procs 1 --workflow sync --agent DQN-v2 --model_type LSTM
---------------STARTING REPLACEMENT IB 0 ---------------
Traceback (most recent call last):
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "exarl/driver/__main__.py", line 39, in <module>
    exa_learner.run()
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/base/learner_base.py", line 113, in run
    self.workflow.run(self)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/utils/profile.py", line 131, in wrapper_profile
    ret = func(*args, **kwargs)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 767, in run
    self.actor(exalearner, nepisodes)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/utils/introspect.py", line 311, in wrapper
    result = func(*args, **kwargs)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 669, in actor
    action, policy_type = exalearner.agent.action(self.current_state)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/agents/agent_vault/dqn.py", line 279, in action
    q_values = self._forward(observation)
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:

    File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/agents/models/tf_model.py", line 212, in __call__  *
        ret = model(input, kwargs)
    File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/keras/utils/control_flow_util.py", line 125, in constant_value
        if pred in {0, 1}:  # Accept 1/0 as valid boolean values

    TypeError: Exception encountered when calling layer "batch_normalization" (type BatchNormalization).

    unhashable type: 'dict'

    Call arguments received:
      • inputs=tf.Tensor(shape=(1, 1, 56), dtype=float32)
      • training={}
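
The training={} in the call arguments points at how the model is invoked: tf_model.py line 212 calls ret = model(input, kwargs), so the kwargs dict is bound to the second positional parameter (training) instead of being unpacked. A minimal sketch reproducing that failure mode (layer sizes are illustrative; the presumed fix is to unpack kwargs):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, input_shape=(1, 56)),
    tf.keras.layers.BatchNormalization(),
])
x = tf.zeros((1, 1, 56))
kwargs = {}

# model(x, kwargs)   # dict lands in the `training` slot ->
#                    # TypeError: unhashable type: 'dict' in BatchNormalization
model(x, **kwargs)   # presumably intended call; same as model(x) when kwargs is empty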
aik7 commented 1 year ago

Using the trajectory buffer and DQN-v2 for LSTM, we still get the same error.

$ python exarl/driver/ --env CartPole-v1 --n_episodes 100 --n_steps 100 --learner_procs 1 --workflow async --agent DQN-v2 --model_type LSTM --buffer='TrajectoryBuffer' 

---------------STARTING REPLACEMENT IB 0 ---------------
Traceback (most recent call last):
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "exarl/driver/__main__.py", line 39, in <module>
    exa_learner.run()
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/base/learner_base.py", line 113, in run
    self.workflow.run(self)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/utils/profile.py", line 131, in wrapper_profile
    ret = func(*args, **kwargs)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 767, in run
    self.actor(exalearner, nepisodes)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/utils/introspect.py", line 311, in wrapper
    result = func(*args, **kwargs)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 669, in actor
    action, policy_type = exalearner.agent.action(self.current_state)
  File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/agents/agent_vault/dqn.py", line 279, in action
    q_values = self._forward(observation)
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:

    File "/home/kagawa/Projects/ExaLearn/EXARL/exarl/agents/models/tf_model.py", line 212, in __call__  *
        ret = model(input, kwargs)
    File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/home/kagawa/anaconda3/envs/exarl/lib/python3.7/site-packages/keras/utils/control_flow_util.py", line 125, in constant_value
        if pred in {0, 1}:  # Accept 1/0 as valid boolean values

    TypeError: Exception encountered when calling layer "batch_normalization" (type BatchNormalization).

    unhashable type: 'dict'

    Call arguments received:
      • inputs=tf.Tensor(shape=(1, 1, 56), dtype=float32)
      • training={}
aik7 commented 1 year ago