exalearn / EXARL

Scalable Framework for Reinforcement Learning

Problems in running ExaCH env with async workflow #230

Closed fc524079318 closed 2 years ago

fc524079318 commented 2 years ago

I am trying to run the ExaCH env with the async workflow, using the command mpiexec -np 4 python exarl/driver/__main__.py --workflow async. There seems to be a problem with the import path on line 2 of __main__.py: it is import exarl.utils.analyze_reward as ar, but the exarl folder is not under driver, so I get a ModuleNotFoundError. Could you please tell me the right import path for exarl?

I then changed learner_cfg.json to run the ExaCH env, and it runs correctly in the random workflow. But with the async workflow I get a tensorflow.python.framework.errors_impl.InvalidArgumentError:

Attempting to load exarl.agents.agent_vault with DQN
Creating ASYNC learner!
Traceback (most recent call last):
  File "start.py", line 24, in <module>
    exa_learner.run()
  File "/home/ai/fc/EXARL/exarl/base/learner_base.py", line 113, in run
    self.workflow.run(self)
  File "/home/ai/fc/EXARL/exarl/workflows/workflow_vault/async_learner.py", line 147, in run
    keep_running = self.actor(workflow, nepisodes)
  File "/home/ai/fc/EXARL/exarl/workflows/workflow_vault/sync_learner.py", line 588, in actor
    batch_data = next(exalearner.agent.generate_data())
  File "/home/ai/fc/EXARL/exarl/agents/agent_vault/dqn.py", line 347, in generate_data
    batch_target = list(map(self.calc_target_f, minibatch))
  File "/home/ai/fc/EXARL/exarl/utils/introspect.py", line 287, in wrapper
    result = func(*args, **kwargs)
  File "/home/ai/fc/EXARL/exarl/agents/agent_vault/dqn.py", line 309, in calc_target_f
    target = tf.add(reward, expectedQ)
  File "/home/ai/.local/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/ai/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 7164, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a double tensor but is a float tensor [Op:AddV2]
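
On the ModuleNotFoundError mentioned above: when Python is given a script path such as exarl/driver/__main__.py, it puts the script's own directory (exarl/driver) at the front of sys.path, not the directory the command was launched from. The import of exarl therefore only resolves if the repository root is on PYTHONPATH or the package is installed. The sketch below reproduces this with a throwaway package; the name toy_pkg_demo and its layout are hypothetical stand-ins, not EXARL's actual files.

```python
import os
import subprocess
import sys
import tempfile


def run_driver(as_module: bool) -> subprocess.CompletedProcess:
    """Build a toy package and run its driver as a script path or with -m."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout:
        #   root/toy_pkg_demo/__init__.py
        #   root/toy_pkg_demo/utils.py
        #   root/toy_pkg_demo/driver/__init__.py
        #   root/toy_pkg_demo/driver/__main__.py
        pkg_dir = os.path.join(root, "toy_pkg_demo")
        driver_dir = os.path.join(pkg_dir, "driver")
        os.makedirs(driver_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        open(os.path.join(driver_dir, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "utils.py"), "w") as f:
            f.write("VALUE = 42\n")
        with open(os.path.join(driver_dir, "__main__.py"), "w") as f:
            f.write("import toy_pkg_demo.utils as u\nprint(u.VALUE)\n")

        if as_module:
            # -m puts the current directory (root) on sys.path.
            cmd = [sys.executable, "-m", "toy_pkg_demo.driver"]
        else:
            # A script path puts the script's directory on sys.path instead.
            script = os.path.join("toy_pkg_demo", "driver", "__main__.py")
            cmd = [sys.executable, script]

        # Drop variables that would change sys.path and hide the effect.
        env = {k: v for k, v in os.environ.items()
               if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
        return subprocess.run(cmd, cwd=root, env=env,
                              capture_output=True, text=True)


if __name__ == "__main__":
    as_script = run_driver(as_module=False)
    print("script path exit code:", as_script.returncode)
    as_module = run_driver(as_module=True)
    print("-m exit code:", as_module.returncode, "output:", as_module.stdout.strip())
```

If EXARL follows the same layout, launching from the repository root with PYTHONPATH set to that root, or with python -m exarl.driver, may avoid the error. I have not verified either against the EXARL documentation, so treat both as assumptions.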

rvinaybharadwaj commented 2 years ago

I'm not sure if this is related to the environment or the workflow. Are you able to run a simple environment like CartPole-v0?

fc524079318 commented 2 years ago

I can run CartPole-v0 in both the async and random workflows, and I can run ExaCH-v0 in the random workflow.

rvinaybharadwaj commented 2 years ago

In EXARL/exarl/agents/agent_vault/dqn.py, line 309, try casting the reward to a double:

target = tf.add(tf.cast(reward, tf.float64), expectedQ)

rvinaybharadwaj commented 2 years ago

Let me know if this fixes the issue. I'll make changes to the code to force cast everything.

fc524079318 commented 2 years ago

I still get the same error after casting the reward to a double at dqn.py line 309:

  File "/home/ai/fc/EXARL/exarl/agents/agent_vault/dqn.py", line 309, in calc_target_f
    target = tf.add(tf.cast(reward,tf.float64),expectedQ)
  File "/home/ai/.local/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/ai/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 7164, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a double tensor but is a float tensor [Op:AddV2]
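
A note on why the cast did not change the message: the error names input #1 (zero-based), which is the second argument of tf.add, so the operand reported as a float tensor is expectedQ, while the reward was evidently already a double. The sketch below reproduces the mismatch with stand-in constants; the values and the helper add_or_error are illustrative assumptions, not code from dqn.py, and I do not know which direction of cast the eventual fix used.

```python
import tensorflow as tf

# Stand-ins for the two operands in calc_target_f (values are made up).
reward = tf.constant(1.0, dtype=tf.float64)      # input #0: already a double
expected_q = tf.constant(0.5, dtype=tf.float32)  # input #1: a float tensor


def add_or_error(a, b):
    """Return tf.add(a, b), or the exception raised by a dtype mismatch."""
    try:
        return tf.add(a, b)
    # The traceback above shows InvalidArgumentError; TypeError is caught too
    # in case a different TensorFlow version reports the mismatch differently.
    except (tf.errors.InvalidArgumentError, TypeError) as err:
        return err


# Casting the reward is a no-op here, so the mismatch on input #1 remains.
first = add_or_error(tf.cast(reward, tf.float64), expected_q)

# Casting the second operand makes both inputs float64.
fixed = tf.add(reward, tf.cast(expected_q, tf.float64))

if __name__ == "__main__":
    print(type(first).__name__)
    print(fixed.dtype, float(fixed))
```

Casting the reward down to tf.float32 instead would also make the dtypes agree; which side to cast depends on the precision the rest of the agent expects.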

rvinaybharadwaj commented 2 years ago

Let me check and get back to you.

rvinaybharadwaj commented 2 years ago

Fixed in https://github.com/exalearn/EXARL/pull/231