The CQL documentation states that CQL supports "RNN, LSTM auto-wrapping, and autoreg", but the CQL trainer is a customization of SAC, which does not support these features. Creating the trainer fails with the traceback below.
(pid=3649625) 2021-04-13 14:30:35,046 INFO trainer.py:703 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=3649625) 2021-04-13 14:30:35,053 ERROR worker.py:395 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::CQL.__init__() (pid=3649625, ip=192.168.1.216)
(pid=3649625) File "python/ray/_raylet.pyx", line 505, in ray._raylet.execute_task
(pid=3649625) File "python/ray/_raylet.pyx", line 449, in ray._raylet.execute_task.function_executor
(pid=3649625) File "/path/to/ray/_private/function_manager.py", line 566, in actor_method_executor
(pid=3649625) return method(__ray_actor, *args, **kwargs)
(pid=3649625) File "/path/to/ray/rllib/agents/trainer_template.py", line 122, in __init__
(pid=3649625) Trainer.__init__(self, config, env, logger_creator)
(pid=3649625) File "/path/to/ray/rllib/agents/trainer.py", line 523, in __init__
(pid=3649625) super().__init__(config, logger_creator)
(pid=3649625) File "/path/to/ray/tune/trainable.py", line 98, in __init__
(pid=3649625) self.setup(copy.deepcopy(self.config))
(pid=3649625) File "/path/to/ray/rllib/agents/trainer.py", line 714, in setup
(pid=3649625) self._init(self.config, self.env_creator)
(pid=3649625) File "/path/to/ray/rllib/agents/trainer_template.py", line 154, in _init
(pid=3649625) num_workers=self.config["num_workers"])
(pid=3649625) File "/path/to/ray/rllib/agents/trainer.py", line 796, in _make_workers
(pid=3649625) logdir=self.logdir)
(pid=3649625) File "/path/to/ray/rllib/evaluation/worker_set.py", line 98, in __init__
(pid=3649625) spaces=spaces,
(pid=3649625) File "/path/to/ray/rllib/evaluation/worker_set.py", line 357, in _make_worker
(pid=3649625) spaces=spaces,
(pid=3649625) File "/path/to/ray/rllib/evaluation/rollout_worker.py", line 517, in __init__
(pid=3649625) policy_dict, policy_config)
(pid=3649625) File "/path/to/ray/rllib/evaluation/rollout_worker.py", line 1158, in _build_policy_map
(pid=3649625) policy_map[name] = cls(obs_space, act_space, merged_conf)
(pid=3649625) File "/path/to/ray/rllib/policy/policy_template.py", line 224, in __init__
(pid=3649625) self, obs_space, action_space, config)
(pid=3649625) File "/path/to/ray/rllib/agents/sac/sac_torch_policy.py", line 77, in build_sac_model_and_action_dist
(pid=3649625) model = build_sac_model(policy, obs_space, action_space, config)
(pid=3649625) File "/path/to/ray/rllib/agents/sac/sac_tf_policy.py", line 84, in build_sac_model
(pid=3649625) target_entropy=config["target_entropy"])
(pid=3649625) File "/path/to/ray/rllib/models/catalog.py", line 581, in get_model_v2
(pid=3649625) name, **model_kwargs)
(pid=3649625) TypeError: __init__() got an unexpected keyword argument 'policy_model_config'
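The final frame shows `get_model_v2` constructing a model with `name, **model_kwargs`, and the constructor rejecting `policy_model_config`. The following is a minimal sketch of that failure mode in plain Python, without Ray. The classes `SacStyleModel` and `WrapperModel` and the helper `make_model` are hypothetical stand-ins, not RLlib code, and the sketch assumes the rejected class is a wrapper whose `__init__` does not declare SAC's extra keyword arguments.

```python
class SacStyleModel:
    """Hypothetical model whose constructor accepts the SAC-specific kwarg."""

    def __init__(self, name, policy_model_config=None):
        self.name = name
        self.policy_model_config = policy_model_config


class WrapperModel:
    """Hypothetical wrapper whose constructor does not declare that kwarg."""

    def __init__(self, name):
        self.name = name


def make_model(model_cls, name, **model_kwargs):
    # Same call shape as the last traceback frame: cls(name, **model_kwargs).
    return model_cls(name, **model_kwargs)


# Keyword argument name taken from the TypeError in the traceback.
sac_kwargs = {"policy_model_config": {"fcnet_hiddens": [256, 256]}}

# A class that declares the kwarg is constructed without error.
ok_model = make_model(SacStyleModel, "sac_model", **sac_kwargs)

# A class that does not declare it raises the same kind of TypeError.
try:
    make_model(WrapperModel, "sac_model", **sac_kwargs)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If this assumption holds, the fix belongs either in the documentation (remove the claim) or in the model-construction path (do not forward SAC-only kwargs to a class that cannot accept them).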
If the code snippet cannot be run by itself, the issue will be closed with "needs-repro-script".
[x] I have verified my script runs in a clean environment and reproduces the issue.
[x] I have verified the issue also occurs with the latest wheels.
What is the problem?
Ray: Nightly
Reproduction (REQUIRED)
Error: