ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[rllib] AttributeError: 'list' object has no attribute 'float', when using dreamer #14107

Closed: Sangboom closed this issue 1 year ago

Sangboom commented 3 years ago

What is the problem?

Ray version: 2.0.0.dev0 Python version: 3.8.5 OS: Ubuntu 20.04 Pytorch: 1.7.1

I'm the one who opened the issue about importing DREAMERTrainer (https://github.com/ray-project/ray/issues/13551#issue-788966521), and now I have a problem using Dreamer. After fixing the DREAMERTrainer import error, I tried to run Dreamer on my custom environment, but it didn't work. I then tested with the dreamer config from the ray-project/rl-experiments repository, and the same error (AttributeError: 'list' object has no attribute 'float') occurs. I want to know whether this is a problem in the module itself, and if not, I would appreciate example code showing how to use it.

Thank you

Reproduction (REQUIRED)

rllib train -f dreamer/dreamer-deepmind-control.yaml

Traceback (most recent call last):
  File "/home/sangbeom/ray/python/ray/tune/trial_runner.py", line 678, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/sangbeom/ray/python/ray/tune/ray_trial_executor.py", line 610, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/sangbeom/ray/python/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/home/sangbeom/ray/python/ray/worker.py", line 1458, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::Dreamer.train_buffered() (pid=114452, ip=172.27.183.141)
  File "/home/sangbeom/ray/python/ray/rllib/utils/threading.py", line 21, in wrapper
    return func(self, *a, **k)
  File "/home/sangbeom/ray/python/ray/rllib/policy/torch_policy.py", line 281, in _compute_action_helper
    torch.exp(logp.float())
AttributeError: 'list' object has no attribute 'float'

During handling of the above exception, another exception occurred:

ray::Dreamer.train_buffered() (pid=114452, ip=172.27.183.141)
  File "python/ray/_raylet.pyx", line 439, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 473, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 476, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
  File "/home/sangbeom/ray/python/ray/rllib/agents/trainer_template.py", line 107, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/sangbeom/ray/python/ray/rllib/agents/trainer.py", line 486, in __init__
    super().__init__(config, logger_creator)
  File "/home/sangbeom/ray/python/ray/tune/trainable.py", line 97, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/sangbeom/ray/python/ray/rllib/agents/trainer.py", line 654, in setup
    self._init(self.config, self.env_creator)
  File "/home/sangbeom/ray/python/ray/rllib/agents/trainer_template.py", line 134, in _init
    self.workers = self._make_workers(
  File "/home/sangbeom/ray/python/ray/rllib/agents/trainer.py", line 725, in _make_workers
    return WorkerSet(
  File "/home/sangbeom/ray/python/ray/rllib/evaluation/worker_set.py", line 90, in __init__
    self._local_worker = self._make_worker(
  File "/home/sangbeom/ray/python/ray/rllib/evaluation/worker_set.py", line 321, in _make_worker
    worker = cls(
  File "/home/sangbeom/ray/python/ray/rllib/evaluation/rollout_worker.py", line 479, in __init__
    self.policy_map, self.preprocessors = self._build_policy_map(
  File "/home/sangbeom/ray/python/ray/rllib/evaluation/rollout_worker.py", line 1111, in _build_policy_map
    policy_map[name] = cls(obs_space, act_space, merged_conf)
  File "/home/sangbeom/ray/python/ray/rllib/policy/policy_template.py", line 266, in __init__
    self._initialize_loss_from_dummy_batch(
  File "/home/sangbeom/ray/python/ray/rllib/policy/policy.py", line 622, in _initialize_loss_from_dummy_batch
    self.compute_actions_from_input_dict(input_dict, explore=False)
  File "/home/sangbeom/ray/python/ray/rllib/policy/torch_policy.py", line 207, in compute_actions_from_input_dict
    return self._compute_action_helper(input_dict, state_batches,
  File "/home/sangbeom/ray/python/ray/rllib/utils/threading.py", line 23, in wrapper
    raise AttributeError(
AttributeError: Object <ray.rllib.policy.policy_template.DreamerTorchPolicy object at 0x7efdd1c7e730> must have a self._lock property (assigned to a threading.Lock() object in its constructor)!

ian-cannon commented 3 years ago

I am seeing the same issue with a different system configuration.

What is the problem?

Ray version: 1.2.0 Python version: 3.7.10 OS: Ubuntu 18.04 Pytorch: 1.7.0

Reproduction (REQUIRED)

rllib train -f dreamer/dreamer-deepmind-control.yaml

AttributeError: Object <ray.rllib.policy.policy_template.DreamerTorchPolicy object at 0x7f87d82a0b50> must have a `self._lock` property (assigned to a threading.Lock() object in its constructor)!

kifarid commented 3 years ago

Same error.

Ray version: 1.2.0 Python version: 3.8.8 OS: macOS Big Sur 11.2.3 Pytorch: 1.8.0

Reproduction (REQUIRED)

rllib train -f dreamer/dreamer-deepmind-control.yaml

2021-03-11 21:13:48,422 ERROR trial_runner.py:616 -- Trial DREAMER_ray.rllib.examples.env.dm_control_suite.cheetah_run_dc967_00001: Error processing event.
Traceback (most recent call last):
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 586, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 609, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/worker.py", line 1456, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::Dreamer.train_buffered() (pid=21363, ip=192.168.1.5)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/utils/threading.py", line 21, in wrapper
    return func(self, *a, **k)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/policy/torch_policy.py", line 281, in _compute_action_helper
    torch.exp(logp.float())
AttributeError: 'list' object has no attribute 'float'

During handling of the above exception, another exception occurred:

ray::Dreamer.train_buffered() (pid=21363, ip=192.168.1.5)
  File "python/ray/_raylet.pyx", line 439, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 473, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 476, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 107, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 486, in __init__
    super().__init__(config, logger_creator)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/tune/trainable.py", line 97, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 654, in setup
    self._init(self.config, self.env_creator)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 134, in _init
    self.workers = self._make_workers(
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 725, in _make_workers
    return WorkerSet(
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 90, in __init__
    self._local_worker = self._make_worker(
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 321, in _make_worker
    worker = cls(
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 479, in __init__
    self.policy_map, self.preprocessors = self._build_policy_map(
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1111, in _build_policy_map
    policy_map[name] = cls(obs_space, act_space, merged_conf)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/policy/policy_template.py", line 266, in __init__
    self._initialize_loss_from_dummy_batch(
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/policy/policy.py", line 622, in _initialize_loss_from_dummy_batch
    self.compute_actions_from_input_dict(input_dict, explore=False)
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/policy/torch_policy.py", line 207, in compute_actions_from_input_dict
    return self._compute_action_helper(input_dict, state_batches,
  File "/Users/kfarid/.conda/envs/hyperdreamer/lib/python3.8/site-packages/ray/rllib/utils/threading.py", line 23, in wrapper
    raise AttributeError(
AttributeError: Object <ray.rllib.policy.policy_template.DreamerTorchPolicy object at 0x7f9bffb83c70> must have a self._lock property (assigned to a threading.Lock() object in its constructor)!

ian-cannon commented 3 years ago

I have a fix for this. It seems that logp is being set as a list in dreamer_torch_policy.py. By setting logp to a tensor, it has the .float() method that torch_policy is looking for:

logp = torch.tensor([0.0])

I can submit a PR for this.
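
For reference, a minimal sketch of the type error and why the one-line change avoids it (the values are illustrative, not the actual DreamerTorchPolicy code):

import torch

# torch_policy.py eventually calls logp.float() on whatever the action sampler returns.
logp = [0.0]
# torch.exp(logp.float())       # AttributeError: 'list' object has no attribute 'float'

logp = torch.tensor([0.0])      # proposed fix: a tensor exposes .float()
print(torch.exp(logp.float()))  # tensor([1.])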

Sangboom commented 3 years ago

I've tried something similar before, changing logp to a tensor. That resolves the .float() AttributeError. However, doesn't that cause problems when using other algorithms like DDPG?

ian-cannon commented 3 years ago

It actually causes more problems even without changing the algorithm. It looks like the Model's observe function expects the state and action tensors to have different dimensions than they actually do:

embed = embed.permute(1, 0, 2)
action = action.permute(1, 0, 2)

but both tensors have only 2 dimensions here. Changing this to permute(1, 0) allows it to continue for a while, but does not remedy the problem either, as it then tries to cat prev_state[2] with the previous action, which also have a different number of dimensions. I think something is messed up farther up the pipe.
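
A small sketch of the dimension mismatch described above (the shapes are made up purely for illustration):

import torch

embed = torch.zeros(4, 10, 1024)   # a 3-D (batch, time, embed) tensor permutes fine
embed = embed.permute(1, 0, 2)

action = torch.zeros(4, 6)         # but a 2-D (batch, action) tensor does not
# action.permute(1, 0, 2)          # RuntimeError: number of dims don't match in permute
action = action.permute(1, 0)      # the 2-D workaround mentioned above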

kifarid commented 3 years ago

I guess the problem here is with how the ViewRequirement objects are set up in the policy, as they give no indication that the Dreamer model requires a time axis in observations, actions, etc.

The call to the self._initialize_loss_from_dummy_batch function creates a dummy batch according to the ViewRequirement objects, which define the exact conditions by which each column should be populated with data.
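
As a hypothetical illustration (not the actual Dreamer policy code), a ViewRequirement with a range shift is what tells RLlib to build a column with a time axis:

from gym.spaces import Box
import numpy as np
from ray.rllib.policy.view_requirement import ViewRequirement

obs_space = Box(-1.0, 1.0, shape=(64, 64, 3), dtype=np.float32)
view_requirements = {
    # hypothetical: request the last 50 observations as one column, so the dummy
    # batch gets a time dimension instead of a single flat observation
    "prev_n_obs": ViewRequirement(
        data_col="obs",
        shift="-50:-1",
        space=obs_space,
    ),
}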

EloyAnguiano commented 3 years ago

Still having the same issue. Is there any version with a fixed Dreamer algorithm?