Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

Can't run Dreamer algos #66

Closed: dtch1997 closed this issue 1 year ago

dtch1997 commented 1 year ago

Thanks for making this great repo.

I'm trying to run Dreamer-v2 and I can't get it to work.

Steps to reproduce:

System details:

Running PPO works.

Running Dreamer-v1 fails.

$ python sheeprl.py dreamer_v1 --env_id=CartPole-v1 --num_envs=1 --buffer_size=2 --per_rank_batch_size=1 --per_rank_sequence_length=1
/home/daniel/Documents/github/sheeprl/sheeprl/cli.py:23: UserWarning: This script was launched without the Lightning CLI. Consider to launch the script with `lightning run model ...` to scale it with Fabric
  warnings.warn(
You are using a CUDA device ('NVIDIA GeForce RTX 3070 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Global seed set to 42
Missing logger folder: logs/dreamer_v1/2023-08-02_08-11-14/CartPole-v1_default_42_1690960274
/home/daniel/Documents/github/sheeprl/sheeprl/algos/dreamer_v1/dreamer_v1.py:584: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:245.)
  rewards = torch.tensor([rewards]).view(args.num_envs, -1).float()
Traceback (most recent call last):
  File "/home/daniel/Documents/github/sheeprl/sheeprl.py", line 4, in <module>
    run()
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/daniel/Documents/github/sheeprl/sheeprl/cli.py", line 75, in wrapper
    command()
  File "/home/daniel/Documents/github/sheeprl/sheeprl/algos/dreamer_v1/dreamer_v1.py", line 594, in main
    rb.add(step_data[None, ...])
  File "/home/daniel/Documents/github/sheeprl/sheeprl/data/buffers.py", line 647, in add
    self._buf[env_idx].add(data[:, env_data_idx : env_data_idx + 1])
  File "/home/daniel/Documents/github/sheeprl/sheeprl/data/buffers.py", line 146, in add
    self._buf[idxes, :] = data_to_store
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/tensordict/tensordict.py", line 3133, in __setitem__
    self.set_at_(key, item, index)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/tensordict/tensordict.py", line 3808, in set_at_
    _set_item(tensor_in, idx, value)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/tensordict/utils.py", line 730, in _set_item
    tensor[index] = value
IndexError: tensors used as indices must be long, int, byte or bool tensors
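
For reference, PyTorch raises this IndexError whenever a tensor used as an index has a floating-point dtype. The snippet below is a minimal sketch of that behavior in isolation, not a reproduction of the sheeprl buffer code: the names `buf`, `value` and `float_idx` are hypothetical, and whether a float index is what actually reaches `buffers.py` here is an assumption.

```python
import torch

# Hypothetical stand-ins for a storage tensor and one row of step data.
buf = torch.zeros(4, 3)
value = torch.ones(1, 3)

# An index tensor with a floating-point dtype is rejected by PyTorch.
float_idx = torch.tensor([1.0])
try:
    buf[float_idx] = value
    index_error_raised = False
except IndexError:
    index_error_raised = True

# Casting the index to an integer dtype makes the same assignment valid.
buf[float_idx.long()] = value
```

If the cause is indeed a float index, casting it with `.long()` before the assignment would be one way to avoid the error.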

Running Dreamer-v2 also fails.

$ python sheeprl.py dreamer_v2 --env_id=CartPole-v1 --num_envs=1
/home/daniel/Documents/github/sheeprl/sheeprl/cli.py:23: UserWarning: This script was launched without the Lightning CLI. Consider to launch the script with `lightning run model ...` to scale it with Fabric
  warnings.warn(
You are using a CUDA device ('NVIDIA GeForce RTX 3070 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Global seed set to 42
Missing logger folder: logs/dreamer_v2/2023-08-02_08-09-04/CartPole-v1_default_42_1690960144
Traceback (most recent call last):
  File "/home/daniel/Documents/github/sheeprl/sheeprl.py", line 4, in <module>
    run()
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/daniel/Documents/github/sheeprl/sheeprl/cli.py", line 75, in wrapper
    command()
  File "/home/daniel/Documents/github/sheeprl/sheeprl/algos/dreamer_v2/dreamer_v2.py", line 486, in main
    raise RuntimeError(f"There must be at least one valid observation.")
RuntimeError: There must be at least one valid observation

Running the Dreamer-v1 tests works. (The first 6 at least; I didn't wait for the others to finish.)

$ python -m pytest tests/test_algos -k test_dreamer_v1
======================================== test session starts ========================================
platform linux -- Python 3.9.17, pytest-7.3.1, pluggy-1.2.0
rootdir: /home/daniel/Documents/github/sheeprl
configfile: pyproject.toml
plugins: anyio-3.7.1, timeout-2.1.0, cov-4.1.0
collected 117 items / 99 deselected / 18 selected                                                   

tests/test_algos/test_algos.py ......

Running the Dreamer-v2 tests also works. (Again, only the first 6.)

$ python -m pytest tests/test_algos/test_algos.py -k test_dreamer_v2 
======================================== test session starts ========================================
platform linux -- Python 3.9.17, pytest-7.3.1, pluggy-1.2.0
rootdir: /home/daniel/Documents/github/sheeprl
configfile: pyproject.toml
plugins: anyio-3.7.1, timeout-2.1.0, cov-4.1.0
collected 114 items / 96 deselected / 18 selected                                                   

tests/test_algos/test_algos.py ......

dtch1997 commented 1 year ago

After some further testing, I also can't run many of the other algorithms.

PPO continuous:

$ python sheeprl.py ppo_continuous --num_envs=1 --env_id=Walker2d-v4
/home/daniel/Documents/github/sheeprl/sheeprl/cli.py:23: UserWarning: This script was launched without the Lightning CLI. Consider to launch the script with `lightning run model ...` to scale it with Fabric
  warnings.warn(
You are using a CUDA device ('NVIDIA GeForce RTX 3070 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Global seed set to 42
Missing logger folder: logs/ppo_continuous/2023-08-02_08-25-25/Walker2d-v4_default_42_1690961125
Traceback (most recent call last):
  File "/home/daniel/Documents/github/sheeprl/sheeprl.py", line 4, in <module>
    run()
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/daniel/Documents/github/sheeprl/sheeprl/cli.py", line 75, in wrapper
    command()
  File "/home/daniel/Documents/github/sheeprl/sheeprl/algos/ppo_continuous/ppo_continuous.py", line 228, in main
    action, logprob, _ = actor.module(next_obs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/daniel/Documents/github/sheeprl/sheeprl/algos/ppo_continuous/agent.py", line 41, in forward
    x = self.model(obs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/daniel/Documents/github/sheeprl/sheeprl/models/models.py", line 118, in forward
    return self.model(obs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype

SAC:

$ python sheeprl.py sac --num_envs 1 --env_id Walker2d-v4
/home/daniel/Documents/github/sheeprl/sheeprl/cli.py:23: UserWarning: This script was launched without the Lightning CLI. Consider to launch the script with `lightning run model ...` to scale it with Fabric
  warnings.warn(
You are using a CUDA device ('NVIDIA GeForce RTX 3070 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Global seed set to 42
Missing logger folder: logs/sac/2023-08-02_08-24-39/Walker2d-v4_default_42_1690961079
/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/torchmetrics/utilities/prints.py:42: UserWarning: The ``compute`` method of metric MeanMetric was called before the ``update`` method which may lead to errors, as metric states have not yet been updated.
  warnings.warn(*args, **kwargs)  # noqa: B028
Traceback (most recent call last):
  File "/home/daniel/Documents/github/sheeprl/sheeprl.py", line 4, in <module>
    run()
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/daniel/Documents/github/sheeprl/sheeprl/cli.py", line 75, in wrapper
    command()
  File "/home/daniel/Documents/github/sheeprl/sheeprl/algos/sac/sac.py", line 251, in main
    rb.add(step_data.unsqueeze(0))
  File "/home/daniel/Documents/github/sheeprl/sheeprl/data/buffers.py", line 146, in add
    self._buf[idxes, :] = data_to_store
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/tensordict/tensordict.py", line 3133, in __setitem__
    self.set_at_(key, item, index)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/tensordict/tensordict.py", line 3808, in set_at_
    _set_item(tensor_in, idx, value)
  File "/home/daniel/anaconda3/envs/sheeprl/lib/python3.9/site-packages/tensordict/utils.py", line 730, in _set_item
    tensor[index] = value
RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Double for the source.
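
Both dtype errors above are consistent with float64 data meeting float32 tensors. The sketch below shows that mismatch and a cast that avoids it, assuming the environment returns float64 observations. The array `obs` and its shape are hypothetical stand-ins, and none of this is taken from the sheeprl code.

```python
import numpy as np
import torch

# Hypothetical stand-in for an environment observation; float64 is assumed.
obs = np.zeros((1, 17), dtype=np.float64)
layer = torch.nn.Linear(17, 4)  # PyTorch parameters default to float32

x = torch.from_numpy(obs)  # from_numpy keeps the NumPy dtype (float64)
try:
    layer(x)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# Casting the input to float32 lets the forward pass run.
out = layer(x.float())

# The same cast applies before writing into a float32 buffer via a tensor index.
buf = torch.zeros(4, 17)
idx = torch.tensor([0])
buf[idx] = x.to(buf.dtype)
```

Casting at the point where environment data is converted to tensors keeps the model and the buffer in a single dtype.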

belerico commented 1 year ago

Hi @dtch1997, I will look into them.

belerico commented 1 year ago

Hi @dtch1997, could you please try out this branch, which follows #65? In that branch, which will be merged soon, we have made it possible for almost every algorithm to encode both images and vectors coming from the environment, so you have to specify either --cnn_keys rgb, --mlp_keys state, or both. For more information on how to select the correct keys, have a look at https://github.com/Eclectic-Sheep/sheeprl/blob/feature/diambra/howto/select_observations.md