Feature/dreamer v3

Summary

This PR introduces:

Dreamer-V3 algorithm from https://arxiv.org/abs/2301.04104
RestartOnException environment wrapper, which restarts the environment whenever an exception is raised during step or reset
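To illustrate the restart-on-exception idea described above, here is a minimal sketch. It is not the code added in this PR. It assumes a Gymnasium-style interface (reset returns (obs, info), step returns (obs, reward, terminated, truncated, info)), and the env_fn, max_restarts and wait parameters, as well as the restart_on_exception info key, are hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable


class RestartOnException:
    """Illustrative sketch: rebuild the wrapped env when step/reset raises.

    Assumes a Gymnasium-style env; all parameter names are hypothetical.
    """

    def __init__(self, env_fn: Callable[[], Any], max_restarts: int = 3, wait: float = 0.0):
        self._env_fn = env_fn              # factory used to (re)create the environment
        self._max_restarts = max_restarts  # give up after this many restarts
        self._wait = wait                  # seconds to sleep before recreating the env
        self._restarts = 0
        self.env = env_fn()

    def _restart(self, error: Exception) -> None:
        self._restarts += 1
        if self._restarts > self._max_restarts:
            raise RuntimeError("too many environment restarts") from error
        try:
            self.env.close()  # best effort: the broken env may fail to close
        except Exception:
            pass
        time.sleep(self._wait)
        self.env = self._env_fn()

    def reset(self, **kwargs):
        # Keep recreating the env until reset succeeds or the restart budget runs out.
        while True:
            try:
                return self.env.reset(**kwargs)
            except Exception as error:
                self._restart(error)

    def step(self, action):
        try:
            return self.env.step(action)
        except Exception as error:
            self._restart(error)
            obs, info = self.reset()
            info = dict(info, restart_on_exception=True)
            # Report the broken episode as truncated, with zero reward.
            return obs, 0.0, False, True, info
```

Flagging the restart in info and marking the step as truncated is one possible design choice: it lets the training loop close the interrupted episode in the replay buffer instead of silently stitching two episodes together.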

Type of Change

Please select the one relevant option below:

New feature (non-breaking change that adds functionality)

Checklist

Please confirm that the following tasks have been completed:

[x] I have tested my changes locally and they work as expected. (Please describe the tests you performed.)
[x] I have added unit tests for my changes, or updated existing tests if necessary.
[x] I have updated the documentation, if applicable.
[x] I have installed pre-commit and run locally for my code changes.

Screenshots or Visuals (Optional)

The following image shows the rewards obtained while interacting with the environment:

The command to replicate is:

lightning run model --precision=32 --devices=1 sheeprl.py dreamer_v3 --total_steps=100000 --learning_starts=1024 --pretrain_steps=1 --train_every=1 --buffer_size=1000000 --memmap_buffer=True --max_episode_steps=108000 --per_rank_batch_size=16 --checkpoint_every=2000 --env_id=MsPacmanNoFrameskip-v4 --seed=5 --cnn_channels_multiplier=32 --dense_units=512 --hidden_size=512 --mlp_layers=2 --recurrent_state_size=512 --checkpoint_buffer=True --action_repeat=4 --cnn_keys rgb --num_envs=1 --capture_video=True --per_rank_sequence_length=64

Thank you for your contribution! Once you have filled out this template, please ensure that you have assigned the appropriate reviewers and that all tests have passed.

Eclectic-Sheep / sheeprl

Feature/dreamer v3 #70
