chrisgao99 opened 1 month ago
Hi @chrisgao99, thanks for asking. The author updated the algorithm a few months ago. Before those changes our repo was aligned with the original repo; now I have to check which changes he made. I will let you know as soon as possible.
Hello,
Previously, I got a good RL policy in a driving environment using the original Dreamer V3 repo:
Recently, I've been trying to reproduce it with SheepRL's Dreamer V3, but the results are much worse. So I want to make sure the SheepRL config corresponds to the original Dreamer config.
**replay_ratio & train_ratio** — In SheepRL, does `replay_ratio = 0.5` mean that the envs collectively gather 2 transitions before one batch is sampled from the buffer for training? In the original Dreamer V3, `train_ratio` is 32 by default and is used to compute `kwargs['samples_per_insert']`, which comes out to 0.5. Is this `samples_per_insert` equivalent to `replay_ratio`? The reference is `make_replay()` in `main.py`:
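To make the comparison concrete, here is a hedged sketch of the arithmetic I believe relates the two knobs. The `samples_per_insert = train_ratio / batch_length` formula and the `batch_length = 64` default are my assumptions about the original repo (they reproduce the 0.5 value quoted above); the definition of `replay_ratio` as gradient steps per collected transition is my reading of SheepRL — both should be verified against the code:

```python
# Assumption: the original DreamerV3 make_replay() derives samples_per_insert
# as train_ratio / batch_length (with batch_length = 64 by default).
train_ratio = 32
batch_length = 64
samples_per_insert = train_ratio / batch_length
print(samples_per_insert)  # -> 0.5

# Assumption: SheepRL's replay_ratio is gradient steps per env (policy) step,
# so replay_ratio = 0.5 means one training step every 2 collected transitions.
replay_ratio = 0.5
env_steps = 1000
gradient_steps = env_steps * replay_ratio
print(gradient_steps)  # -> 500.0
```

If both assumptions hold, the two quantities describe the same train/collect ratio, just parameterized differently.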
**buffer.size & replay.size** — Is the buffer size in SheepRL the same as the replay size in the original Dreamer V3?
**model size** — In the original Dreamer V3, I use the model size `size200m`:
In SheepRL the size parameters are different; for example, `dreamer_v3_XL`:
May I ask how they correspond to each other? My current understanding is `deter` → `recurrent_state_size`, `hidden` → `hidden_size`, `depth` → `cnn_channels_multiplier`, `units` → `dense_units`, `classes` → ?
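For reference, here is my current understanding of the name mapping written out as a small table. The right-hand names are assumptions from reading SheepRL's config, not confirmed by either repo; in particular, `classes` → `discrete_size` (classes per categorical latent) is a guess that needs verification:

```python
# Hypothetical mapping: original DreamerV3 size parameters -> SheepRL config
# keys. All right-hand names are assumptions to verify against both repos.
dreamer_to_sheeprl = {
    "deter":   "recurrent_state_size",    # RSSM deterministic state width
    "hidden":  "hidden_size",             # recurrent model hidden width
    "depth":   "cnn_channels_multiplier", # encoder/decoder channel multiplier
    "units":   "dense_units",             # MLP layer width
    "classes": "discrete_size",           # assumption: classes per categorical latent
}
for original, sheeprl in dreamer_to_sheeprl.items():
    print(f"{original:8s}-> {sheeprl}")
```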
**fabric.devices** — This is purely a SheepRL question about `fabric.devices`. If I have two GPUs, should I set `fabric.devices=2`? I tried setting `devices=2` but `fabric.world_size` is still 1, so what do `world_size` and `devices` mean?
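In standard Lightning Fabric semantics (my assumption is that SheepRL builds its `Fabric` object directly from the `fabric.*` config keys), `world_size` is the total number of spawned processes, i.e. `num_nodes * devices`; it only becomes greater than 1 once the processes are actually launched. A minimal sketch of that relationship:

```python
# Hedged sketch of Lightning Fabric's world_size semantics (assumption:
# world_size = num_nodes * devices once Fabric.launch() has spawned the
# worker processes). If world_size stays 1 with devices=2, the extra
# process was likely never launched.
def expected_world_size(num_nodes: int, devices_per_node: int) -> int:
    """Total processes Fabric should report after launch."""
    return num_nodes * devices_per_node

print(expected_world_size(num_nodes=1, devices_per_node=2))  # -> 2
```

So seeing `world_size == 1` with `devices=2` usually means the run is still single-process; how SheepRL's entry point triggers the multi-process launch is worth checking in its docs.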