Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

Unable to repro dreamer_v3_100k_ms_pacman experiment reward results #228

Closed · geranim0 closed this issue 3 months ago

geranim0 commented 4 months ago

Hi,

I ran the default main branch on the dreamer_v3_100k_ms_pacman experiment (seed 5), but could not reproduce the advertised rewards.

Advertised curve:

pacman100k_dreamer_v3

When I run it locally with everything at its defaults: my_pacman100k_run

Wondering what could explain the difference?

Edit: Found out about deterministic mode, which is disabled by default. Will update with the deterministic run results once the run finishes.

Edit: Finished run:

image

michele-milesi commented 4 months ago

Hi @brodequin-slaps, thanks for reporting this. In the meantime, I will check that there are no reproducibility problems in our code.

P.S. We ran that experiment with cfg.torch_deterministic=False.

cc: @belerico
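
(For reference, a hedged example of flipping that flag from the command line, assuming cfg.torch_deterministic and seed are exposed as top-level Hydra overrides; the exact config layout may differ:)

python sheeprl.py \
exp=dreamer_v3_100k_ms_pacman \
seed=5 \
torch_deterministic=True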

belerico commented 4 months ago

Could it be the seed? I've also seen a large variance in Hafner's results on some environments with different seeds. Maybe we can try running another experiment with a different seed?
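
(A hedged sketch of a quick seed sweep from the shell, assuming seed is a top-level Hydra override as in the run above; the three seed values are arbitrary and only for illustration:)

for seed in 5 42 1337; do
  python sheeprl.py exp=dreamer_v3_100k_ms_pacman seed=$seed
done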

belerico commented 4 months ago

Also, which SheepRL version or commit are you using?

geranim0 commented 4 months ago

Hi @belerico ,

Using commit e8a68f33dac5684c2dc0659c31ff8999d58659c5

Since I'm using the same seed as upstream, it would make sense to me for my results to match those advertised (especially when run with cfg.torch_deterministic=True); that way, potential library users get early confidence that they can reproduce the advertised results. Maybe once my deterministic pacman run finishes (it's about a 2x slowdown and should finish tomorrow), someone could try it too with the same seed (5) to see whether it matches.

I will also try different seeds after that.

michele-milesi commented 4 months ago

Sure, I'll try to run some other experiments. In the meantime, can you share your PyTorch version and CUDA driver version with us?

Thanks

geranim0 commented 4 months ago

(.venv) sam@sam:~/dev/ml/sheeprl$ python -c "import torch; print(torch.__version__)"
2.2.1+cu121 # Torch

(.venv) sam@sam:~/dev/ml/sheeprl$ nvidia-smi
Wed Mar  6 08:36:26 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.07             Driver Version: 537.34       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1080        On  | 00000000:02:00.0  On |                  N/A |
| 49%   64C    P0             152W / 200W |   7696MiB /  8192MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

belerico commented 4 months ago

Hi @brodequin-slaps, can you try out this branch? You can try to run a reproducible experiment with the following settings:

python sheeprl.py \
exp=dreamer_v3_100k_ms_pacman \
fabric.devices=1 \
fabric.accelerator=cuda \
torch_use_deterministic_algorithms=True \
torch_backends_cudnn_benchmark=False \
torch_backends_cudnn_deterministic=True \
cublas_workspace_config=":4096:8"

where cublas_workspace_config=":4096:8" comes from here, while torch_use_deterministic_algorithms=True, torch_backends_cudnn_benchmark=False, and torch_backends_cudnn_deterministic=True come from the PyTorch reproducibility page.
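
(For reference, a minimal plain-PyTorch sketch of what those settings correspond to; this is the standard recipe from the PyTorch reproducibility docs, not necessarily the exact wiring inside SheepRL:)

import os
import random

import numpy as np
import torch

# Must be set before any cuBLAS kernels are launched
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

seed = 5
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)  # also seeds the CUDA RNGs

torch.use_deterministic_algorithms(True)   # error on ops without a deterministic implementation
torch.backends.cudnn.benchmark = False     # don't autotune convolution algorithms
torch.backends.cudnn.deterministic = True  # force deterministic cuDNN kernels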

I haven't tried this specifically with Dreamer-V3, but I've run some simple and fast experiments with PPO:

image

where:

P.S. the script has been run with:

geranim0 commented 3 months ago

Hi @belerico ,

I tried the fix/determinism branch with the deterministic command above, and locally the runs are deterministic:

image

However, they don't match your experiments; maybe something is different in our setups.
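
(In case it helps narrow down the setup difference, a quick check to compare environments, printing the PyTorch build, CUDA runtime, and GPU model:)

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.get_device_name(0))"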

belerico commented 3 months ago

Hi @brodequin-slaps, I don't think one can achieve perfect determinism across completely different hardware: https://discuss.pytorch.org/t/how-to-get-determistic-behavior-with-different-gpus/125640