Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

`torch.compile` agents #261

Open belerico opened 2 months ago

belerico commented 2 months ago

Hi everyone, in this branch one can use torch.compile to compile the Dreamer-V3 agent. In particular:

These are the results I obtained:

image

The command I ran is:

python sheeprl.py exp=dreamer_v3 \
env=gym env.id=CartPole-v1 \
env.num_envs=4 \
fabric.accelerator=gpu \
fabric.precision=bf16-mixed \
algo=dreamer_v3_S \
algo.learning_starts=1024 \
algo.cnn_keys.encoder=\[\] \
algo.mlp_keys.encoder=\["vector"\] \
algo.cnn_keys.decoder=\[\] \
algo.mlp_keys.decoder=\["vector"\] \
algo.per_rank_sequence_length=64 \
algo.replay_ratio=0.5 \
algo.world_model.decoupled_rssm=False \
algo.world_model.learnable_initial_recurrent_state=False
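
For anyone launching this run from a script instead of a shell, here is a minimal sketch of building the same override list in Python. Passing the arguments as a list avoids having to escape the brackets; `shlex.join` is used only to print a copy-pasteable command. The variable names are illustrative, and the launch call is left commented out because it assumes `sheeprl.py` is in the working directory.

```python
import shlex

# Hydra overrides from the command above, written without shell escaping.
overrides = [
    "exp=dreamer_v3",
    "env=gym",
    "env.id=CartPole-v1",
    "env.num_envs=4",
    "fabric.accelerator=gpu",
    "fabric.precision=bf16-mixed",
    "algo=dreamer_v3_S",
    "algo.learning_starts=1024",
    "algo.cnn_keys.encoder=[]",
    "algo.mlp_keys.encoder=[vector]",
    "algo.cnn_keys.decoder=[]",
    "algo.mlp_keys.decoder=[vector]",
    "algo.per_rank_sequence_length=64",
    "algo.replay_ratio=0.5",
    "algo.world_model.decoupled_rssm=False",
    "algo.world_model.learnable_initial_recurrent_state=False",
]
command = ["python", "sheeprl.py", *overrides]

# shlex.join quotes only the arguments that need it (here, those with brackets).
printable = shlex.join(command)
print(printable)

# To actually launch (assumes sheeprl.py is in the working directory):
# import subprocess; subprocess.run(command, check=True)
```

The bracketed overrides come out single-quoted in the printed command, which is equivalent to the backslash escaping used above.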

My env is:

python -m torch.utils.collect_env
<frozen runpy>:128: RuntimeWarning: 'torch.utils.collect_env' found in sys.modules after import of package 'torch.utils', but prior to execution of 'torch.utils.collect_env'; this may result in unpredictable behaviour
Collecting environment information...
PyTorch version: 2.3.0.dev20240314+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Pop!_OS 22.04 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.11.8 (main, Feb 26 2024, 21:39:34) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.8.0-76060800daily20240311-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4060 Laptop GPU
Nvidia driver version: 550.67
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      39 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             16
On-line CPU(s) list:                0-15
Vendor ID:                          GenuineIntel
Model name:                         13th Gen Intel(R) Core(TM) i5-13500H
CPU family:                         6
Model:                              186
Thread(s) per core:                 2
Core(s) per socket:                 12
Socket(s):                          1
Stepping:                           2
CPU max MHz:                        4700.0000
CPU min MHz:                        400.0000
BogoMIPS:                           6374.40
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect user_shstk avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities
Virtualization:                     VT-x
L1d cache:                          448 KiB (12 instances)
L1i cache:                          640 KiB (12 instances)
L2 cache:                           9 MiB (6 instances)
L3 cache:                           18 MiB (1 instance)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-15
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Not affected
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Versions of relevant libraries:
[pip3] mypy==1.2.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] pytorch-lightning==2.2.1
[pip3] pytorch-triton==3.0.0+989adb9a29
[pip3] torch==2.3.0.dev20240314+cu121
[pip3] torchmetrics==1.3.2
[pip3] torchvision==0.18.0.dev20240314+cu121
[pip3] triton==2.2.0
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] pytorch-lightning         2.2.1                    pypi_0    pypi
[conda] pytorch-triton            3.0.0+989adb9a29          pypi_0    pypi
[conda] torch                     2.3.0.dev20240314+cu121          pypi_0    pypi
[conda] torchmetrics              1.3.2                    pypi_0    pypi
[conda] torchvision               0.18.0.dev20240314+cu121          pypi_0    pypi
[conda] triton                    2.2.0                    pypi_0    pypi

Everyone is welcome to test it out and run some experiments with Dreamer-V3 or any other algorithm (taking inspiration from the Dreamer-V3 agent).

We can keep this issue as a reference.

Thank you all!

belerico commented 2 months ago

I've run an experiment on a Lightning Studio with an A10G GPU with the following command:

python sheeprl.py exp=dreamer_v3_100k_ms_pacman fabric.devices=1 fabric.precision=32 fabric.accelerator=gpu

It ran in less than 6 hours, and I obtained the following results:

image

The test results on different seeds are:

seed 5: 1970
seed 1024: 1910
seed 42: 1760
seed 1337: 1680
seed 8: 1270
seed 2: 1310

with a 1650 +- 271.72 average reward.
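
As a quick check of the summary figure, the sketch below recomputes the average and spread from the per-seed rewards listed above. It assumes the reported spread is the population standard deviation (dividing by N), since that is the variant that matches 271.72; the sample standard deviation would be larger.

```python
from statistics import mean, pstdev

# Test rewards per seed, as listed above.
rewards = {5: 1970, 1024: 1910, 42: 1760, 1337: 1680, 8: 1270, 2: 1310}
values = list(rewards.values())

avg = mean(values)
# pstdev is the population standard deviation (divides by N, not N - 1).
std = pstdev(values)

print(f"{avg:.2f} +- {std:.2f}")  # prints "1650.00 +- 271.72"
```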

This is the graph reported by Hafner et al.

image

michele-milesi commented 2 months ago

I have run a walker walk training with the commit: d640a41cc9a959e28e8f052376fc5e808ee0da1d (ReLU as activation function). These are the configs that have been used for training, with the only difference being that the decoupled RSSM is used. The command used is the following:

python sheeprl.py exp=dreamer_v3_dmc_walker_walk algo.world_model.decoupled_rssm=True

The results are reported below:

ww-compiled

The green line is the experiment run with the compile (~26h 34min). The red and blue lines were run without compile (~38h 40min).
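
To put the two wall-clock times side by side, here is a small worked calculation. Both durations are the approximate figures quoted above, so the resulting ratio is only indicative; the helper name `to_minutes` is illustrative.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an hours/minutes duration to total minutes."""
    return hours * 60 + minutes

compiled = to_minutes(26, 34)   # run with compile
eager = to_minutes(38, 40)      # runs without compile

speedup = eager / compiled
time_saved = 1 - compiled / eager

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.0%}")
# prints "speedup: 1.46x, time saved: 31%"
```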

My env is:

$ python -m torch.utils.collect_env

Collecting environment information...
PyTorch version: 2.3.0.dev20240314+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Clang version: Could not collect
CMake version: version 3.27.7
Libc version: glibc-2.35

Python version: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.19.0-46-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080
Nvidia driver version: 535.54.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   39 bits physical, 48 bits virtual
Byte Order:                      Little Endian
CPU(s):                          12
On-line CPU(s) list:             0-11
Vendor ID:                       GenuineIntel
Model name:                      Intel(R) Core(TM) i7-8700T CPU @ 2.40GHz
CPU family:                      6
Model:                           158
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       1
Stepping:                        10
CPU max MHz:                     2400,0000
CPU min MHz:                     800,0000
BogoMIPS:                        4800.00
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
Virtualization:                  VT-x
L1d cache:                       192 KiB (6 instances)
L1i cache:                       192 KiB (6 instances)
L2 cache:                        1,5 MiB (6 instances)
L3 cache:                        12 MiB (1 instance)
NUMA node(s):                    1
NUMA node0 CPU(s):               0-11
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Retbleed:          Mitigation; IBRS
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Mitigation; Microcode
Vulnerability Tsx async abort:   Mitigation; TSX disabled

Versions of relevant libraries:
[pip3] mypy==1.2.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.3
[pip3] pytorch-lightning==2.1.3
[pip3] pytorch-triton==3.0.0+989adb9a29
[pip3] torch==2.3.0.dev20240314+cu121
[pip3] torch-tb-profiler==0.4.3
[pip3] torchmetrics==1.3.0
[pip3] torchvision==0.18.0.dev20240314+cu121
[pip3] triton==2.2.0
[conda] numpy                     1.26.3                   pypi_0    pypi
[conda] pytorch-lightning         2.1.3                    pypi_0    pypi
[conda] pytorch-triton            3.0.0+989adb9a29          pypi_0    pypi
[conda] torch                     2.3.0.dev20240314+cu121          pypi_0    pypi
[conda] torch-tb-profiler         0.4.3                    pypi_0    pypi
[conda] torchmetrics              1.3.0                    pypi_0    pypi
[conda] torchvision               0.18.0.dev20240314+cu121          pypi_0    pypi
[conda] triton                    2.2.0                    pypi_0    pypi

belerico commented 2 months ago

I've run an experiment on 4 A10G GPUs on a Lightning Studio with the following command:

python sheeprl.py exp=dreamer_v3_100k_ms_pacman fabric.devices=4 fabric.precision=32 fabric.accelerator=gpu

where I've manually changed the following:

It ran in less than 4 hours, and these are the results:

image

The test results on different seeds are:

seed 5: 1630
seed 1024: 1160
seed 42: 1330
seed 1337: 1650
seed 8: 1170
seed 2: 930

with a 1311.67 +- 259.77 average reward.

michele-milesi commented 2 months ago

I have run a walker walk training with the commit: fab9f4858dd14993ded7055773eff670ab363de2 (SiLU as activation function). These are the configs that have been used for training; this time I ran without the decoupled RSSM. The command used is the following:

python sheeprl.py exp=dreamer_v3_dmc_walker_walk

compile comparison

The green line is the compiled experiment described above (with ReLU, decoupled RSSM, and the d640a41cc9a959e28e8f052376fc5e808ee0da1d commit). The grey line is the new experiment.

My env is the same as the one reported in my previous comment above.

geranim0 commented 2 months ago

Hi,

Thanks for this. I was testing this out: disable: False works fine, but with disable: True for all compile configs I get this error:

Error executing job with overrides: ['exp=dreamer_v3', 'env=gym', 'env.id=CartPole-v1', 'env.num_envs=4', 'fabric.accelerator=gpu', 'fabric.precision=32-true', 'algo=dreamer_v3_S', 'algo.learning_starts=1024', 'algo.cnn_keys.encoder=[]', 'algo.mlp_keys.encoder=[vector]', 'algo.cnn_keys.decoder=[]', 'algo.mlp_keys.decoder=[vector]', 'algo.per_rank_sequence_length=64', 'algo.replay_ratio=0.5', 'algo.world_model.decoupled_rssm=False', 'algo.world_model.learnable_initial_recurrent_state=False']
Traceback (most recent call last):
  File "/home/sam/dev/sheeprl/sheeprl/cli.py", line 352, in run
    run_algorithm(cfg)
  File "/home/sam/dev/sheeprl/sheeprl/cli.py", line 190, in run_algorithm
    fabric.launch(reproducible(command), cfg, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 859, in launch
    return self._wrap_and_launch(function, self, *args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 945, in _wrap_and_launch
    return to_run(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 950, in _wrap_with_setup
    return to_run(*args, **kwargs)
  File "/home/sam/dev/sheeprl/sheeprl/cli.py", line 186, in wrapper
    return func(fabric, cfg, *args, **kwargs)
  File "/home/sam/dev/sheeprl/sheeprl/algos/dreamer_v3/dreamer_v3.py", line 758, in main
    train(
  File "/home/sam/dev/sheeprl/sheeprl/algos/dreamer_v3/dreamer_v3.py", line 341, in train
    policies: Sequence[Distribution] = actor(imagined_trajectories.detach())[1]
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/lightning/fabric/wrappers.py", line 139, in forward
    output = self._forward_module(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 387, in _fn
    return fn(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 977, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 411, in _convert_frame_assert
    return _compile(
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_utils_internal.py", line 70, in wrapper_function
    return function(*args, **kwargs)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 700, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 266, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 568, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1116, in transform_code_object
    transformations(instructions, code_options)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 173, in _fn
    return fn(*args, **kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 515, in transform
    tracer.run()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2237, in run
    super().run()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 875, in run
    while self.step():
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 790, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 492, in wrapper
    return inner_fn(self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 730, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 736, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2418, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2534, in inline_call_
    tracer.run()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 875, in run
    while self.step():
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 790, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 492, in wrapper
    return inner_fn(self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 730, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 736, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2418, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2534, in inline_call_
    tracer.run()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 875, in run
    while self.step():
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 790, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 492, in wrapper
    return inner_fn(self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 730, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 736, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2418, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2534, in inline_call_
    tracer.run()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 875, in run
    while self.step():
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 790, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 492, in wrapper
    return inner_fn(self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 730, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 736, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2418, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2534, in inline_call_
    tracer.run()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 875, in run
    while self.step():
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 790, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 492, in wrapper
    return inner_fn(self, inst)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 730, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 440, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/variables/base.py", line 294, in call_function
    unimplemented(f"call_function {self} {args} {kwargs}")
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 212, in unimplemented
    raise Unsupported(msg)
torch._dynamo.exc.Unsupported: call_function UserDefinedClassVariable(<class 'torch.Size'>) [SizeVariable()] {}

from user code:
   File "/home/sam/dev/sheeprl/sheeprl/algos/dreamer_v3/agent.py", line 838, in forward
    actions[i] = actions_dist[i].rsample()
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/distributions/one_hot_categorical.py", line 127, in rsample
    samples = self.sample(sample_shape)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/distributions/one_hot_categorical.py", line 95, in sample
    indices = self._categorical.sample(sample_shape)
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/distributions/categorical.py", line 133, in sample
    return samples_2d.reshape(self._extended_shape(sample_shape))
  File "/home/sam/dev/sheeprl/.venv/lib/python3.10/site-packages/torch/distributions/distribution.py", line 268, in _extended_shape
    return torch.Size(sample_shape + self._batch_shape + self._event_shape)

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
belerico commented 2 months ago

Hi @geranim0, you need to update PyTorch to a nightly build (torch>=2.3), where the one_hot issue has been fixed. Could you update your PyTorch version and tell us your results? Thank you.
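
A quick way to confirm that the installed build meets the torch>=2.3 requirement is sketched below. The helper name `torch_version_at_least` is hypothetical (it is not part of sheeprl or PyTorch), and it only compares the major and minor numbers, so nightly suffixes such as `.dev20240314+cu121` are ignored.

```python
import re


def torch_version_at_least(version: str, minimum: tuple[int, int]) -> bool:
    """Return True if a PyTorch version string is >= (major, minor).

    Only the leading "major.minor" is compared; suffixes such as
    ".dev20240418+cu121" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= minimum


# Assumed usage inside a training script (requires torch to be installed):
#   import torch
#   if not torch_version_at_least(torch.__version__, (2, 3)):
#       raise RuntimeError("torch>=2.3 is needed to compile the agent")
```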

geranim0 commented 2 months ago

Hi @belerico,

Yes, I did replace the stock torch with the nightly, which gives this config:

(.venv) sam@oldub:~/dev/sheeprl$ python -m torch.utils.collect_env
/usr/lib/python3.10/runpy.py:126: RuntimeWarning: 'torch.utils.collect_env' found in sys.modules after import of package 'torch.utils', but prior to execution of 'torch.utils.collect_env'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Collecting environment information...
PyTorch version: 2.4.0.dev20240418+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.5.0-27-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 2070
Nvidia driver version: 545.29.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      39 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             4
On-line CPU(s) list:                0-3
Vendor ID:                          GenuineIntel
Model name:                         Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
CPU family:                         6
Model:                              60
Thread(s) per core:                 1
Core(s) per socket:                 4
Socket(s):                          1
Stepping:                           3
CPU max MHz:                        3900.0000
CPU min MHz:                        800.0000
BogoMIPS:                           7000.00
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts vnmi md_clear flush_l1d
Virtualization:                     VT-x
L1d cache:                          128 KiB (4 instances)
L1i cache:                          128 KiB (4 instances)
L2 cache:                           1 MiB (4 instances)
L3 cache:                           6 MiB (1 instance)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-3
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        KVM: Mitigation: VMX disabled
Vulnerability L1tf:                 Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
Vulnerability Mds:                  Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown:             Mitigation; PTI
Vulnerability Mmio stale data:      Unknown: No mitigations
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                Mitigation; Microcode
Vulnerability Tsx async abort:      Not affected

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] pytorch-lightning==2.2.2
[pip3] pytorch-triton==3.0.0+989adb9a29
[pip3] torch==2.4.0.dev20240418+cu121
[pip3] torchaudio==2.2.0.dev20240418+cu121
[pip3] torchmetrics==1.3.2
[pip3] torchvision==0.19.0.dev20240418+cu121
[pip3] triton==2.2.0
[conda] Could not collect
defrag-bambino commented 1 week ago

I've just checked out the branch and tested it a bit using my own gym env with DreamerV3_S.

For a replay_ratio of 0.1 I am seeing a 3-4x speedup (dark blue): image

For a replay_ratio of 5.0 it is around 2x (light blue vs. dark red): image

In both cases, GPU utilization hovers around 60%:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04             Driver Version: 535.171.04   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0  On |                  Off |
| 30%   53C    P2             174W / 450W |   4525MiB / 24564MiB |     61%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1889      G   /usr/lib/xorg/Xorg                          183MiB |
|    0   N/A  N/A      2030      G   /usr/bin/gnome-shell                         44MiB |
|    0   N/A  N/A      5490      G   ...irefox/4451/usr/lib/firefox/firefox      291MiB |
|    0   N/A  N/A     35460      G   ...erProcess --variations-seed-version       61MiB |
|    0   N/A  N/A     72175      C   .../miniconda3/envs/sheeprl/bin/python     3844MiB |
+---------------------------------------------------------------------------------------+
My python env is:

PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.12 (main, Jul 5 2023, 18:54:27) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.5.0-35-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4090
Nvidia driver version: 535.171.04
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      48 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             24
On-line CPU(s) list:                0-23
Vendor ID:                          AuthenticAMD
Model name:                         AMD Ryzen 9 7900X 12-Core Processor
CPU family:                         25
Model:                              97
Thread(s) per core:                 2
Core(s) per socket:                 12
Socket(s):                          1
Stepping:                           2
CPU max MHz:                        5733,0000
CPU min MHz:                        400,0000
BogoMIPS:                           9400.12
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d
Virtualization:                     AMD-V
L1d cache:                          384 KiB (12 instances)
L1i cache:                          384 KiB (12 instances)
L2 cache:                           12 MiB (12 instances)
L3 cache:                           64 MiB (2 instances)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-23
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Not affected
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Vulnerable: Safe RET, no microcode
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] onnx==1.12.0
[pip3] pytorch-lightning==2.2.1
[pip3] torch==2.3.1
[pip3] torchmetrics==1.3.2
[pip3] torchvision==0.18.1
[pip3] triton==2.3.1
[conda] numpy 1.26.4 pypi_0 pypi
[conda] pytorch-lightning 2.2.1 pypi_0 pypi
[conda] torch 2.3.1 pypi_0 pypi
[conda] torchmetrics 1.3.2 pypi_0 pypi
[conda] torchvision 0.18.1 pypi_0 pypi
[conda] triton 2.3.1 pypi_0 pypi

defrag-bambino commented 1 week ago

I cannot sheeprl-eval my trained model, since the keys in the world model's state_dict have different names:

Stacktrace

Error executing job with overrides: ['checkpoint_path=/home/drt/Desktop/sheeprl/sheeprl/logs/runs/dreamer_v3/PyFlyt/2024-06-23_19-34-31_dreamer_v3_PyFlyt_42/version_0/checkpoint/ckpt_730000_0.ckpt', 'fabric.accelerator=gpu', 'env.capture_video=True', 'seed=52']
Traceback (most recent call last):
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/cli.py", line 404, in evaluation
    eval_algorithm(ckpt_cfg)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/cli.py", line 267, in eval_algorithm
    fabric.launch(command, cfg, state)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 839, in launch
    return self._wrap_and_launch(function, self, *args, **kwargs)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 925, in _wrap_and_launch
    return to_run(*args, **kwargs)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 930, in _wrap_with_setup
    return to_run(*args, **kwargs)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/cli.py", line 262, in wrapper
    return func(*args, **kwargs)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/algos/dreamer_v3/evaluate.py", line 47, in evaluate
    _, _, _, _, player = build_agent(
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/algos/dreamer_v3/agent.py", line 1186, in build_agent
    world_model.load_state_dict(world_model_state)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for WorldModel:
Missing key(s) in state_dict: "encoder.mlp_encoder.model._model.0.weight", "encoder.mlp_encoder.model._model.1.weight", "encoder.mlp_encoder.model._model.1.bias", 
"encoder.mlp_encoder.model._model.3.weight", "encoder.mlp_encoder.model._model.4.weight", "encoder.mlp_encoder.model._model.4.bias", "encoder.mlp_encoder.model._model.6.weight", "encoder.mlp_encoder.model._model.7.weight", "encoder.mlp_encoder.model._model.7.bias", "rssm.recurrent_model.mlp._model.0.weight", "rssm.recurrent_model.mlp._model.1.weight", "rssm.recurrent_model.mlp._model.1.bias", "rssm.recurrent_model.rnn.linear.weight", "rssm.recurrent_model.rnn.layer_norm.weight", "rssm.recurrent_model.rnn.layer_norm.bias", "rssm.representation_model._model.0.weight", "rssm.representation_model._model.1.weight", "rssm.representation_model._model.1.bias", "rssm.representation_model._model.3.weight", "rssm.representation_model._model.3.bias", "rssm.transition_model._model.0.weight", "rssm.transition_model._model.1.weight", "rssm.transition_model._model.1.bias", "rssm.transition_model._model.3.weight", "rssm.transition_model._model.3.bias", "observation_model.mlp_decoder.model._model.0.weight", "observation_model.mlp_decoder.model._model.1.weight", "observation_model.mlp_decoder.model._model.1.bias", "observation_model.mlp_decoder.model._model.3.weight", "observation_model.mlp_decoder.model._model.4.weight", "observation_model.mlp_decoder.model._model.4.bias", "observation_model.mlp_decoder.model._model.6.weight", "observation_model.mlp_decoder.model._model.7.weight", "observation_model.mlp_decoder.model._model.7.bias", "observation_model.mlp_decoder.heads.0.weight", "observation_model.mlp_decoder.heads.0.bias", "reward_model._model.0.weight", "reward_model._model.1.weight", "reward_model._model.1.bias", "reward_model._model.3.weight", "reward_model._model.4.weight", "reward_model._model.4.bias", "reward_model._model.6.weight", "reward_model._model.7.weight", "reward_model._model.7.bias", "reward_model._model.9.weight", "reward_model._model.9.bias". 
Unexpected key(s) in state_dict: "encoder._orig_mod.mlp_encoder.model._model.0.weight", "encoder._orig_mod.mlp_encoder.model._model.1.weight", "encoder._orig_mod.mlp_encoder.model._model.1.bias", "encoder._orig_mod.mlp_encoder.model._model.3.weight", "encoder._orig_mod.mlp_encoder.model._model.4.weight", "encoder._orig_mod.mlp_encoder.model._model.4.bias", "encoder._orig_mod.mlp_encoder.model._model.6.weight", "encoder._orig_mod.mlp_encoder.model._model.7.weight", "encoder._orig_mod.mlp_encoder.model._model.7.bias", "rssm.recurrent_model._orig_mod.mlp._model.0.weight", "rssm.recurrent_model._orig_mod.mlp._model.1.weight", "rssm.recurrent_model._orig_mod.mlp._model.1.bias", "rssm.recurrent_model._orig_mod.rnn.linear.weight", "rssm.recurrent_model._orig_mod.rnn.layer_norm.weight", "rssm.recurrent_model._orig_mod.rnn.layer_norm.bias", "rssm.representation_model._orig_mod._model.0.weight", "rssm.representation_model._orig_mod._model.1.weight", "rssm.representation_model._orig_mod._model.1.bias", "rssm.representation_model._orig_mod._model.3.weight", "rssm.representation_model._orig_mod._model.3.bias", "rssm.transition_model._orig_mod._model.0.weight", "rssm.transition_model._orig_mod._model.1.weight", "rssm.transition_model._orig_mod._model.1.bias", "rssm.transition_model._orig_mod._model.3.weight", "rssm.transition_model._orig_mod._model.3.bias", "observation_model._orig_mod.mlp_decoder.model._model.0.weight", "observation_model._orig_mod.mlp_decoder.model._model.1.weight", "observation_model._orig_mod.mlp_decoder.model._model.1.bias", "observation_model._orig_mod.mlp_decoder.model._model.3.weight", "observation_model._orig_mod.mlp_decoder.model._model.4.weight", "observation_model._orig_mod.mlp_decoder.model._model.4.bias", "observation_model._orig_mod.mlp_decoder.model._model.6.weight", "observation_model._orig_mod.mlp_decoder.model._model.7.weight", "observation_model._orig_mod.mlp_decoder.model._model.7.bias", 
"observation_model._orig_mod.mlp_decoder.heads.0.weight", "observation_model._orig_mod.mlp_decoder.heads.0.bias", "reward_model._orig_mod._model.0.weight", "reward_model._orig_mod._model.1.weight", "reward_model._orig_mod._model.1.bias", "reward_model._orig_mod._model.3.weight", "reward_model._orig_mod._model.4.weight", "reward_model._orig_mod._model.4.bias", "reward_model._orig_mod._model.6.weight", "reward_model._orig_mod._model.7.weight", "reward_model._orig_mod._model.7.bias", "reward_model._orig_mod._model.9.weight", "reward_model._orig_mod._model.9.bias".