Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

AttributeError: 'NoneType' object has no attribute 'cls' #254

Closed LucaVendruscolo closed 3 months ago

LucaVendruscolo commented 3 months ago

Hello. I have been getting the error "AttributeError: 'NoneType' object has no attribute 'cls'" every time I try to run DreamerV3 with:

`python sheeprl.py exp=dreamer_v3 env=BallGame algo.mlp_keys.encoder=[position,QR_position] algo.mlp_keys.decoder=[position,QR_position] algo.cnn_keys.encoder=[] algo.cnn_keys.decoder=[]`

The full error output, my environment code, and config are below:

The error message ```shell (RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl>python sheeprl.py exp=dreamer_v3 env=BallGame algo.mlp_keys.encoder=[position,QR_position] algo.mlp_keys.decoder=[position,QR_position] algo.cnn_keys.encoder=[] algo.cnn_keys.decoder=[] CONFIG ├── algo │ └── name: dreamer_v3 │ total_steps: 5000000 │ per_rank_batch_size: 16 │ run_test: true │ cnn_keys: │ encoder: [] │ decoder: [] │ mlp_keys: │ encoder: │ - position │ - QR_position │ decoder: │ - position │ - QR_position │ world_model: │ optimizer: │ _target_: torch.optim.Adam │ lr: 0.0001 │ eps: 1.0e-08 │ weight_decay: 0 │ betas: │ - 0.9 │ - 0.999 │ discrete_size: 32 │ stochastic_size: 32 │ kl_dynamic: 0.5 │ kl_representation: 0.1 │ kl_free_nats: 1.0 │ kl_regularizer: 1.0 │ continue_scale_factor: 1.0 │ clip_gradients: 1000.0 │ encoder: │ cnn_channels_multiplier: 32 │ cnn_act: torch.nn.SiLU │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ recurrent_model: │ recurrent_state_size: 512 │ layer_norm: true │ dense_units: 512 │ transition_model: │ hidden_size: 512 │ dense_act: torch.nn.SiLU │ layer_norm: true │ representation_model: │ hidden_size: 512 │ dense_act: torch.nn.SiLU │ layer_norm: true │ observation_model: │ cnn_channels_multiplier: 32 │ cnn_act: torch.nn.SiLU │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ reward_model: │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ bins: 255 │ discount_model: │ learnable: true │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ actor: │ optimizer: │ _target_: torch.optim.Adam │ lr: 8.0e-05 │ eps: 1.0e-05 │ weight_decay: 0 │ betas: │ - 0.9 │ - 0.999 │ cls: sheeprl.algos.dreamer_v3.agent.Actor │ ent_coef: 0.0003 │ min_std: 0.1 │ init_std: 0.0 │ objective_mix: 1.0 │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ clip_gradients: 100.0 │ expl_amount: 0.0 │ expl_min: 0.0 │ expl_decay: false │ max_step_expl_decay: 0 │ moments: │ decay: 0.99 │ max: 1.0 │ percentile: │ low: 0.05 │ high: 0.95 │ critic: │ optimizer: │ _target_: torch.optim.Adam │ lr: 8.0e-05 │ eps: 1.0e-05 │ weight_decay: 0 │ betas: │ - 0.9 │ - 0.999 │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ target_network_update_freq: 1 │ tau: 0.02 │ bins: 255 │ clip_gradients: 100.0 │ gamma: 0.996996996996997 │ lmbda: 0.95 │ horizon: 15 │ train_every: 16 │ learning_starts: 20000 │ per_rank_pretrain_steps: 1 │ per_rank_gradient_steps: 1 │ per_rank_sequence_length: 64 │ layer_norm: true │ dense_units: 512 │ mlp_layers: 2 │ dense_act: torch.nn.SiLU │ cnn_act: torch.nn.SiLU │ unimix: 0.01 │ hafner_initialization: true │ decoupled_rssm: false │ player: │ discrete_size: 32 │ replay_ratio: 1 │ ├── buffer │ └── size: 1000000 │ memmap: true │ validate_args: false │ from_numpy: false │ checkpoint: false │ ├── checkpoint │ └── every: 100000 │ resume_from: null │ save_last: true │ keep_last: 5 │ ├── env │ └── id: BallGame │ num_envs: 1 │ frame_stack: 1 │ sync_env: false │ screen_size: 64 │ action_repeat: 1 │ grayscale: false │ clip_rewards: false │ capture_video: false │ frame_stack_dilation: 1 │ max_episode_steps: null │ reward_as_observation: false │ wrapper: │ _target_: sheeprl.envs.BallGame.BallEnv │ render_mode: rgb_array │ ├── fabric │ └── _target_: lightning.fabric.Fabric │ devices: 1 │ num_nodes: 1 │ strategy: auto │ accelerator: cpu │ precision: 32-true │ callbacks: │ - _target_: sheeprl.utils.callback.CheckpointCallback 
│ keep_last: 5 │ └── metric └── log_every: 5000 disable_timer: false log_level: 1 sync_on_compute: false aggregator: _target_: sheeprl.utils.metric.MetricAggregator raise_on_missing: false metrics: Rewards/rew_avg: _target_: torchmetrics.MeanMetric sync_on_compute: false Game/ep_len_avg: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/world_model_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/value_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/policy_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/observation_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/reward_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/state_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/continue_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false State/kl: _target_: torchmetrics.MeanMetric sync_on_compute: false State/post_entropy: _target_: torchmetrics.MeanMetric sync_on_compute: false State/prior_entropy: _target_: torchmetrics.MeanMetric sync_on_compute: false Grads/world_model: _target_: torchmetrics.MeanMetric sync_on_compute: false Grads/actor: _target_: torchmetrics.MeanMetric sync_on_compute: false Grads/critic: _target_: torchmetrics.MeanMetric sync_on_compute: false logger: _target_: lightning.fabric.loggers.TensorBoardLogger name: 2024-04-03_21-58-34_dreamer_v3_BallGame_42 root_dir: logs/runs/dreamer_v3/BallGame version: null default_hp_metric: true prefix: '' sub_dir: null Seed set to 42 C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\utils\logger.py:22: UserWarning: The specified root directory for the TensorBoardLogger is different from the experiment one, so the logger one will be ignored and replaced with the experiment root directory warnings.warn( Log dir: logs\runs\dreamer_v3/BallGame\2024-04-03_21-58-34_dreamer_v3_BallGame_42\version_0 Connected to Arduino on COM3 [644.3554 403.0136 673.92035 432.0973 ] [521.46893 667.7571 550.81024 694.71533] OBSERVATION {'position': array([536.1395874 , 681.23620605]), 'QR_position': array([659.13787842, 417.55545044])} Connected to Arduino on COM3 [644.3524 403.04965 673.46185 431.46362] [521.156 668.4098 551.45807 693.8184 ] OBSERVATION {'position': array([536.30700684, 681.11413574]), 'QR_position': array([658.90710449, 417.25665283])} Encoder CNN keys: [] Encoder MLP keys: ['position', 'QR_position'] Decoder CNN keys: [] Decoder MLP keys: ['position', 'QR_position'] Error executing job with overrides: ['exp=dreamer_v3', 'env=BallGame', 'algo.mlp_keys.encoder=[position,QR_position]', 'algo.mlp_keys.decoder=[position,QR_position]', 'algo.cnn_keys.encoder=[]', 'algo.cnn_keys.decoder=[]'] Traceback (most recent call last): File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 352, in run run_algorithm(cfg) File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 190, in run_algorithm fabric.launch(reproducible(command), cfg, **kwargs) File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 839, in launch return self._wrap_and_launch(function, self, *args, **kwargs) File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 925, in _wrap_and_launch return to_run(*args, **kwargs) File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 930, in _wrap_with_setup return to_run(*args, **kwargs) File 
"C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 186, in wrapper return func(fabric, cfg, *args, **kwargs) File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\algos\dreamer_v3\dreamer_v3.py", line 432, in main world_model, actor, critic, target_critic = build_agent( File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\algos\dreamer_v3\agent.py", line 998, in build_agent layer_norm_cls=hydra.utils.get_class(world_model_cfg.encoder._mlp_layer_norm.cls), AttributeError: 'NoneType' object has no attribute 'cls' Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace. ```
The gym environment ```python import gymnasium as gym from gymnasium import spaces import numpy as np import matplotlib.pyplot as plt from sheeprl.envs import BallDetection from sheeprl.envs import RewardCalculation import serial import time import cv2 try: arduino = serial.Serial('COM3', 9600) time.sleep(2) print("Connected to Arduino on COM3") except serial.SerialException as e: print(f"Failed to connect to Arduino on COM3: {e}") class BallEnv(gym.Env): metadata = {'render.modes': ['human', 'rgb_array']} def __init__(self, render_mode=None): super().__init__() self.screen_width = 640 self.screen_height = 480 self.cap, self.model, self.device= BallDetection.StartModelAndCap() self.action_space = gym.spaces.Box(low=np.array([-1, -1]), high=np.array([1, 1]), dtype=np.float32) self.observation_space = gym.spaces.Dict({ "position": gym.spaces.Box(low=np.array([0, 0]), high=np.array([self.screen_width, self.screen_height]), dtype=np.float32), "QR_position": gym.spaces.Box(low=np.array([0, 0]), high=np.array([self.screen_width, self.screen_height]), dtype=np.float32), }) self.lastKnownBallPosition = np.array([0, 0]) self.fail_counter = 0 self.totalReward = 0 self.lastKnownQRPosition = np.array([0, 0]) self.RewardForGames = [] self.waypoints = [[566, 650], [787, 650], [844, 580], [900, 650], [970, 650], [1050, 560], [1060, 500], [950, 400], [1035, 320], [950, 270], [950, 170], [1040, 110], [860, 85], [700, 180], [700, 320], [873, 350], [870, 460], [720, 470], [660, 560], [540, 380], [630, 240], [500, 90], [310, 100], [460, 280], [415, 615], [300, 600], [300, 350]] self.lastKnownProgress = 0.0 self.reset() def step(self, action): low = -60 high = 60 motor1_deg, motor2_deg = low + (0.5 * (action + 1.0) * (high - low)) self.move_motors(motor1_deg, motor2_deg) done = False start_time = time.time() detected_in_this_step = False while time.time() - start_time < 0.4: ret, frame = self.cap.read() if not ret: raise ValueError("Failed to capture frame from the camera.") qr_position, ball_position = BallDetection.geoCoordinates(frame, self.model, self.device) if ball_position is not None: x1, y1, x2, y2 = ball_position self.lastKnownBallPosition = np.array([(x1 + x2) / 2, (y1 + y2) / 2]) detected_in_this_step = True self.fail_counter = 0 if qr_position is not None: x1, y1, x2, y2 = qr_position self.lastKnownQRPosition = np.array([(x1 + x2) / 2, (y1 + y2) / 2]) else: time.sleep(0.01) self.renderFrame(frame) if not detected_in_this_step: self.fail_counter += 1 else: self.fail_counter = 0 if self.fail_counter >= 3: self.RewardForGames.append(self.totalReward) self.totalReward = 0 done = True reward = 0 progress = RewardCalculation.calculate_progress_percentage(self.waypoints, self.lastKnownBallPosition) if self.lastKnownProgress 0: traversed_path_np = np.array(self.waypoints[:progress_index], np.int32) traversed_path_np = traversed_path_np.reshape((-1, 1, 2)) cv2.polylines(frame, [traversed_path_np], isClosed=False, color=(0, 255, 0), thickness=2) cv2.imshow("Frame with Path Overlay", frame) cv2.waitKey(1) def close(self): plt.close() if arduino is not None: arduino.close() def move_motors(self, motor1_deg, motor2_deg): command = f"Move({motor1_deg},{motor2_deg})\n" arduino.write(command.encode()) '''
BallGame.yaml

```yaml
defaults:
  - default
  - _self_

id: BallGame
action_repeat: 1
capture_video: False
reward_as_observation: False
num_envs: 1
wrapper:
  _target_: sheeprl.envs.BallGame.BallEnv
  render_mode: rgb_array
```
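
For context on the `wrapper` entry: the `_target_` key is a dotted path that Hydra instantiates, with the remaining keys passed as keyword arguments. A minimal sketch of that mechanism (illustrative only, using a built-in Gymnasium env instead of the custom BallEnv, and not necessarily the exact sheeprl internals):

```python
import hydra
from omegaconf import OmegaConf

# Stand-in for the wrapper node above; gymnasium.make replaces
# sheeprl.envs.BallGame.BallEnv so the example runs on its own.
node = OmegaConf.create(
    {"_target_": "gymnasium.make", "id": "CartPole-v1", "render_mode": "rgb_array"}
)

env = hydra.utils.instantiate(node)  # calls gymnasium.make(id=..., render_mode=...)
obs, info = env.reset()
print(type(env), obs.shape)
```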

Link to my full sheeprl folder: https://drive.google.com/file/d/1CqTeSRQSl5cfijf9riWyiMHAU2OXXYjB/view?usp=sharing

michele-milesi commented 3 months ago

I think the problem is in this line: https://github.com/Eclectic-Sheep/sheeprl/blob/0da0194e4c6154ab4496737b6c8d05fdc9e02e14/sheeprl/algos/dreamer_v3/agent.py#L998. Can you try changing

`layer_norm_cls=hydra.utils.get_class(world_model_cfg.encoder._mlp_layer_norm.cls),`

to

`layer_norm_cls=hydra.utils.get_class(world_model_cfg.encoder.mlp_layer_norm.cls),`

I will fix this as soon as possible.
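
For background on why the error message looks the way it does: `hydra.utils.get_class` just resolves a dotted-path string to a class, so the `AttributeError` means the config node being read was `None` instead of a dict with a `cls` field. A minimal sketch of the failure mode with a toy config (illustrative, not the actual sheeprl code):

```python
import hydra
from omegaconf import OmegaConf

# Toy encoder config using the *new* key name from the fix above.
encoder_cfg = OmegaConf.create(
    {"mlp_layer_norm": {"cls": "torch.nn.LayerNorm", "kw": {"eps": 1e-3}}}
)

# Resolving the dotted path works when the node exists.
layer_norm_cls = hydra.utils.get_class(encoder_cfg.mlp_layer_norm.cls)
print(layer_norm_cls)  # <class 'torch.nn.LayerNorm'>

# Reading a key that is absent (e.g. the old "_mlp_layer_norm" name) yields None,
# and None.cls is exactly the reported AttributeError.
missing = encoder_cfg.get("_mlp_layer_norm")
print(missing)  # None
```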

Thanks

michele-milesi commented 3 months ago

Hi @LucaVendruscolo, this branch should work.

Let me know, thanks

LucaVendruscolo commented 3 months ago

Hi @michele-milesi. I copied and pasted the dreamer_v3 agent code from the new branch into my project, but I still get the same error.

Logs ``` (RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl>python sheeprl.py exp=dreamer_v3 env=BallGame algo.mlp_keys.encoder=[position,QR_position] algo.mlp_keys.encoder=[position,QR_position] algo.cnn_keys.encoder=[] algo.cnn_keys.decoder=[] CONFIG ├── algo │ └── name: dreamer_v3 │ total_steps: 5000000 │ per_rank_batch_size: 16 │ run_test: true │ cnn_keys: │ encoder: [] │ decoder: [] │ mlp_keys: │ encoder: │ - position │ - QR_position │ decoder: │ - position │ - QR_position │ world_model: │ optimizer: │ _target_: torch.optim.Adam │ lr: 0.0001 │ eps: 1.0e-08 │ weight_decay: 0 │ betas: │ - 0.9 │ - 0.999 │ discrete_size: 32 │ stochastic_size: 32 │ kl_dynamic: 0.5 │ kl_representation: 0.1 │ kl_free_nats: 1.0 │ kl_regularizer: 1.0 │ continue_scale_factor: 1.0 │ clip_gradients: 1000.0 │ encoder: │ cnn_channels_multiplier: 32 │ cnn_act: torch.nn.SiLU │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ recurrent_model: │ recurrent_state_size: 512 │ layer_norm: true │ dense_units: 512 │ transition_model: │ hidden_size: 512 │ dense_act: torch.nn.SiLU │ layer_norm: true │ representation_model: │ hidden_size: 512 │ dense_act: torch.nn.SiLU │ layer_norm: true │ observation_model: │ cnn_channels_multiplier: 32 │ cnn_act: torch.nn.SiLU │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ reward_model: │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ bins: 255 │ discount_model: │ learnable: true │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ actor: │ optimizer: │ _target_: torch.optim.Adam │ lr: 8.0e-05 │ eps: 1.0e-05 │ weight_decay: 0 │ betas: │ - 0.9 │ - 0.999 │ cls: sheeprl.algos.dreamer_v3.agent.Actor │ ent_coef: 0.0003 │ min_std: 0.1 │ init_std: 0.0 │ objective_mix: 1.0 │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ clip_gradients: 100.0 │ expl_amount: 0.0 │ expl_min: 0.0 │ expl_decay: false │ max_step_expl_decay: 0 │ moments: │ decay: 0.99 │ max: 1.0 │ percentile: │ low: 0.05 │ high: 0.95 │ critic: │ optimizer: │ _target_: torch.optim.Adam │ lr: 8.0e-05 │ eps: 1.0e-05 │ weight_decay: 0 │ betas: │ - 0.9 │ - 0.999 │ dense_act: torch.nn.SiLU │ mlp_layers: 2 │ layer_norm: true │ dense_units: 512 │ target_network_update_freq: 1 │ tau: 0.02 │ bins: 255 │ clip_gradients: 100.0 │ gamma: 0.996996996996997 │ lmbda: 0.95 │ horizon: 15 │ train_every: 16 │ learning_starts: 20000 │ per_rank_pretrain_steps: 1 │ per_rank_gradient_steps: 1 │ per_rank_sequence_length: 64 │ layer_norm: true │ dense_units: 512 │ mlp_layers: 2 │ dense_act: torch.nn.SiLU │ cnn_act: torch.nn.SiLU │ unimix: 0.01 │ hafner_initialization: true │ decoupled_rssm: false │ player: │ discrete_size: 32 │ replay_ratio: 1 │ ├── buffer │ └── size: 1000000 │ memmap: true │ validate_args: false │ from_numpy: false │ checkpoint: false │ ├── checkpoint │ └── every: 100000 │ resume_from: null │ save_last: true │ keep_last: 5 │ ├── env │ └── id: BallGame │ num_envs: 1 │ frame_stack: 1 │ sync_env: false │ screen_size: 64 │ action_repeat: 1 │ grayscale: false │ clip_rewards: false │ capture_video: false │ frame_stack_dilation: 1 │ max_episode_steps: null │ reward_as_observation: false │ wrapper: │ _target_: sheeprl.envs.BallGame.BallEnv │ render_mode: rgb_array │ ├── fabric │ └── _target_: lightning.fabric.Fabric │ devices: 1 │ num_nodes: 1 │ strategy: auto │ accelerator: cpu │ precision: 32-true │ callbacks: │ - _target_: sheeprl.utils.callback.CheckpointCallback │ keep_last: 5 │ 
└── metric └── log_every: 5000 disable_timer: false log_level: 1 sync_on_compute: false aggregator: _target_: sheeprl.utils.metric.MetricAggregator raise_on_missing: false metrics: Rewards/rew_avg: _target_: torchmetrics.MeanMetric sync_on_compute: false Game/ep_len_avg: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/world_model_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/value_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/policy_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/observation_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/reward_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/state_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false Loss/continue_loss: _target_: torchmetrics.MeanMetric sync_on_compute: false State/kl: _target_: torchmetrics.MeanMetric sync_on_compute: false State/post_entropy: _target_: torchmetrics.MeanMetric sync_on_compute: false State/prior_entropy: _target_: torchmetrics.MeanMetric sync_on_compute: false Grads/world_model: _target_: torchmetrics.MeanMetric sync_on_compute: false Grads/actor: _target_: torchmetrics.MeanMetric sync_on_compute: false Grads/critic: _target_: torchmetrics.MeanMetric sync_on_compute: false logger: _target_: lightning.fabric.loggers.TensorBoardLogger name: 2024-04-04_09-12-43_dreamer_v3_BallGame_42 root_dir: logs/runs/dreamer_v3/BallGame version: null default_hp_metric: true prefix: '' sub_dir: null Seed set to 42 C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\utils\logger.py:22: UserWarning: The specified root directory for the TensorBoardLogger is different from the experiment one, so the logger one will be ignored and replaced with the experiment root directory warnings.warn( Log dir: logs\runs\dreamer_v3/BallGame\2024-04-04_09-12-43_dreamer_v3_BallGame_42\version_0 Connected to Arduino on COM3 [672.38367 407.76648 698.8736 435.38016] [545.87085 672.92706 574.0098 703.6562 ] OBSERVATION {'position': array([559.94030762, 688.29162598]), 'QR_position': array([685.62866211, 421.57330322])} Connected to Arduino on COM3 [646.8575 417.90344 685.3542 446.64667] [531.0206 667.7972 557.94086 696.3425 ] OBSERVATION {'position': array([544.48071289, 682.06982422]), 'QR_position': array([666.10583496, 432.27505493])} Encoder CNN keys: [] Encoder MLP keys: ['position', 'QR_position'] Decoder CNN keys: [] Decoder MLP keys: ['position', 'QR_position'] Error executing job with overrides: ['exp=dreamer_v3', 'env=BallGame', 'algo.mlp_keys.encoder=[position,QR_position]', 'algo.mlp_keys.encoder=[position,QR_position]', 'algo.cnn_keys.encoder=[]', 'algo.cnn_keys.decoder=[]'] Traceback (most recent call last): File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 352, in run run_algorithm(cfg) File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 190, in run_algorithm fabric.launch(reproducible(command), cfg, **kwargs) File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 839, in launch return self._wrap_and_launch(function, self, *args, **kwargs) File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 925, in _wrap_and_launch return to_run(*args, **kwargs) File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 930, in _wrap_with_setup return to_run(*args, **kwargs) File 
"C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 186, in wrapper return func(fabric, cfg, *args, **kwargs) File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\algos\dreamer_v3\dreamer_v3.py", line 432, in main world_model, actor, critic, target_critic = build_agent( File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\algos\dreamer_v3\agent.py", line 998, in build_agent layer_norm_cls=hydra.utils.get_class(world_model_cfg.encoder.mlp_layer_norm.cls), AttributeError: 'NoneType' object has no attribute 'cls' Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace. (RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl> ```
michele-milesi commented 3 months ago

Your configs are not updated to the new ones: DreamerV3 now requires you to specify which layer-norm class to use, and you can also specify its parameters.

An example is shown here: https://github.com/Eclectic-Sheep/sheeprl/blob/main/sheeprl/configs/algo/dreamer_v3.yaml

I suggest updating your code and configs from main. Once the configs are updated, it should work.
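
To make the change concrete: in the updated configs, each old `layer_norm: true` flag becomes a node carrying a `cls` (which layer-norm class to use) and `kw` (its keyword arguments), as also visible in the log posted further below. A hedged sketch of how such a node is turned into a module, using `torch.nn.LayerNorm` instead of sheeprl's own classes so the example is self-contained:

```python
import hydra
from omegaconf import OmegaConf

# New-style layer-norm node (field names as in the linked dreamer_v3.yaml).
node = OmegaConf.create({"cls": "torch.nn.LayerNorm", "kw": {"eps": 1.0e-3}})

# Resolve the class from its dotted path and build it with the configured kwargs.
norm_cls = hydra.utils.get_class(node.cls)
norm = norm_cls(normalized_shape=512, **node.kw)
print(norm)  # LayerNorm((512,), eps=0.001, elementwise_affine=True)
```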

LucaVendruscolo commented 3 months ago

Thanks for the help, progress has been made. It now more or less runs, but it crashes before it can take any steps.

```
Microsoft Windows [Version 10.0.22631.3296]
(c) Microsoft Corporation. All rights reserved.

(RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL>set HYDRA_FULL_ERROR=1

(RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL>cd sheeprl

(RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl>python sheeprl.py exp=dreamer_v3 env=BallGame algo.mlp_keys.encoder=[position,QR_position] algo.mlp_keys.encoder=[position,QR_position] algo.cnn_keys.encoder=[] algo.cnn_keys.decoder=[]
CONFIG
├── algo
│ └── name: dreamer_v3
│ total_steps: 5000000
│ per_rank_batch_size: 16
│ run_test: true
│ cnn_keys:
│ encoder: []
│ decoder: []
│ mlp_keys:
│ encoder:
│ - position
│ - QR_position
│ decoder:
│ - position
│ - QR_position
│ world_model:
│ optimizer:
│ _target_: torch.optim.Adam
│ lr: 0.0001
│ eps: 1.0e-08
│ weight_decay: 0
│ betas:
│ - 0.9
│ - 0.999
│ discrete_size: 32
│ stochastic_size: 32
│ kl_dynamic: 0.5
│ kl_representation: 0.1
│ kl_free_nats: 1.0
│ kl_regularizer: 1.0
│ continue_scale_factor: 1.0
│ clip_gradients: 1000.0
│ encoder:
│ cnn_channels_multiplier: 32
│ cnn_act: torch.nn.SiLU
│ dense_act: torch.nn.SiLU
│ mlp_layers: 2
│ cnn_layer_norm:
│ cls: sheeprl.utils.model.LayerNormChannelLastFP32
│ kw:
│ eps: 0.001
│ mlp_layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ recurrent_model:
│ recurrent_state_size: 512
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ transition_model:
│ hidden_size: 512
│ dense_act: torch.nn.SiLU
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ representation_model:
│ hidden_size: 512
│ dense_act: torch.nn.SiLU
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ observation_model:
│ cnn_channels_multiplier: 32
│ cnn_act: torch.nn.SiLU
│ dense_act: torch.nn.SiLU
│ mlp_layers: 2
│ cnn_layer_norm:
│ cls: sheeprl.utils.model.LayerNormChannelLastFP32
│ kw:
│ eps: 0.001
│ mlp_layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ reward_model:
│ dense_act: torch.nn.SiLU
│ mlp_layers: 2
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ bins: 255
│ discount_model:
│ learnable: true
│ dense_act: torch.nn.SiLU
│ mlp_layers: 2
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ actor:
│ optimizer:
│ _target_: torch.optim.Adam
│ lr: 8.0e-05
│ eps: 1.0e-05
│ weight_decay: 0
│ betas:
│ - 0.9
│ - 0.999
│ cls: sheeprl.algos.dreamer_v3.agent.Actor
│ ent_coef: 0.0003
│ min_std: 0.1
│ max_std: 1.0
│ init_std: 2.0
│ dense_act: torch.nn.SiLU
│ mlp_layers: 2
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ clip_gradients: 100.0
│ unimix: 0.01
│ action_clip: 1.0
│ moments:
│ decay: 0.99
│ max: 1.0
│ percentile:
│ low: 0.05
│ high: 0.95
│ critic:
│ optimizer:
│ _target_: torch.optim.Adam
│ lr: 8.0e-05
│ eps: 1.0e-05
│ weight_decay: 0
│ betas:
│ - 0.9
│ - 0.999
│ dense_act: torch.nn.SiLU
│ mlp_layers: 2
│ layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ per_rank_target_network_update_freq: 1
│ tau: 0.02
│ bins: 255
│ clip_gradients: 100.0
│ gamma: 0.996996996996997
│ lmbda: 0.95
│ horizon: 15
│ replay_ratio: 1
│ learning_starts: 1024
│ per_rank_pretrain_steps: 0
│ per_rank_sequence_length: 64
│ cnn_layer_norm:
│ cls: sheeprl.utils.model.LayerNormChannelLastFP32
│ kw:
│ eps: 0.001
│ mlp_layer_norm:
│ cls: sheeprl.utils.model.LayerNormFP32
│ kw:
│ eps: 0.001
│ dense_units: 512
│ mlp_layers: 2
│ dense_act: torch.nn.SiLU
│ cnn_act: torch.nn.SiLU
│ unimix: 0.01
│ hafner_initialization: true
│ decoupled_rssm: false
│ player:
│ discrete_size: 32
│
├── buffer
│ └── size: 1000000
│ memmap: true
│ validate_args: false
│ from_numpy: false
│ checkpoint: false
│
├── checkpoint
│ └── every: 100000
│ resume_from: null
│ save_last: true
│ keep_last: 5
│
├── env
│ └── id: BallGame
│ num_envs: 1
│ frame_stack: 1
│ sync_env: false
│ screen_size: 64
│ action_repeat: 1
│ grayscale: false
│ clip_rewards: false
│ capture_video: false
│ frame_stack_dilation: 1
│ max_episode_steps: null
│ reward_as_observation: false
│ wrapper:
│ _target_: sheeprl.envs.BallGame.BallEnv
│ render_mode: rgb_array
│
├── fabric
│ └── _target_: lightning.fabric.Fabric
│ devices: 1
│ num_nodes: 1
│ strategy: auto
│ accelerator: cpu
│ precision: 32-true
│ callbacks:
│ - _target_: sheeprl.utils.callback.CheckpointCallback
│ keep_last: 5
│
└── metric
└── log_every: 5000
disable_timer: false
log_level: 1
sync_on_compute: false
aggregator:
_target_: sheeprl.utils.metric.MetricAggregator
raise_on_missing: false
metrics:
Rewards/rew_avg:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Game/ep_len_avg:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/world_model_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/value_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/policy_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/observation_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/reward_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/state_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Loss/continue_loss:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
State/kl:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
State/post_entropy:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
State/prior_entropy:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Grads/world_model:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Grads/actor:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
Grads/critic:
_target_: torchmetrics.MeanMetric
sync_on_compute: false
logger:
_target_: lightning.fabric.loggers.TensorBoardLogger
name: 2024-04-04_12-40-33_dreamer_v3_BallGame_42
root_dir: logs/runs/dreamer_v3/BallGame
version: null
default_hp_metric: true
prefix: ''
sub_dir: null

Seed set to 42
C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\utils\logger.py:22: UserWarning: The specified root directory for the TensorBoardLogger is different from the experiment one, so the logger one will be ignored and replaced with the experiment root directory
warnings.warn(
Log dir: logs\runs\dreamer_v3/BallGame\2024-04-04_12-40-33_dreamer_v3_BallGame_42\version_0
Connected to Arduino on COM3
[692.8363 403.61932 720.07117 431.96716] [813.6614 676.17883 841.2084 707.32794]
OBSERVATION {'position': array([827.43487549, 691.75341797]), 'QR_position': array([706.45373535, 417.79324341])}
Connected to Arduino on COM3
[689.48425 404.75378 716.19574 433.07486] [812.3025 676.87726 840.56757 707.3656 ]
OBSERVATION {'position': array([826.43505859, 692.12145996]), 'QR_position': array([702.83996582, 418.91430664])}
Encoder CNN keys: []
Encoder MLP keys: ['position', 'QR_position']
Decoder CNN keys: []
Decoder MLP keys: ['position', 'QR_position']
[690.23425 404.1878 716.75494 432.76288] [812.7107 676.9479 840.5307 707.0191]
OBSERVATION {'position': array([826.62072754, 691.98352051]), 'QR_position': array([703.49462891, 418.4753418 ])}
C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\envs\wrappers.py:96: UserWarning: WARN: STEP - Restarting env after crash with TypeError: unsupported operand type(s) for -: 'list' and 'list'
gym.logger.warn(f"STEP - Restarting env after crash with {type(e).__name__}: {e}")
[ WARN:1@50.407] global cap_msmf.cpp:471 `anonymous-namespace'::SourceReaderCB::OnReadSample videoio(MSMF): OnReadSample() is called with error status: -1072873821
[ WARN:1@50.409] global cap_msmf.cpp:483 `anonymous-namespace'::SourceReaderCB::OnReadSample videoio(MSMF): async ReadSample() call is failed with error status: -1072873821
[690.88416 403.982 717.8092 432.92264] [812.48804 676.5464 840.8072 707.33075]
OBSERVATION {'position': array([826.64758301, 691.93859863]), 'QR_position': array([704.34667969, 418.45233154])}
[691.3308 404.61163 718.94745 432.84036] [812.4885 676.6912 840.40875 707.2217 ]
OBSERVATION {'position': array([826.4486084, 691.9564209]), 'QR_position': array([705.13916016, 418.72601318])}
[ WARN:1@86.568] global cap_msmf.cpp:471 `anonymous-namespace'::SourceReaderCB::OnReadSample videoio(MSMF): OnReadSample() is called with error status: -1072873821
[ WARN:1@86.570] global cap_msmf.cpp:483 `anonymous-namespace'::SourceReaderCB::OnReadSample videoio(MSMF): async ReadSample() call is failed with error status: -1072873821
[690.3189 403.8334 718.6088 433.08102] [813.73016 677.789 841.6107 707.40356]
OBSERVATION {'position': array([827.67041016, 692.59631348]), 'QR_position': array([704.46386719, 418.45721436])}
[690.5427 403.76544 718.5269 432.8585 ] [812.98663 676.79486 840.9965 706.84674]
OBSERVATION {'position': array([826.99157715, 691.82080078]), 'QR_position': array([704.53479004, 418.31195068])}
ERROR: Received the following error from Worker-0: RuntimeError: The env crashed too many times: 3
ERROR: Shutting down Worker-0.
ERROR: Raising the last exception back to the main process.
Error executing job with overrides: ['exp=dreamer_v3', 'env=BallGame', 'algo.mlp_keys.encoder=[position,QR_position]', 'algo.mlp_keys.encoder=[position,QR_position]', 'algo.cnn_keys.encoder=[]', 'algo.cnn_keys.decoder=[]']
Traceback (most recent call last):
File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl.py", line 4, in
run()
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra\main.py", line 90, in decorated_main
_run_hydra(
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra_internal\utils.py", line 394, in _run_hydra
_run_app(
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra_internal\utils.py", line 457, in _run_app
run_and_report(
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra_internal\utils.py", line 222, in run_and_report
raise ex
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra_internal\utils.py", line 219, in run_and_report
return func()
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra_internal\utils.py", line 458, in
lambda: hydra.run(
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra_internal\hydra.py", line 132, in run
_ = ret.return_value
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra\core\utils.py", line 260, in return_value
raise self._return_value
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\hydra\core\utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 352, in run
run_algorithm(cfg)
File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 190, in run_algorithm
fabric.launch(reproducible(command), cfg, **kwargs)
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 839, in launch
return self._wrap_and_launch(function, self, *args, **kwargs)
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 925, in _wrap_and_launch
return to_run(*args, **kwargs)
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 930, in _wrap_with_setup
return to_run(*args, **kwargs)
File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\cli.py", line 186, in wrapper
return func(fabric, cfg, *args, **kwargs)
File "C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl\sheeprl\algos\dreamer_v3\dreamer_v3.py", line 601, in main
next_obs, rewards, terminated, truncated, infos = envs.step(
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\gymnasium\vector\vector_env.py", line 204, in step
return self.step_wait()
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\gymnasium\vector\async_vector_env.py", line 333, in step_wait
self._raise_if_errors(successes)
File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\gymnasium\vector\async_vector_env.py", line 544, in _raise_if_errors
raise exctype(value)
RuntimeError: The env crashed too many times: 3
C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\gymnasium\vector\async_vector_env.py:460: UserWarning: WARN: Calling close while waiting for a pending call to step to complete.

(RL17) C:\Users\lucav\Downloads\MazeGameIRLSheepRL\sheeprl>
```

michele-milesi commented 3 months ago

Hi @LucaVendruscolo, thank you for your patience. This problem is not related to SheepRL but to your environment: it seems related to the VideoCapture() method of the OpenCV library. Unfortunately, my experience with that library is limited, but I'll leave you some references I found:

I do not know if they can be helpful.

Another thing you can try is running the experiment with env.sync_env=True: in some cases this helped us solve problems related to the DMC environments.
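
As a side note on the crash itself: the wrapper log shows the env restarting after `TypeError: unsupported operand type(s) for -: 'list' and 'list'`, which typically means two plain Python lists are being subtracted somewhere in the environment (for example in a distance or progress computation). A minimal sketch of that failure and the usual fix, with made-up variable names:

```python
import numpy as np

waypoint = [566, 650]        # plain Python lists
ball_position = [540, 380]

# waypoint - ball_position   # would raise:
# TypeError: unsupported operand type(s) for -: 'list' and 'list'

# Converting to numpy arrays makes the arithmetic element-wise, as intended.
delta = np.asarray(waypoint) - np.asarray(ball_position)
distance = float(np.linalg.norm(delta))
print(delta, distance)
```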

Let me know, thanks

cc: @belerico

LucaVendruscolo commented 3 months ago

Hello. Thanks so much for the help, I got everything working! It was just bad programming on my end; I couldn't make sense of the error messages, so I ended up tracking down the problem with print statements.