While training, I encountered the following error:
ValueError: Expected parameter loc (Tensor of shape (4096, 10)) of distribution Normal(loc: torch.Size([4096, 10]), scale: torch.Size([4096, 10])) to satisfy the constraint Real(), but found invalid values:
tensor([[ 4.0973, 0.7101, -3.0730, ..., -0.6485, -0.6761, 3.0643],
[ 1.2121, 0.2714, -1.4260, ..., -0.3167, 0.8034, 4.1596],
[ 3.2726, 1.1336, -2.3498, ..., -1.4837, -0.5930, 4.0263],
...,
[ 2.8068, 0.9866, -1.3783, ..., -1.7024, -0.9354, 4.2436],
[ 1.1521, 0.7460, -1.4709, ..., -0.8727, -0.5724, 5.1031],
[ 1.3514, 0.7393, -1.3382, ..., -0.9008, -0.3370, 4.2194]],
So I checked the observations and the output of the actor network. The result:
observation is NaN, Warning: mean contains NaN
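A minimal sketch of this kind of check, assuming the observations and the actor mean are 2-D tensors of shape (num_envs, dim). The function name `find_nonfinite_envs` and the sample data are hypothetical and only for illustration; this is not the exact code used in the training run:

```python
import torch


def find_nonfinite_envs(name: str, tensor: torch.Tensor) -> torch.Tensor:
    """Return indices of rows (environments) that contain NaN or Inf values."""
    # A row is bad if any of its entries is NaN or +/-Inf.
    bad_rows = ~torch.isfinite(tensor).all(dim=1)
    bad_ids = torch.nonzero(bad_rows, as_tuple=False).flatten()
    if bad_ids.numel() > 0:
        print(
            f"Warning: {name} contains non-finite values in "
            f"{bad_ids.numel()} env(s), first ids: {bad_ids[:10].tolist()}"
        )
    return bad_ids


# Hypothetical stand-in data: 4 environments with 3 observation values each.
obs = torch.zeros(4, 3)
obs[2, 1] = float("nan")
obs[3, 0] = float("inf")
bad_env_ids = find_nonfinite_envs("observation", obs)
```

Running the same check on the observations before they reach the policy, and again on the actor mean, shows whether the non-finite values already exist in the simulator state or first appear inside the network.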
And the reward log:
################################
Learning iteration 239/4001
Computation: 69062 steps/s (collection: 3.154s, learning 0.404s)
Value function loss: 0.0271
Surrogate loss: 0.0036
Mean action noise std: 0.64
Mean reward: 25.26
Mean episode length: 1013.32
Mean episode rew_action_smoothness: -0.0163
Mean episode rew_base_acc: 0.0445
Mean episode rew_base_height: 0.0000
Mean episode rew_collision: -0.0003
Mean episode rew_default_joint_pos: 0.0018
Mean episode rew_dof_acc: -9366562581883936988725248.0000
Mean episode rew_dof_vel: -2341654480529039529345024.0000
Mean episode rew_feet_air_time: 0.0004
Mean episode rew_feet_clearance: 0.0004
Mean episode rew_feet_contact_forces: -0.0018
Mean episode rew_feet_contact_number: 0.2137
Mean episode rew_feet_distance: 0.1234
Mean episode rew_foot_slip: -0.0189
Mean episode rew_joint_pos: -0.0779
Mean episode rew_knee_distance: 0.1132
Mean episode rew_low_speed: -0.0726
Mean episode rew_orientation: 0.0088
Mean episode rew_torques: -0.0330
Mean episode rew_track_vel_hard: -4861542400.0000
Mean episode rew_tracking_ang_vel: 0.1980
Mean episode rew_tracking_lin_vel: 0.4897
Mean episode rew_vel_mismatch_exp: 0.1498
Total timesteps: 58982400
Iteration time: 3.56s
Total time: 789.17s
ETA: 12370.3s
I noticed that some reward terms (rew_dof_acc, rew_dof_vel, and rew_track_vel_hard) have exploded to abnormally large magnitudes.
I initially thought this might be caused by collisions between links, but after I removed the collision elements from the robot URDF model, it still happens.
What might be causing this?