chengxuxin / extreme-parkour

Train your parkour robot in less than 20 hours.
https://extreme-parkour.github.io

CUDA error: device-side assert triggered #20

Closed ChengEeee closed 5 months ago

ChengEeee commented 6 months ago

    File "train.py", line 66, in train
      ppo_runner.learn(num_learning_iterations=train_cfg.runner.max_iterations, init_at_random_ep_len=True)
    File "/home/hello/Aaron_Project/extreme-parkour/rsl_rl/rsl_rl/runners/on_policy_runner.py", line 165, in learn_RL
      obs, privileged_obs, rewards, dones, infos = self.env.step(actions) # obs has changed to next_obs !! if done obs has been reset
    File "/home/hello/Aaron_Project/extreme-parkour/legged_gym/legged_gym/envs/base/legged_robot.py", line 145, in step
      self.post_physics_step()
    File "/home/hello/Aaron_Project/extreme-parkour/legged_gym/legged_gym/envs/base/legged_robot.py", line 259, in post_physics_step
      self.reset_idx(env_ids)
    File "/home/hello/Aaron_Project/extreme-parkour/legged_gym/legged_gym/envs/base/legged_robot.py", line 327, in reset_idx
      self._reset_dofs(env_ids)
    File "/home/hello/Aaron_Project/extreme-parkour/legged_gym/legged_gym/envs/base/legged_robot.py", line 636, in _reset_dofs
      self.dof_vel[env_ids] = 0.
    RuntimeError: CUDA error: device-side assert triggered
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

I am trying to train the base policy (base_policy). Training runs normally for the first few thousand iterations, but this error occurs on every run before it reaches 10,000 iterations. Looking forward to your reply, thanks.
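Because the error message says the stack trace may be inaccurate, the line it points at (`self.dof_vel[env_ids] = 0.`) is not necessarily where the assert fired. A minimal sketch of rerunning with `CUDA_LAUNCH_BLOCKING=1`, as the message suggests, might look like the following. The helper names `blocking_env` and `run_with_blocking` are hypothetical and not part of this repository; the only thing taken from the log is the environment variable itself.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    Hypothetical helper: it only sets CUDA_LAUNCH_BLOCKING=1, the variable
    named in the error message, and leaves everything else untouched.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(argv):
    """Run a command in a child process with CUDA_LAUNCH_BLOCKING=1.

    The variable has to be set before CUDA is initialized, so launching a
    fresh process is the simplest way to be sure it takes effect.
    Returns the child's exit code.
    """
    return subprocess.run(argv, env=blocking_env()).returncode
```

For example, `run_with_blocking([sys.executable, "train.py"])` plus your usual training arguments would rerun the job with synchronous kernel launches, so the traceback should point at the operation that actually failed. Expect training to run noticeably slower in this mode.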

geyang commented 2 months ago

@ChengEeee how did you fix this?