PKU-MARL / DexterousHands

This is a library that provides dual dexterous hand manipulation tasks through Isaac Gym
https://pku-marl.github.io/DexterousHands/
Apache License 2.0

Segmentation fault (core dumped) in Docker #7

Open JensenLZX opened 2 years ago

JensenLZX commented 2 years ago

Segmentation fault (core dumped) in Docker

Device: NVIDIA A100 40GB PCIe GPU Accelerator

Method: Docker

Details:

I run

python train.py --task=ShadowHandOver --algo=ppo

and

python train.py --task=ShadowHandOver --algo=happo

in ~\bi-dexhands

In both runs, the model weights (xxx.pt) were saved correctly in ~\bi-dexhands\logs.

However, at the end of each run, the console shows the following error.

Output:

some episodes done, average rewards:  tensor(16.7454, device='cuda:0')
some episodes done, average rewards:  tensor(14.1145, device='cuda:0')
some episodes done, average rewards:  tensor(15.4696, device='cuda:0')
some episodes done, average rewards:  tensor(15.4252, device='cuda:0')
some episodes done, average rewards:  tensor(14.8325, device='cuda:0')
some episodes done, average rewards:  tensor(19.7192, device='cuda:0')
some episodes done, average rewards:  tensor(15.9727, device='cuda:0')

Algo happo Exp check updates 48825/48828 episodes, total num timesteps 49997824/50000000, FPS 1922.

some episodes done, average rewards:  tensor(14.0804, device='cuda:0')
some episodes done, average rewards:  tensor(17.5084, device='cuda:0')
some episodes done, average rewards:  tensor(18.6891, device='cuda:0')
Segmentation fault (core dumped)

Are there any suggestions for dealing with this error?

Thx in advance!

cypypccpy commented 2 years ago

Dear @RogerLZX ,

I'm sorry, but we rarely use Docker to run Isaac Gym, so I don't know the cause of this bug. It seems to appear only at the end of the task, so you may be able to work around it by increasing the number of episodes.
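Since the crash shows up only at shutdown and the weights are already on disk by then, one way to keep scripts from treating such a run as a failure is to launch training from a small wrapper. The following is only a sketch, not part of this repository: the helper names are made up for illustration, it assumes checkpoints are `.pt` files somewhere under the log directory, and it relies on Python's `subprocess` reporting death by signal as a negative return code on POSIX. It does not fix the segfault itself.

```python
import signal
import subprocess
import sys
from pathlib import Path


def find_checkpoints(log_dir):
    """Return all .pt files found anywhere under log_dir (assumed layout)."""
    return sorted(Path(log_dir).expanduser().rglob("*.pt"))


def classify_exit(returncode, num_checkpoints):
    """Label a finished run from its exit code and the checkpoints on disk."""
    if returncode == 0:
        return "clean exit"
    # On POSIX, subprocess reports "killed by signal N" as return code -N.
    if returncode == -signal.SIGSEGV and num_checkpoints > 0:
        return "segfault, but checkpoints exist"
    return "failed"


def run_and_classify(cmd, log_dir):
    """Run cmd to completion, then classify the outcome (hypothetical helper)."""
    returncode = subprocess.run(cmd).returncode
    return classify_exit(returncode, len(find_checkpoints(log_dir)))
```

For example, `run_and_classify([sys.executable, "train.py", "--task=ShadowHandOver", "--algo=happo"], "logs")`, called from the directory that contains `train.py`, would return `"segfault, but checkpoints exist"` for a run like the one reported above.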

Isaac Gym is still in development, so bugs like this are inevitable. I recommend searching for this bug or asking about it on the DevTalk Forum; NVIDIA developers usually answer questions there if they know the answer.
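When asking there, a Python-level traceback of the crash may help. Python's standard `faulthandler` module can dump one when the process receives SIGSEGV. Below is a minimal sketch, assuming it is called near the top of the training script; the wrapper function name is only illustrative, and the dump may well show nothing useful if the crash happens inside native code during interpreter teardown.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask Python to dump tracebacks of all threads on fatal signals.

    stream must be a real file with a file descriptor; defaults to stderr.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

The same effect is available without editing any code by running `python -X faulthandler train.py --task=ShadowHandOver --algo=happo` or by setting the `PYTHONFAULTHANDLER` environment variable.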

Hope this can help you.

JensenLZX commented 1 year ago

@cypypccpy Sorry, this issue is an accidental duplicate of issue #8. Please delete it.