des-zhong closed 4 months ago
I think we are using clamping by default. If you look inside the yaml configs you will find `mu_activation: None`. Feel free to change it to tanh. But during training we still add noise with std and clamp afterwards.
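To make the difference concrete, here is a rough sketch of the two options in plain Python. It is not the library's code: the function names, the `[-1, 1]` bounds, and the fixed `sigma` are assumptions for illustration only.

```python
import math
import random


def squash_mean(mu_raw):
    """Option with a tanh activation on mu: bounds the mean to (-1, 1)."""
    return [math.tanh(m) for m in mu_raw]


def sample_action(mu, sigma, rng, low=-1.0, high=1.0):
    """Add Gaussian noise with std `sigma` to the mean, then clamp.

    The bounds low/high are assumed here; use whatever your env expects.
    """
    noisy = [m + sigma * rng.gauss(0.0, 1.0) for m in mu]
    return [min(max(a, low), high) for a in noisy]


rng = random.Random(0)
mu_raw = [0.3, -2.5, 4.0]  # raw network output, no activation on mu

# No activation on mu: the raw mean goes straight into noise + clamp.
action_clamped = sample_action(mu_raw, 0.5, rng)

# tanh on mu: the mean is bounded first, noise and clamp still follow.
action_tanh = sample_action(squash_mean(mu_raw), 0.5, rng)
```

Either way the final action stays inside the bounds because of the clamp; tanh only changes how the mean itself is bounded before the noise is added.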
Thank you for your answer! It helps me a lot. I still have a question, though. I'm using IsaacGym to train a robot, and after training the best checkpoint is saved as a .pth file. How can I load the .pth file and deploy it to a real robot? I've searched for a while but I'm still clueless. Thank you.
You can export it to ONNX. I have some examples, but without IsaacGym; probably the best way is to try to do the same with IG.
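A minimal sketch of what an ONNX export could look like with plain PyTorch. The `Actor` class, its layer sizes, the `OBS_DIM`/`ACT_DIM` values, and the `export_actor` helper are all hypothetical stand-ins, not the repo's actual network or export code. How the weights are stored inside your .pth file depends on how the checkpoint was saved, so the loading step is left as a comment. Returning the clamped mean at deployment (no noise) is also an assumption.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Hypothetical stand-in for the trained policy network."""

    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64),
            nn.ELU(),
            nn.Linear(64, act_dim),
        )

    def forward(self, obs):
        # Deterministic action for deployment: the mean, clamped to [-1, 1]
        # (assumed bounds), with no exploration noise added.
        return torch.clamp(self.net(obs), -1.0, 1.0)


def export_actor(actor, obs_dim, path):
    """Trace the actor with a dummy observation and write an ONNX file."""
    actor.eval()
    dummy_obs = torch.zeros(1, obs_dim)
    torch.onnx.export(
        actor,
        (dummy_obs,),
        path,
        input_names=["obs"],
        output_names=["action"],
    )


OBS_DIM, ACT_DIM = 8, 2  # placeholder sizes; use your robot's dimensions
actor = Actor(OBS_DIM, ACT_DIM)
# Load your trained weights here; the key layout depends on your checkpoint.
# actor.load_state_dict(...)
```

Calling `export_actor(actor, OBS_DIM, "policy.onnx")` would then write the file, which can be run on the robot with an ONNX runtime instead of the training stack. Depending on your PyTorch version, the exporter may need extra packages installed.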
I've searched the repo but I'm still clueless. Is there a tanh function at the end of the actor network? Thank you!