isaac-sim / IsaacGymEnvs

Isaac Gym Reinforcement Learning Environments

Exporting trained policy for sim2real #112

Open carebare47 opened 1 year ago

carebare47 commented 1 year ago

Hi,

I'm having difficulty taking the trained policy out of Isaac/rl_games to use on a real robot. We've seen a project where someone builds ROS functionality into an Isaac environment, but due to performance and deployment complexities that approach doesn't really work for us.

We've tried saving the model by accessing the runner here with torch.save(runner.model, '/path') (after making a slight modification to torch_runner.py so that it returns the player from here instead of calling .run() on it).
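Roughly, the attempt looked like this (a sketch; the config filename and the patched return value are our local assumptions):

```python
import torch
import yaml

from rl_games.torch_runner import Runner

# Same rl_games config that was used for training (filename is illustrative).
with open('train_cfg.yaml') as f:
    cfg = yaml.safe_load(f)

runner = Runner()
runner.load(cfg)

# Relies on our local patch to torch_runner.py: the play path returns the
# player object instead of calling .run() on it.
player = runner.run({'train': False, 'play': True})

# This is the call that fails: pickling the full module trips over the
# lambdas captured by rl_games' model/network factories.
torch.save(player.model, '/path/to/full_model.pt')
# -> a pickling error on the factory lambdas
```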

This didn't work, as far as I can tell because of anonymous references created by lambda functions in the model/network factories. We could start replacing these, but when you get down to the actual network itself, it is constructed from lambdas (e.g., here), and rebuilding all of that in a serializable way seems excessive.

Another approach could be to modify rl_games to collect example input tensor(s) (e.g. here), perform a TorchScript trace with that example input data, and save the full model (weights and architecture) as a traced program. We've not been able to get this working yet, and even if we could, this solution isn't ideal: we'd either have to maintain a fork of rl_games or patch the rl_games code just to save/export models. It would be great to be able to save the full model via the higher-level interfaces.
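For reference, a minimal sketch of what such a trace could look like. The input-dict keys and the 'mus' output key follow our reading of rl_games' continuous A2C models and should be treated as assumptions, not a documented export path:

```python
import torch

# `actor` is assumed to be the rl_games network module (e.g. player.model
# from the sketch above); the wrapper hides the dict-shaped input the
# rl_games models expect so the traced graph takes a plain tensor.
class ActorWrapper(torch.nn.Module):
    def __init__(self, actor):
        super().__init__()
        self.actor = actor

    def forward(self, obs):
        input_dict = {
            'is_train': False,
            'prev_actions': None,
            'obs': obs,
            'rnn_states': None,
        }
        out = self.actor(input_dict)
        return out['mus']  # deterministic action: the distribution mean

num_obs = 48  # observation size of your env (illustrative)
example_obs = torch.randn(1, num_obs)

traced = torch.jit.trace(ActorWrapper(player.model), example_obs)
traced.save('policy_traced.pt')

# On the robot, inference then needs only torch, not rl_games:
policy = torch.jit.load('policy_traced.pt')
action = policy(example_obs)
```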

A third solution could be to build rl_games into our real robot controllers, pass it the same config used to train the policy (to regenerate the same network architecture), and then load the checkpoint file saved by Isaac/rl_games during training. Maybe this would be the most appropriate approach, although for some of our projects, having to include this overhead isn't great.
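Sketched out, that would be something like the following (create_player()/restore() reflect our reading of rl_games' runner API; file names and sizes are illustrative):

```python
import torch
import yaml

from rl_games.torch_runner import Runner

# Rebuild the network from the training config, then restore the checkpoint.
with open('train_cfg.yaml') as f:
    cfg = yaml.safe_load(f)

runner = Runner()
runner.load(cfg)

player = runner.create_player()              # same architecture as training
player.restore('runs/MyTask/nn/MyTask.pth')  # checkpoint saved during training

obs = torch.randn(1, 48)               # stand-in for a real observation
action = player.get_action(obs, True)  # second arg: deterministic actions
```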

Is there a nicer approach than these that I have missed? Is there a way to save the full model (architecture + state) from the higher-level interface? Or, more specifically: if you wanted to take this trained policy and deploy it outside of an Isaac Gym environment, how would you go about it?

Thanks, Tom

eferreirafilho commented 5 months ago

Have you managed to find a good way to achieve this?

Many thanks!

ronaldluc commented 5 months ago

We've successfully deployed models on real robots by embedding rl_games in the ROS node that runs policy inference.

We also had to modify rl_games directly. It runs well on an Orange Pi 5B CPU.
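For those asking for details, a hypothetical minimal sketch of such a node (topic names, message types, and config/checkpoint paths are illustrative, not our exact setup):

```python
#!/usr/bin/env python3
import numpy as np
import rospy
import torch
import yaml
from std_msgs.msg import Float32MultiArray

from rl_games.torch_runner import Runner


class PolicyNode:
    """Subscribes to observations, runs the rl_games player, publishes actions."""

    def __init__(self, player):
        self.player = player
        self.pub = rospy.Publisher('/robot/actions', Float32MultiArray, queue_size=1)
        rospy.Subscriber('/robot/observations', Float32MultiArray, self.on_obs)

    def on_obs(self, msg):
        obs = torch.as_tensor(np.asarray(msg.data, dtype=np.float32)).unsqueeze(0)
        with torch.no_grad():
            action = self.player.get_action(obs, True)  # deterministic
        self.pub.publish(Float32MultiArray(data=action.cpu().numpy().ravel().tolist()))


if __name__ == '__main__':
    rospy.init_node('policy_inference')
    # Rebuild the network from the training config and restore the checkpoint,
    # as in the earlier sketch in this thread.
    with open('train_cfg.yaml') as f:
        cfg = yaml.safe_load(f)
    runner = Runner()
    runner.load(cfg)
    player = runner.create_player()
    player.restore('checkpoint.pth')
    PolicyNode(player)
    rospy.spin()
```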


YuuChennABALONE commented 3 months ago

How did you do it? Could you give me some advice?