alex-petrenko / sample-factory

High throughput synchronous and asynchronous reinforcement learning
https://samplefactory.dev
MIT License

Exporting policy for on-robot deployment #284

Closed mihirk284 closed 5 months ago

mihirk284 commented 10 months ago

First of all, thanks @alex-petrenko for this amazing library.

I am working on interfacing this library with https://github.com/ntnu-arl/aerial_gym_simulator. I have managed to interface the two to train an aerial robot, and now I want to deploy the trained policy on an onboard computer for real-world testing.

I find it a bit challenging to obtain a standalone trained model that can be wrapped in a ROS node, since most of the provided examples assume a simulator or a simulated environment around the robot.

What approach would you suggest for wrapping a trained model in a standalone ROS node that feeds in observations and gets back actions, while also being able to reset the RNN states at the beginning of a robot mission?

TIA.

alex-petrenko commented 10 months ago

There's nothing in the SF2 ecosystem that would help you directly; you'd need to write your own ROS wrapper. The enjoy script (https://github.com/alex-petrenko/sample-factory/blob/master/sample_factory/enjoy.py) is a good starting point: it essentially executes the loop you need in your robot deployment, except that instead of consuming observations from the simulator, it would need to get them from a real sensory system and then send the actions to the robot.

If you can implement the node in Python, I imagine this should be quite easy.
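For illustration, here is a minimal sketch of such a node, assuming ROS 1 / rospy, a TorchScript export of the trained policy, and flat float vectors for observations and actions. The topic names, `policy.pt`, the hidden-state size, and the `(obs, rnn_state) -> (action, rnn_state)` call signature are all illustrative placeholders, not sample-factory APIs:

```python
# Hypothetical ROS 1 node; topic names, message layout, and "policy.pt"
# are illustrative assumptions, not part of sample-factory.
import torch
import rospy
from std_msgs.msg import Float32MultiArray, Empty

class PolicyNode:
    def __init__(self, model_path: str):
        # A TorchScript export of the trained policy (see torch.jit.trace/script).
        self.policy = torch.jit.load(model_path).eval()
        self.rnn_state = torch.zeros(1, 256)  # hidden size is an assumption
        self.action_pub = rospy.Publisher("/robot/actions", Float32MultiArray, queue_size=1)
        rospy.Subscriber("/robot/observations", Float32MultiArray, self.on_obs, queue_size=1)
        rospy.Subscriber("/robot/mission_start", Empty, self.on_reset, queue_size=1)

    def on_reset(self, _msg):
        # Zero the recurrent state at the start of each mission.
        self.rnn_state = torch.zeros_like(self.rnn_state)

    def on_obs(self, msg):
        obs = torch.tensor(msg.data, dtype=torch.float32).unsqueeze(0)
        with torch.no_grad():
            # Assumed policy signature: (obs, rnn_state) -> (action, new_rnn_state).
            action, self.rnn_state = self.policy(obs, self.rnn_state)
        self.action_pub.publish(Float32MultiArray(data=action.squeeze(0).tolist()))

if __name__ == "__main__":
    rospy.init_node("policy_node")
    PolicyNode("policy.pt")
    rospy.spin()
```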

In our experiments with quadrotor swarms (https://github.com/Zhehui-Huang/quad-swarm-rl/) we had to implement the on-board controller in C because the on-board hardware only supports simple C programs.

We wrote an exporter that converts simple MLP policies to a C file with the weights and a simple loop-based implementation of the forward pass. It is actually not as hard as it seems. My co-authors Zhehui (owner of the swarm-rl repo) and Sumeet are better contacts for these issues! (tagging @Zhehui-Huang here :) )

I believe our C code should be available if you're interested in it!
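To give a rough idea of what such an exporter boils down to (this is a toy sketch, not our actual code): walk the MLP, dump each linear layer's weights and biases into C arrays, and then a hand-written C loop computes the matrix-vector products of the forward pass against those arrays.

```python
# A toy sketch of the idea behind such an exporter (not the actual
# sim2real.py code): dump each Linear layer's weights into a C header
# so a hand-written loop can run the forward pass on a microcontroller.
import torch.nn as nn

def export_mlp_to_c(mlp: nn.Sequential, path: str = "policy_weights.h"):
    lines = []
    layer_idx = 0
    for module in mlp:
        if isinstance(module, nn.Linear):
            w = module.weight.detach().numpy()  # shape: (out, in)
            b = module.bias.detach().numpy()
            lines.append(f"static const int L{layer_idx}_IN = {w.shape[1]}, L{layer_idx}_OUT = {w.shape[0]};")
            flat_w = ", ".join(f"{v:.8f}f" for v in w.flatten())
            lines.append(f"static const float L{layer_idx}_W[] = {{{flat_w}}};")
            flat_b = ", ".join(f"{v:.8f}f" for v in b)
            lines.append(f"static const float L{layer_idx}_B[] = {{{flat_b}}};")
            layer_idx += 1
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# Example: a small policy MLP; the layer sizes here are made up.
export_mlp_to_c(nn.Sequential(nn.Linear(18, 64), nn.Tanh(), nn.Linear(64, 4)))
```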

mihirk284 commented 10 months ago

Thanks for the reply @alex-petrenko! enjoy.py was my first thought for editing. I would love to take a look at the C exporter as well if it is available @Zhehui-Huang. By any chance, can the code that uses the exported policy make use of a GPU onboard the robot? Thanks

Zhehui-Huang commented 10 months ago

Here is the script that translates a PyTorch model to C code. Currently, it only supports MLPs. https://github.com/Zhehui-Huang/quad-swarm-rl/blob/master/swarm_rl/sim2real/sim2real.py

mihirk284 commented 10 months ago

Thank you for the reference @Zhehui-Huang! I'll check it out :)

alex-petrenko commented 10 months ago

@mihirk284 regarding the on-board GPU: there shouldn't be anything Sample Factory-specific about how to run the model on the drone. The policy is just a regular PyTorch policy, so anything that works for running any other PyTorch model on your robot should work here too!
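For example, the generic PyTorch deployment pattern below would apply unchanged; `policy.pt`, the observation size, and the policy signature are the same placeholders as in the ROS sketch above:

```python
# Generic PyTorch deployment pattern (nothing sample-factory-specific):
# move the model and inputs to the on-board GPU if one is available,
# e.g. on a Jetson with a CUDA-enabled PyTorch build.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
policy = torch.jit.load("policy.pt").eval().to(device)  # "policy.pt" is a placeholder

obs = torch.randn(1, 18, device=device)        # observation size is an assumption
rnn_state = torch.zeros(1, 256, device=device)  # hidden size is an assumption
with torch.no_grad():
    action, rnn_state = policy(obs, rnn_state)
action = action.cpu().numpy()  # bring the action back to the CPU for the robot interface
```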