isaac-sim / IsaacLab

Unified framework for robot learning built on NVIDIA Isaac Sim
https://isaac-sim.github.io/IsaacLab

[Question] Load an rl_games trained model for real robot deployment and control frequency setup #935

Closed dhruvkm2402 closed 2 months ago

dhruvkm2402 commented 2 months ago

Hello, I have a trained rl_games model and I want to load it for use with ROS. The model does not function correctly at all, which I understand could be due to several reasons. However, I wanted to confirm whether the process of loading an rl_games model trained in Isaac Lab should be any different. Here is the code:

import torch
import torch.nn as nn


class LoadModel(nn.Module):
    """Re-implementation of the trained actor: 22-dim observation -> 7-dim action, ELU activations."""

    def __init__(self):
        super(LoadModel, self).__init__()

        self.fc1 = nn.Linear(22, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 7)

    def forward(self, x):
        x = torch.nn.ELU()(self.fc1(x))
        x = torch.nn.ELU()(self.fc2(x))
        x = torch.nn.ELU()(self.fc3(x))
        x = self.fc4(x)
        return x

And then we load it as:

        self.model_path = args.model_path
        #torch.save(self.model.state_dict(), self.model_path)
        # print(self.model_path)
        self.model.load_state_dict(torch.load(self.model_path), strict=False)

We had to use strict=False because of a missing-keys error.

2) I have to send the control commands at 10 Hz and receive observations accordingly. I have kept dt = 1/120 and decimation = 12; is that setup correct? I'd appreciate quick help on this if possible, since I'm aiming to make this part of my research. Thank you.
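For reference, my reasoning on the timing (just arithmetic, not tied to any Isaac Lab API):

sim_dt = 1.0 / 120.0                  # physics time step used in training
decimation = 12                       # physics steps per policy/control step
control_dt = sim_dt * decimation
print(control_dt, 1.0 / control_dt)   # 0.1 s per control step -> 10 Hz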

Mayankm96 commented 2 months ago

I don't think this is an Isaac Lab-specific issue.

RL-Games saves a checkpoint of the model, and you need to load this checkpoint into your deployment code, ideally while considering the real-time control interfaces of your robot. The way you're doing it above seems okay, but it is better to check with the RL-Games developers whether it is correct.
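One quick sanity check you can do yourself: strict=False silently drops every parameter whose name does not match your LoadModel fields, so it is worth inspecting what the checkpoint actually contains before loading it. The sketch below assumes the usual rl_games checkpoint layout (a dict with a 'model' key and 'a2c_network.*' parameter names); print the keys on your own file to confirm.

import torch

# Inspect the rl_games checkpoint before mapping it onto LoadModel.
# "path/to/checkpoint.pth" is a placeholder for your args.model_path.
ckpt = torch.load("path/to/checkpoint.pth", map_location="cpu")
print(ckpt.keys())                      # often: 'model', 'epoch', 'optimizer', ...

state_dict = ckpt.get("model", ckpt)    # assumption: weights live under the 'model' key
for name, value in state_dict.items():
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value)
    print(name, shape)                  # compare against fc1..fc4 of LoadModel

# Once you know the actual names (e.g. 'a2c_network.actor_mlp.0.weight'),
# rename them to match LoadModel ('fc1.weight', ...) and call
# load_state_dict(..., strict=True) so any remaining mismatch raises an error
# instead of being ignored.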

Usually, in my experience, most robots have a C++ control layer that is updated at the highest possible actuator frequency (for instance, 400 Hz on ANYmal). We add the network inference to this control loop to make sure the policy is called at the exact frequency; otherwise, real-time control can become difficult. Most groups export an ONNX model and load that policy for inference.
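As a rough sketch of that route (assuming the LoadModel network above with a 22-dimensional observation; the file name and the weight remapping are placeholders, not part of your code):

import torch

# Export the deployment policy to ONNX so it can run in a C++/onnxruntime control loop.
model = LoadModel()
# model.load_state_dict(remapped_state_dict, strict=True)  # weights remapped from the rl_games checkpoint
model.eval()

dummy_obs = torch.zeros(1, 22)          # batch of one 22-dim observation
torch.onnx.export(
    model,
    dummy_obs,
    "policy.onnx",
    input_names=["obs"],
    output_names=["actions"],
)

On the robot side you can then run it with onnxruntime, e.g. ort.InferenceSession("policy.onnx").run(None, {"obs": obs}), which keeps the deployed policy independent of the training stack.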

I haven't personally done real-time control through Python, so I can't comment on whether the "decimation" loop that you're doing will work effectively.
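If you do try the Python route anyway, the usual pattern is a fixed-rate ROS loop along these lines (a sketch assuming ROS 1 / rospy; get_latest_observation, run_policy, and publish_command are hypothetical helpers you would implement, and Python timing here is best-effort rather than hard real-time):

import rospy

rospy.init_node("policy_runner")
rate = rospy.Rate(10)                    # 10 Hz, matching dt * decimation from training

while not rospy.is_shutdown():
    obs = get_latest_observation()       # hypothetical: assemble the 22-dim obs from your subscribers
    action = run_policy(obs)             # hypothetical: ONNX or torch inference on the policy
    publish_command(action)              # hypothetical: publish joint commands to the robot
    rate.sleep()                         # sleeps the remainder of the 0.1 s period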