robfiras / loco-mujoco

Imitation learning benchmark focusing on complex locomotion tasks using MuJoCo.
MIT License

Unable to use CUDA for imitation learning examples #5

Closed: TairanHe closed 4 months ago

TairanHe commented 7 months ago

Hi,

I am trying to run imitation learning demos in loco-mujoco/examples/imitation_learning/01_Gail/launcher.py with USE_CUDA=True, but I get the following error:

  File "/home/tairanhe/workspace/humanIL/ls-iq/imitation_lib/utils/networks.py", line 66, in forward
    return (inputs - mean) / std
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

robfiras commented 7 months ago

Hi! Thanks for pointing this out. I will fix this issue in the next few days. However, note that using CUDA brings only a very minor benefit, as it is running TRPO without parallel environments under the hood. So I don't recommend using it with CUDA.

In any case, I will come back to you once this is fixed.

TairanHe commented 7 months ago

Thanks! By the way, are there options to save checkpoints of trained imitation policies and replay the policies after training?

robfiras commented 7 months ago

Yes, there are. In the launcher file of the experiment, you have to specify the parameter "n_epochs_save" (we set it to 25). The agents will then be saved as .msh files, and it is very easy to replay them afterwards. Here is an example:

from mushroom_rl.core import Core, Agent
from loco_mujoco import LocoEnv

# Create the environment.
env = LocoEnv.make("HumanoidMuscle.walk")

# Load a saved agent checkpoint.
agent = Agent.load("./agents/agent_epoch_414_J_638.793385.msh")

# Roll out the agent for 10 rendered episodes.
core = Core(agent, env)
core.evaluate(n_episodes=10, render=True)

Thanks for pointing this out. I will include this example in the repo as well.
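
If several checkpoints have accumulated, a small helper can pick one to replay. The following is only a sketch: it assumes every file follows the naming pattern of the single file shown above (`agent_epoch_<epoch>_J_<value>.msh`) and that a higher J value is better. The helper names `parse_checkpoint`, `best_checkpoint` and `list_checkpoints` are hypothetical and not part of loco-mujoco or mushroom_rl, and the first file name in the example list is made up for illustration.

```python
import re
from pathlib import Path

# Pattern inferred from the one file name in this thread; this is an assumption.
CHECKPOINT_RE = re.compile(r"agent_epoch_(\d+)_J_(-?\d+(?:\.\d+)?)\.msh$")


def parse_checkpoint(name):
    """Return (epoch, J) parsed from a checkpoint file name, or None."""
    match = CHECKPOINT_RE.search(name)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def best_checkpoint(names):
    """Return the file name with the highest J, ignoring unparseable names."""
    scored = []
    for name in names:
        parsed = parse_checkpoint(name)
        if parsed is not None:
            scored.append((parsed[1], name))
    return max(scored)[1] if scored else None


def list_checkpoints(directory):
    """List the .msh file names found in a directory (hypothetical helper)."""
    return sorted(p.name for p in Path(directory).glob("*.msh"))


# Illustrative file names; the first one is invented for this example.
names = ["agent_epoch_25_J_120.5.msh", "agent_epoch_414_J_638.793385.msh"]
best = best_checkpoint(names)
```

The selected name could then be joined with the checkpoint directory and passed to `Agent.load` as in the example above, for instance with `best_checkpoint(list_checkpoints("./agents"))`.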

LifelongYuan commented 6 months ago

Thank you for the awesome work! The tensor type mixing bug can be fixed by changing the following two lines:

https://github.com/robfiras/loco-mujoco/blob/67b88da3a464abd09b8ba8cbde8941b5c7b0cf61/examples/imitation_learning/01_Gail/experiment.py#L66
https://github.com/robfiras/loco-mujoco/blob/67b88da3a464abd09b8ba8cbde8941b5c7b0cf61/examples/imitation_learning/02_Vail/experiment.py#L64

to Standardizer(use_cuda=use_cuda)
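
For anyone wondering why passing the flag helps: the error in the traceback appears when the inputs are on cuda:0 while the mean and std tensors are still on the CPU. Below is a minimal sketch of the idea, not the actual `Standardizer` from imitation_lib. The class name `SimpleStandardizer`, its fields and its CPU fallback are assumptions made for illustration; only the `use_cuda` flag and the `(inputs - mean) / std` expression come from this thread.

```python
import torch


class SimpleStandardizer:
    """Hypothetical stand-in that keeps its statistics on a single device."""

    def __init__(self, dim, use_cuda=False):
        # Fall back to the CPU if CUDA is requested but not available.
        wants_cuda = use_cuda and torch.cuda.is_available()
        self.device = torch.device("cuda:0" if wants_cuda else "cpu")
        # Create the statistics directly on the chosen device.
        self.mean = torch.zeros(dim, device=self.device)
        self.std = torch.ones(dim, device=self.device)

    def __call__(self, inputs):
        # Move the inputs so that both operands share one device.
        inputs = inputs.to(self.device)
        return (inputs - self.mean) / self.std


standardizer = SimpleStandardizer(dim=3, use_cuda=False)
out = standardizer(torch.tensor([1.0, 2.0, 3.0]))
```

With `use_cuda=True` on a machine with a GPU, the statistics would be created on cuda:0, so the subtraction and division no longer mix devices.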