Toni-SM / skrl

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab
https://skrl.readthedocs.io/
MIT License

How to implement the curriculum learning using the existing data #43

Closed famora2 closed 1 year ago

famora2 commented 1 year ago

Hi,

I would like to implement so-called curriculum learning using skrl, where I initialize the training with pre-recorded data and gradually decrease the usage of this pre-recorded data. What I do not understand is how the code is structured. Taking "FrankaCabinet" as an example:


# instantiate the PPO agent
agent = PPO(models=models_ppo,
            memory=memory,
            cfg=cfg_ppo,
            observation_space=env.observation_space,
            action_space=env.action_space,
            device=device)

# Configure and instantiate the RL trainer
cfg_trainer = {"timesteps": 24000, "headless": True}
trainer = SequentialTrainer(cfg=cfg_trainer, env=env, agents=agent)

# start training
trainer.train()

The above code is used to initialize the agent and start the training. Assuming I have a pre-recorded joint trajectory of the Franka arm as a NumPy array, I would like to overwrite the action (which is the output of the agent) with this NumPy array to guide the robot arm towards the desired behavior. However, simply overwriting the action values would mess up the whole training, since the executed actions would no longer come from the policy itself. So the pre-recorded NumPy array cannot be used appropriately this way.
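To make the "gradually decrease the usage" part concrete, here is a minimal sketch (plain NumPy, not an skrl API) of one common way to schedule such a curriculum: execute the pre-recorded action with a probability that decays linearly over training, and the policy's own action otherwise. `demo_mix_prob` and `select_action` are hypothetical helper names, and the demonstration trajectory below is just a placeholder:

```python
import numpy as np

def demo_mix_prob(timestep, timesteps, start=1.0, end=0.0):
    # linearly decaying probability of executing the demonstration action
    # (hypothetical helper, not part of skrl)
    frac = min(timestep / max(timesteps, 1), 1.0)
    return start + (end - start) * frac

def select_action(policy_action, demo_action, timestep, timesteps, rng):
    # with decaying probability, use the pre-recorded action;
    # otherwise fall back to the policy's action
    if rng.random() < demo_mix_prob(timestep, timesteps):
        return demo_action
    return policy_action

rng = np.random.default_rng(0)
demo_traj = np.zeros((24000, 9))   # placeholder pre-recorded Franka joint targets
policy_action = np.ones(9)         # placeholder policy output

# early in training the demonstration action is almost always chosen,
# late in training it is almost never chosen
early = select_action(policy_action, demo_traj[10], 10, 24000, rng)
late = select_action(policy_action, demo_traj[23990], 23990, 24000, rng)
```

Note that for an on-policy algorithm like PPO this scheduling alone is not enough: the action stored in memory must be the one that was actually executed, together with the policy's log-probability for it, so the override would have to happen inside the agent's action-selection step rather than after it.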

Do you have advice/tips for this case?