Is there a way to automatically save the best result along with the checkpoints?

Toni-SM / skrl

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab

https://skrl.readthedocs.io/

MIT License

Is there a way to automatically save the best result along with the checkpoints? #14

Closed: famora2 closed this issue 2 years ago

famora2 commented 2 years ago

Hi @Toni-SM,

[Image: Screenshot from 2022-05-24 10-36-00, training result of the FrankaReach task]

Above you can see the training result of the FrankaReach task. The reward drops after a certain step. Because the best result from earlier in training is not saved automatically, one has to check TensorBoard and select the appropriate checkpoint from the logs. Is there a way to automatically save the best result along with the checkpoints, as in Isaac Gym?

Toni-SM commented 2 years ago

Hi @Joo236

Yeah, this is a good idea... I will try to implement it for the next patch

Toni-SM commented 2 years ago

Hi @Joo236

The best result (according to the mean total reward) is now saved during the execution of the experiment. For the moment this is only available in the develop branch.

https://github.com/Toni-SM/skrl/blob/cd45cf9d651c39ae4816145ff02f974825131410/skrl/agents/torch/base.py#L159-L164
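
The idea of keeping the best result according to the mean total reward can be illustrated with a minimal sketch. This is not the code behind the link above: `BestCheckpointTracker`, its attributes, and the dictionary used as a model state are hypothetical names chosen for illustration, and the reward values are made up.

```python
import copy


class BestCheckpointTracker:
    """Hypothetical helper: remembers the state with the highest mean reward."""

    def __init__(self):
        self.best_reward = float("-inf")
        self.best_state = None

    def update(self, mean_reward, state):
        """Store a copy of `state` if `mean_reward` beats the best seen so far."""
        if mean_reward > self.best_reward:
            self.best_reward = mean_reward
            # Deep copy so later training updates do not overwrite the snapshot.
            self.best_state = copy.deepcopy(state)
            return True
        return False


# Made-up reward curve that peaks early and then drops.
tracker = BestCheckpointTracker()
rewards = [1.0, 3.5, 2.0, 0.5]
for step, reward in enumerate(rewards):
    tracker.update(reward, {"step": step, "weights": [reward]})

print(tracker.best_reward, tracker.best_state["step"])
```

A periodic checkpoint routine could then write `best_state` to disk next to the regular checkpoints, so the peak of the curve survives even if the reward drops later.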

famora2 commented 2 years ago

The code gives the following error:

File "/home/user/skrl/skrl/agents/torch/base.py", line 163, in write_checkpoint
    state_dict=self.checkpoint_best_models["models"][k])
TypeError: save() got an unexpected keyword argument 'state_dict'
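
Python raises this kind of TypeError when a function is called with a keyword argument that its definition does not declare. The sketch below reproduces the pattern with two hypothetical stand-in functions, `old_save` and `new_save`; they are not skrl's code and only illustrate how a caller that passes `state_dict` fails against a definition that lacks the parameter.

```python
# Hypothetical stand-ins for an older and a newer version of a save function.
def old_save(path):
    return f"saved current weights to {path}"


def new_save(path, state_dict=None):
    source = "given state_dict" if state_dict is not None else "current weights"
    return f"saved {source} to {path}"


best_weights = {"layer.weight": [0.1, 0.2]}

# A caller written for the newer signature fails against the older definition.
try:
    old_save("best.pt", state_dict=best_weights)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

result = new_save("best.pt", state_dict=best_weights)
print(error_message)
print(result)
```

If the calling code and the function definition come from different revisions of a repository, a mismatch of this kind is one possible cause.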

Toni-SM commented 2 years ago

Hi @famora2

Are you using the develop branch? Is it updated?

You can update it by running the following commands in the root directory of the skrl repository:

git checkout develop
git pull --all

You can also check whether the save method is up to date (in the skrl/models/torch/base.py file):

https://github.com/Toni-SM/skrl/blob/cd45cf9d651c39ae4816145ff02f974825131410/skrl/models/torch/base.py#L285-L294
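
As an alternative to reading the file, the check can be done programmatically with the standard-library `inspect` module. The following is a minimal sketch: `accepts_keyword`, `outdated_save`, and `updated_save` are hypothetical names used for illustration, and the two stand-in functions only mimic a method with and without a `state_dict` parameter.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an outdated and an updated save function.
def outdated_save(path):
    pass


def updated_save(path, state_dict=None):
    pass


print(accepts_keyword(outdated_save, "state_dict"))
print(accepts_keyword(updated_save, "state_dict"))
```

With skrl installed, the same helper could be pointed at the `save` method of the model class defined in skrl/models/torch/base.py to see whether the installed copy accepts `state_dict`.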