DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

[Question] Accessing gradients during training #1044

Closed: nairnan13 closed this issue 2 years ago

nairnan13 commented 2 years ago

Question

How can gradients be accessed during training?

Additional context

I currently use Stable Baselines 3 to train a SAC agent on a custom environment. I would like to plot the mean of all computed gradients of the model while it is training. There are some unexpected drops in the agent's performance, and I want to rule out the possibility of exploding gradients.

To gain some insight into the training of the model, I know you can create custom callbacks. Unfortunately, when I retrieve the parameters with get_parameters() (inside a custom callback, in the _on_rollout_end() hook, which should be called right before the update), there are no gradients attached to the parameters.

Is there another way to access the gradients? Or is there some other way to rule out the possibility of exploding gradients?
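For context, here is a rough sketch of the kind of callback I have in mind (not a working solution; the environment, callback name, and logged key are placeholders). Since get_parameters() only returns a copy of the state dict, this reads the live torch parameters of the policy instead, whose .grad attributes hold whatever was left over from the most recent gradient update:

```python
from stable_baselines3 import SAC
from stable_baselines3.common.callbacks import BaseCallback


class GradientNormCallback(BaseCallback):
    """Log the mean gradient norm of the policy parameters.

    Reads the .grad attributes left over from the most recent gradient
    update; they are None before the first update has happened.
    """

    def _on_rollout_end(self) -> None:
        grad_norms = [
            p.grad.norm().item()
            for p in self.model.policy.parameters()
            if p.grad is not None
        ]
        if grad_norms:
            # Shows up in the SB3 logger output (and TensorBoard, if enabled)
            self.logger.record("train/mean_grad_norm", sum(grad_norms) / len(grad_norms))

    def _on_step(self) -> bool:
        # Required by BaseCallback; returning True keeps training running
        return True


model = SAC("MlpPolicy", "Pendulum-v1", verbose=1)
model.learn(total_timesteps=10_000, callback=GradientNormCallback())
```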

Thanks for the help!


araffin commented 2 years ago

Hello, you should look into the W&B callback (it can log gradients, cf. the doc); the relevant code is here: https://github.com/wandb/wandb/blob/9c777265f8cea1eaeb0407dd37ab889aeea81114/wandb/sdk/wandb_watch.py#L20

Otherwise, you can simply fork SB3 and add additional debug info directly in the code.
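For example, roughly like this (a minimal sketch based on the documented SB3 + W&B integration; the project name, environment, and frequencies are placeholders, see the integration docs for the full example):

```python
import wandb
from wandb.integration.sb3 import WandbCallback

from stable_baselines3 import SAC

# sync_tensorboard forwards the SB3 logger output (written via tensorboard_log) to W&B
run = wandb.init(project="sb3-sac-gradients", sync_tensorboard=True)

model = SAC("MlpPolicy", "Pendulum-v1", verbose=1, tensorboard_log=f"runs/{run.id}")
model.learn(
    total_timesteps=10_000,
    # With gradient_save_freq > 0 the callback calls wandb.watch() on the
    # policy, which registers hooks and logs gradient histograms to W&B
    callback=WandbCallback(gradient_save_freq=100, verbose=2),
)
run.finish()
```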

nairnan13 commented 2 years ago

Thanks for the reply! I completely forgot about W&B, this should solve it.