DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

[Question] Accessing gradients during training #1044

Closed: nairnan13 closed this issue 2 years ago

nairnan13 commented 2 years ago

Question

How can gradients be accessed during training?

Additional context

I currently use Stable Baselines 3 to train a SAC agent on a custom environment. I would like to plot the mean of all computed gradients of the model while it is training. There are some unexpected drops in the agent's performance, and I want to rule out the possibility of exploding gradients.

To gain some insight into the training of the model, I know you can create custom callbacks. Unfortunately, when I retrieve the parameters with get_parameters() (inside a custom callback, in the _on_rollout_end() hook, which should be called right before the update), there are no gradients attached to the parameters.

Is there another way to access the gradients? Or is there some other way to rule out the possibility of exploding gradients?
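For context, here is a rough sketch of the kind of callback I have in mind (not a working solution; the environment, callback name, and logged key are placeholders). Since get_parameters() only returns a copy of the state dict, this reads the live torch parameters of the policy instead, whose .grad attributes hold whatever was left over from the most recent gradient update:

```python
from stable_baselines3 import SAC
from stable_baselines3.common.callbacks import BaseCallback


class GradientNormCallback(BaseCallback):
    """Log the mean gradient norm of the policy parameters.

    Reads the .grad attributes left over from the most recent gradient
    update; they are None before the first update has happened.
    """

    def _on_rollout_end(self) -> None:
        grad_norms = [
            p.grad.norm().item()
            for p in self.model.policy.parameters()
            if p.grad is not None
        ]
        if grad_norms:
            # Shows up in the SB3 logger output (and TensorBoard, if enabled)
            self.logger.record("train/mean_grad_norm", sum(grad_norms) / len(grad_norms))

    def _on_step(self) -> bool:
        # Required by BaseCallback; returning True keeps training running
        return True


model = SAC("MlpPolicy", "Pendulum-v1", verbose=1)
model.learn(total_timesteps=10_000, callback=GradientNormCallback())
```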

Thanks for the help!


araffin commented 2 years ago

Hello, you should look into the W&B callback (it can log gradients, cf. the doc); the relevant code is here: https://github.com/wandb/wandb/blob/9c777265f8cea1eaeb0407dd37ab889aeea81114/wandb/sdk/wandb_watch.py#L20

Otherwise, you can simply fork SB3 and add additional debug info directly in the code.
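For example, roughly like this (a minimal sketch based on the documented SB3 + W&B integration; the project name, environment, and frequencies are placeholders, see the integration docs for the full example):

```python
import wandb
from wandb.integration.sb3 import WandbCallback

from stable_baselines3 import SAC

# sync_tensorboard forwards the SB3 logger output (written via tensorboard_log) to W&B
run = wandb.init(project="sb3-sac-gradients", sync_tensorboard=True)

model = SAC("MlpPolicy", "Pendulum-v1", verbose=1, tensorboard_log=f"runs/{run.id}")
model.learn(
    total_timesteps=10_000,
    # With gradient_save_freq > 0 the callback calls wandb.watch() on the
    # policy, which registers hooks and logs gradient histograms to W&B
    callback=WandbCallback(gradient_save_freq=100, verbose=2),
)
run.finish()
```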

nairnan13 commented 2 years ago

Thanks for the reply! I completely forgot about W&B, this should solve it.