[Feature Request] Reset parameter noise at environment reset.

Motivation

Unlike gSDE structured exploration, layers with parametric noise (NoisyLayer) have no way to resample their noise at environment reset. Resampling can only happen after each optimization step, which does not follow what is described in the article and introduces bias in training batches.

Solution

For gSDE, there is a dedicated transform for this purpose, gSDENoise. It works because gSDEModel adds its parameters to the rollout TensorDict directly, which is not the case for NoisyLayer. Keeping the noise inside the layer causes issues when running multiple workers in parallel, since they all rely on the same policy. So I guess the only generic way to handle this situation is to add the action noise to the rollout TensorDict, much like what is done for gSDE.
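To make the idea concrete, here is a minimal pure-Python sketch, not torchrl code. A plain dict stands in for the rollout TensorDict, and the names `MU`, `SIGMA`, `reset_noise`, and `policy` are hypothetical. The policy holds no noise of its own: each worker draws its noise at reset, stores it in its rollout data, and the shared policy reads it from there.

```python
import random

# Shared, stateless policy parameters (hypothetical values, one per feature).
MU = [0.5, -0.2]
SIGMA = [0.1, 0.1]


def reset_noise(rollout: dict, rng: random.Random) -> dict:
    """Draw fresh noise at environment reset and store it in the rollout data."""
    rollout["noise"] = [rng.gauss(0.0, 1.0) for _ in MU]
    return rollout


def policy(rollout: dict) -> float:
    """Noisy linear policy that reads its noise from the rollout data."""
    weights = [m + s * e for m, s, e in zip(MU, SIGMA, rollout["noise"])]
    return sum(w * o for w, o in zip(weights, rollout["observation"]))


# Two workers share the same policy but carry independent noise.
worker_a = reset_noise({"observation": [1.0, 2.0]}, random.Random(0))
worker_b = reset_noise({"observation": [1.0, 2.0]}, random.Random(1))
action_a = policy(worker_a)
action_b = policy(worker_b)
```

Because the noise lives in the per-worker data, it stays fixed for the whole episode and only changes when `reset_noise` is called again, without the workers interfering with each other.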

Alternative

Add a hook mechanism to run generic callbacks when a BaseEnv is reset.
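A rough sketch of what such a hook mechanism could look like, in plain Python. `HookedEnv`, `register_reset_hook`, and `NoisyWeight` are hypothetical names used for illustration only; they are not part of the existing BaseEnv or NoisyLayer API.

```python
import random
from typing import Callable, List


class HookedEnv:
    """Toy environment with reset hooks (hypothetical, not torchrl's BaseEnv)."""

    def __init__(self) -> None:
        self._reset_hooks: List[Callable[[], None]] = []

    def register_reset_hook(self, hook: Callable[[], None]) -> None:
        """Register a callback to run on every reset."""
        self._reset_hooks.append(hook)

    def reset(self) -> float:
        # Run every registered callback before returning the first observation.
        for hook in self._reset_hooks:
            hook()
        return 0.0  # dummy observation


class NoisyWeight:
    """Scalar stand-in for a noisy layer: weight = mu + sigma * eps."""

    def __init__(self, mu: float, sigma: float, seed: int = 0) -> None:
        self.mu, self.sigma = mu, sigma
        self._rng = random.Random(seed)
        self.num_resamples = 0
        self.eps = 0.0
        self.resample()  # initial draw

    def resample(self) -> None:
        """Draw a new noise sample and count how often this happens."""
        self.eps = self._rng.gauss(0.0, 1.0)
        self.num_resamples += 1

    @property
    def value(self) -> float:
        return self.mu + self.sigma * self.eps


env = HookedEnv()
weight = NoisyWeight(mu=0.5, sigma=0.1)
env.register_reset_hook(weight.resample)
env.reset()  # the noise is redrawn here
```

With this design the environment does not need to know anything about the noise itself; any callable can be registered, so the same mechanism would cover other per-episode bookkeeping.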

Checklist

[x] I have checked that there is no similar issue in the repo (required)

pytorch / rl