[Bug]: Missing default value for noise_type (for ddpg/td3) leads to unexpected behaviours

🐛 Bug

Using TD3 as an example: if noise_type is not specified for a custom environment in td3.yml, the following unexpected behavior occurs.

The logic that determines n_actions is skipped, so n_actions remains None (in exp_manager.py). That None is then passed down to the noise constructor, e.g. NormalActionNoise(mean=np.zeros(trial.n_actions), sigma=noise_std * np.ones(trial.n_actions))

Depending on the value of n_envs, the program either raises an error or silently produces an unintended result:

When n_envs > 1, an error is raised for a matrix shape mismatch.
When n_envs = 1, n_actions remains None and no runtime error is raised. Instead, the action noise is one-dimensional and gets broadcast to the actual action dimension of the environment, which will likely degrade model performance silently (one of the most frustrating kinds of issue in ML).

The exact unintended behaviour also depends on the environment's action space, but you get the idea.
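To illustrate the silent n_envs = 1 case, here is a minimal NumPy-only sketch of the broadcasting effect. It does not call SB3 or the zoo. The empty shape `()` is an assumed stand-in for what the noise parameters look like when n_actions is None, and the 3-dimensional action is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 3      # hypothetical true action dimension
noise_std = 0.1

# Intended case: one independent noise sample per action dimension.
mean = np.zeros(n_actions)
sigma = noise_std * np.ones(n_actions)
per_dim_noise = rng.normal(mean, sigma)            # shape (3,)

# Degenerate case (assumption): scalar-shaped parameters stand in for n_actions=None.
scalar_mean = np.zeros(())
scalar_sigma = noise_std * np.ones(())
shared_noise = rng.normal(scalar_mean, scalar_sigma)  # a single scalar sample

action = np.zeros((1, n_actions))                  # one env, three action dims
noisy_ok = action + per_dim_noise                  # different noise per dimension
noisy_bad = action + shared_noise                  # same noise broadcast to every dimension

print(noisy_ok)
print(noisy_bad)
```

In the second result every action dimension receives the identical perturbation, so exploration noise is perfectly correlated across dimensions and nothing fails loudly.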

==========================

I think people expect that when a parameter such as noise_type is not specified in td3.yml but is covered by the params sampler (e.g. sample_td3_params() in hyperparams_opt.py), the program will simply use the sampled value and work as intended.
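One possible way to meet that expectation would be to derive n_actions from the action space regardless of whether noise_type appears in the YAML file. The sketch below is only an illustration: `resolve_n_actions` is a hypothetical helper, not a function from the zoo, and it assumes a flat continuous action space whose shape is a tuple such as `(3,)`.

```python
from typing import Optional, Sequence


def resolve_n_actions(action_shape: Sequence[int], n_actions: Optional[int] = None) -> int:
    """Hypothetical helper: return a usable action count.

    Falls back to the first entry of the action-space shape when no
    value was set earlier, and fails loudly instead of passing None on.
    """
    if n_actions is not None:
        return n_actions
    if len(action_shape) == 0:
        raise ValueError("Cannot infer n_actions from an empty action shape")
    return int(action_shape[0])


# Usage: the shape would come from something like env.action_space.shape.
inferred = resolve_n_actions((3,))
explicit = resolve_n_actions((3,), n_actions=5)
print(inferred, explicit)
```

Raising on an empty shape turns the silent broadcasting case into an immediate, visible error.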

To Reproduce

python train.py --algo td3 --env "CustomEnv-v0" -optimize --n-trials 100 --sampler tpe --pruner median

Relevant log output / Error message

No response

System Info

No response

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the SB3 documentation
[X] I have read the RL Zoo README
[X] I have provided a minimal working example to reproduce the bug
[X] I've used the markdown code blocks for both code and stack traces.

DLR-RM / rl-baselines3-zoo