Toni-SM / skrl

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab
https://skrl.readthedocs.io/
MIT License

Switch to normal distribution for infinite action spaces #23

Closed juhannc closed 11 months ago

juhannc commented 2 years ago

Previously, sampling failed if the action space was infinite. While an infinite action space might be bad practice, in some cases it is not easy to get rid of one.

This commit checks whether any limit of the action space is infinite. If so, it uses a normal distribution for sampling. Currently, the mean/mu/loc is set to all zeros and the scale/var to all ones. In the future, a more flexible definition of those two parameters might be useful.

Finally, when the bounds are finite, using the lows and highs of the action space creates a uniform distribution with the correct limits per dimension/action (see the sketch below).
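
For illustration, a minimal sketch of the bound check described above, assuming PyTorch and a gym `Box` space. The function name `sample_random_action` is made up for this example and is not skrl's actual API:

```python
import torch
import gym


def sample_random_action(action_space: gym.spaces.Box, device: str = "cpu") -> torch.Tensor:
    """Sample a random action, falling back to N(0, 1) if any bound is infinite."""
    low = torch.as_tensor(action_space.low, dtype=torch.float32, device=device)
    high = torch.as_tensor(action_space.high, dtype=torch.float32, device=device)

    if torch.isinf(low).any() or torch.isinf(high).any():
        # any infinite bound: fall back to a standard normal (loc = 0, scale = 1)
        return torch.distributions.Normal(
            loc=torch.zeros_like(low), scale=torch.ones_like(high)
        ).sample()

    # all bounds finite: uniform sampling with the correct limits per dimension
    return torch.distributions.Uniform(low=low, high=high).sample()
```

With Pendulum-v1, whose bounds are finite ([-2, 2]), this takes the uniform branch; a space with an infinite bound (e.g. the default `Box(-inf, inf, ...)`) would take the normal branch instead.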

This commit has been manually tested with Pendulum-v1 and LunarLanderContinuous-v2 using the Pendulum (DDPG) example.

juhannc commented 2 years ago

I just realized it would be better to check, for every dimension, whether it needs uniform sampling or a normal distribution.

Better yet would be to make the sampling user-definable.
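
As a rough sketch of that per-dimension idea (the helper name here is hypothetical, not part of this PR), uniform and normal samples could be mixed per dimension using a finiteness mask:

```python
import torch


def sample_per_dimension(low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
    """Sample uniformly where both bounds are finite and from N(0, 1) elsewhere."""
    finite = torch.isfinite(low) & torch.isfinite(high)
    # the uniform distribution needs finite limits, so substitute dummy bounds
    # for the infinite dimensions (their samples are discarded below)
    safe_low = torch.where(finite, low, torch.zeros_like(low))
    safe_high = torch.where(finite, high, torch.ones_like(high))
    uniform = torch.distributions.Uniform(safe_low, safe_high).sample()
    normal = torch.distributions.Normal(
        torch.zeros_like(low), torch.ones_like(high)
    ).sample()
    # pick the uniform sample for finite dimensions, the normal one otherwise
    return torch.where(finite, uniform, normal)
```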