Hyperparameter tuning for RecurrentPPO was non-existent, as `hyperparams_opt.py` did not accept `ppo_lstm` as a valid argument
Description
Extended `sample_ppo_params()` so that it can be called by `sample_ppo_lstm_params()`; the trial is then updated with some LSTM-specific hyperparameters, and `"policy_kwargs"` is updated accordingly
Added "tiny" to sample_ppo_params() to support smaller neural nets for the LSTM (solution 2 in issue #409)
Motivation and Context
closes #409
RecurrentPPO's hyperparameters cannot be tuned by passing `ppo_lstm` to `-optimize`
- [x] I have raised an issue to propose this change (required for new features and bug fixes)
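For context, tuning can then be launched the same way as for the other algorithms, e.g. something like `python train.py --algo ppo_lstm --env CartPole-v1 -optimize --n-trials 100` (the environment and trial count here are placeholder values, not part of this PR).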
Types of changes
- [x] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Updated the changelog; let me know if anything else is required. I can add you as a contributor; it seems a little late to create a branch (I will keep this in mind next time).
Checklist:
- `make format` (required)
- `make check-codestyle` and `make lint` (required)
- `make pytest` and `make type` both pass. (required)

Note: we are using a maximum length of 127 characters per line