Hyperparameter tuning for RecurrentPPO was non-existent, as `hyperparams_opt.py` did not accept `ppo_lstm` as a valid argument
Description
Extended `sample_ppo_params()` so that it can be called by `sample_ppo_lstm_params()`; the trial is then updated with some LSTM-specific hyperparameters, and `"policy_kwargs"` is updated accordingly
Added "tiny" to sample_ppo_params() to support smaller neural nets for the LSTM (solution 2 in issue #409)
Motivation and Context
closes #409
RecurrentPPO's hyperparameters cannot be tuned by passing `ppo_lstm` to `-optimize`
- [x] I have raised an issue to propose this change (required for new features and bug fixes)
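For context, tuning can then be launched the same way as for the other algorithms, e.g. something like `python train.py --algo ppo_lstm --env CartPole-v1 -optimize --n-trials 100` (the environment and trial count here are placeholder values, not part of this PR).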
Types of changes
- [x] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Updated the changelog; let me know if anything else is required. I can add you as a contributor; it seems a little late to create a branch (I will keep this in mind next time).
Checklist:
- `make format` (required)
- `make check-codestyle` and `make lint` (required)
- `make pytest` and `make type` both pass. (required)

Note: we are using a maximum length of 127 characters per line