Open methenol opened 6 years ago
Testing this modification to hypersearch.py. I had to clear the runs database, so it will be a while before I can tell whether it affected anything.
```python
hypers['agent'] = {
    # 'states_preprocessing': None,
    # 'actions_exploration': None,
    'actions_exploration.type': 'ornstein_uhlenbeck',
    'actions_exploration.sigma': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    'actions_exploration.mu': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    'actions_exploration.theta': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    # 'reward_preprocessing': None,
    # I'm pretty sure we don't want to experiment with anything less than .99 for non-terminal
    # reward-types (which are 1.0). .99^500 ~= .6%, so value is lost sooner than makes sense for
    # our trading horizon. A trade now could affect something 2-5k steps later, so .999 is more
    # like it (5k steps ~= .6%).
    'discount': 1.,  # {
    #     'type': 'bounded',
    #     'vals': [.9, .99],
    #     'guess': .97
    # },
}
```
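The discount arithmetic in that comment can be checked directly. This quick sanity check (not from the repo) computes how much of a future reward's value survives a given horizon at a given discount factor:

```python
# A reward N steps in the future is weighted by discount**N.

def surviving_value(discount: float, steps: int) -> float:
    """Fraction of a future reward's value that survives discounting."""
    return discount ** steps

# ~0.6% of value survives 500 steps at discount 0.99 ...
print(f"{surviving_value(0.99, 500):.4f}")
# ... and roughly the same fraction survives 5000 steps at 0.999,
# which is the comment's case for preferring .999 on a 2-5k-step trading horizon.
print(f"{surviving_value(0.999, 5000):.4f}")
```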
First time tweaking the hypers, so if there's a better way, let me know.
UPDATE 08/14/18: The above code is not compatible with v0.2 as-is. The ranges to be searched are valid, but the syntax doesn't match the hyperopt implementation in v0.2.
This version runs on v0.2. I'd like it to toggle on/off like the baseline section; working on that.
```python
'actions_exploration': {
    'type': 'ornstein_uhlenbeck',
    'sigma': hp.quniform('exploration.sigma', 0, 1, 0.05),
    'mu': hp.quniform('exploration.mu', 0, 1, 0.05),
    'theta': hp.quniform('exploration.theta', 0, 1, 0.05)
},
```
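For anyone unfamiliar with hyperopt's search-space primitives: as I understand the hyperopt docs, `hp.quniform(label, low, high, q)` draws uniformly on [low, high] and rounds to the nearest multiple of q, so sigma/mu/theta above are searched on a 0.05 grid rather than continuously. A stdlib-only sketch of that sampling behaviour (not the actual hyperopt internals):

```python
import random

def quniform(low: float, high: float, q: float) -> float:
    """Mimic hp.quniform: a uniform draw snapped to a grid with step q."""
    return round(random.uniform(low, high) / q) * q

random.seed(0)
# Each sample lands on the 0.05 grid inside [0, 1], like the search space above.
samples = [quniform(0.0, 1.0, 0.05) for _ in range(5)]
print(samples)
```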
Updated 08/19/18 to use hp.quniform.
A brief explanation of the parameters, from https://www.maplesoft.com/support/help/maple/view.aspx?path=Finance%2FOrnsteinUhlenbeckProcess: theta is the speed of mean reversion, mu is the long-running mean, and sigma is the volatility.
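To make those roles concrete, here's a minimal Euler-Maruyama simulation of an Ornstein-Uhlenbeck process (stdlib only, not from this repo or TensorForce): theta pulls the state back toward mu, and sigma scales the injected noise.

```python
import math
import random

def simulate_ou(theta: float, mu: float, sigma: float,
                x0: float = 0.0, dt: float = 1.0, steps: int = 5000) -> list:
    """Euler-Maruyama simulation of dx = theta*(mu - x)*dt + sigma*sqrt(dt)*dW."""
    x, path = x0, []
    for _ in range(steps):
        x += theta * (mu - x) * dt + sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)
        path.append(x)
    return path

random.seed(42)
# Mean-reverting noise: higher theta snaps back to mu faster; sigma sets the spread.
path = simulate_ou(theta=0.15, mu=0.5, sigma=0.1)
# After a burn-in, the process hovers around its long-running mean mu.
long_run_mean = sum(path[1000:]) / len(path[1000:])
print(round(long_run_mean, 2))
```

This temporally correlated noise is what makes OU exploration popular for continuous action spaces: consecutive actions get perturbed in a consistent direction instead of with independent jitter.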
Feel free to add in a pull request, or even just commit to master if you feel confident about it
Going to try to get the values a little more realistic before submitting a PR for it. Letting the hypersearch run for a bit so it does its thing.
I'm working outside of hypersearch right now, so these are probably not ideal parameters. With actions exploration defined, the model seems to become a little more tolerant of less-than-perfect parameters (and of the randomness in its initial state).
These parameters are from the example at https://reinforce.io/blog/introduction-to-tensorforce/ and are not optimized:

```python
actions_exploration=dict(
    type='ornstein_uhlenbeck',
    sigma=0.1,
    mu=0.0,
    theta=0.1
),
```
Any benefit to adding parameters for actions exploration to hypersearch?