Closed: JaCoderX closed this issue 5 years ago
@JacobHanouna, refer to #49, #46, #40, #23
@Kismuz I have read the issues you referred to and tried to modify the 'unreal_stacked_lstm' example to see how the trained model behaves (on the same example, of course), but I'm struggling to make it work the way I want.
My end goal here is to use Backtrader's plotting ability (Cerebro.plot()) after the launcher finishes an epoch over the test data.
Based on your 'unreal_stacked_lstm' example, this is what I've tried changing:
MyDataset, following the 'data_domain_api_intro' example, so that I would have access to the target_period param: setting this param to non-zero duration forces separation to source/target domains (which can be thought of as creating top-level train/test subsets) with target data duration equal to target_period. Source data always precedes target one.
I also changed the data domain to BTgymCasualDataDomain, because it sounds like it would make more sense to have the data selected as a time sequence. In the end it looks like this:
MyDataset = BTgymCasualDataDomain(
    filename='./data/DAT_ASCII_EURUSD_M1_2017.csv',
    # use last 50 days of one year data as 'target domain',
    # so we get [360 - holidays gaps - 50] days of train data (exclude holidays)
    target_period={'days': 50, 'hours': 0, 'minutes': 0},
    trial_params=dict(
        start_weekdays={0, 1, 2, 3, 4, 5, 6},
        sample_duration={'days': 30, 'hours': 0, 'minutes': 0},  # let each trial be 30 days long
        start_00=True,  # adjust trial beginning to the beginning of the day
        time_gap={'days': 15, 'hours': 0},  # tolerance param
        test_period={'days': 6, 'hours': 0, 'minutes': 0},  # from those 30 reserve last 6 days for trial test data
    ),
    episode_params=dict(
        start_weekdays={0, 1, 2, 3, 4, 5, 6},
        sample_duration={'days': 0, 'hours': 23, 'minutes': 55},  # make every episode duration be 23:55
        start_00=False,  # do not adjust beginning time
        time_gap={'days': 0, 'hours': 10},
    )
)
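To make the source/target separation concrete, here is a minimal sketch in plain Python (not BTGym code) of the rule quoted above: the last target_period of the data becomes the target domain, and everything before it is the source. The helper name split_source_target and the example start/end dates are hypothetical, and the sketch ignores weekend and holiday gaps in the real data.

```python
from datetime import datetime, timedelta


def split_source_target(data_start, data_end, target_period):
    """Return ((source_start, source_end), (target_start, target_end)).

    Hypothetical helper, not part of BTGym: it only mirrors the rule that
    the target domain is the last `target_period` of the data and that
    source data always precedes target data.
    """
    period = timedelta(**target_period)
    if period <= timedelta(0):
        # Zero duration: no separation, the whole range is source data.
        return (data_start, data_end), None
    boundary = data_end - period
    if boundary <= data_start:
        raise ValueError("target_period is longer than the dataset")
    return (data_start, boundary), (boundary, data_end)


# Assumed calendar span of a one-year file; the real span comes from the CSV.
source, target = split_source_target(
    datetime(2017, 1, 1),
    datetime(2018, 1, 1),
    {'days': 50, 'hours': 0, 'minutes': 0},
)
print('source:', source[0].date(), '->', source[1].date())
print('target:', target[0].date(), '->', target[1].date())
```

With these assumed dates the target domain starts on 2017-11-12, leaving 315 calendar days of source data before any gaps are excluded.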
I also changed trainer_config to disable learning and replay, as follows:
trainer_config = dict(
    class_ref=Unreal,
    kwargs=dict(
        opt_learn_rate=0,
        # opt_learn_rate=[1e-4, 1e-4],  # random log-uniform
        # opt_end_learn_rate=1e-5,
        # opt_decay_steps=50*10**6,
        model_gamma=0.99,
        model_gae_lambda=1.0,
        model_beta=0.05,  # entropy reg
        rollout_length=20,
        time_flat=True,
        use_value_replay=False,
        model_summary_freq=10,
        episode_summary_freq=1,
        env_render_freq=2,
    )
)
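The commented-out line above tags the two-element learning rate as "random log-uniform". As a rough illustration of what log-uniform sampling means, here is a small standalone sketch. It assumes the list is read as [low, high] bounds, which the thread does not confirm, and sample_log_uniform is a hypothetical helper, not a BTGym function.

```python
import math
import random


def sample_log_uniform(low, high, rng=None):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    if not 0 < low <= high:
        raise ValueError("need 0 < low <= high")
    rng = rng or random.Random()
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# A wide range spreads samples evenly across orders of magnitude.
wide = [sample_log_uniform(1e-5, 1e-3, rng) for _ in range(5)]
# Equal bounds, as in [1e-4, 1e-4], collapse to a single fixed rate.
fixed = sample_log_uniform(1e-4, 1e-4, rng)
print(wide)
print(fixed)
```

Under that assumption, [1e-4, 1e-4] is just a fixed learning rate of 1e-4, since both bounds coincide.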
When I run the script after these modifications, it seems that the launcher never finishes working; it just keeps repeating the learning cycle.
Again, what I'm trying to achieve is this: train the example model using the unreal example (done), then run the trained model over the last x period of the data (or even the whole dataset) and see how well it performs using Backtrader plotting, which is very intuitive to read and gives trading insights into the model's behavior. In other words, call Backtrader's Cerebro.plot() after the launcher has finished one pass over the test data.
OK I think I got how it works.
Once I found the BTgymPlotter class, I followed it through the code.
The launcher can run for as many epochs as it wants, and it simply generates new Backtrader-style summaries in TensorBoard (under Images) for each epoch.
I really enjoy this project, @Kismuz; it's very well designed :)
@JacobHanouna,
Train/test routine:
setting episode_train_test_cycle to (0, 1) results in 'test, don't train' behaviour; if you want to get an entire data range backtest as a single episode, set the episode duration to match the entire dataset test range, and also set time_gap ~ episode duration;
I would avoid BTgymCasualDataDomain, because it requires explicitly setting the inner global_time variable and messing around with the trials|episodes structure, which is overkill if you are not trying to implement some meta-learning algorithm.
Environment rendering:
    model_summary_freq=10,
    episode_summary_freq=1,
    env_render_freq=2,
rendering frequency is set by the env_render_freq kwarg; episode_train_test_cycle=[0,1]
setting
    render_size_episode=(12,16),
    render_dpi=75,
results in a bit bigger picture.
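Returning to the train/test routine in the reply above, the 'test, don't train' setting can be pictured with a small conceptual sketch. It assumes the pair is read as (number of train episodes, number of test episodes) repeated in a loop, which matches the (0, 1) behaviour described but is not taken from BTGym's code; episode_modes is a hypothetical helper.

```python
from itertools import cycle, islice


def episode_modes(train_test_cycle):
    """Return an endless iterator of 'train' / 'test' labels.

    Conceptual illustration only (not BTGym's implementation): a cycle of
    (n_train, n_test) is read as "run n_train training episodes, then
    n_test test episodes, and repeat".
    """
    n_train, n_test = train_test_cycle
    if n_train + n_test == 0:
        raise ValueError("cycle must contain at least one episode")
    pattern = ['train'] * n_train + ['test'] * n_test
    return cycle(pattern)


# (0, 1): every episode is a test episode, so no training takes place.
test_only = list(islice(episode_modes((0, 1)), 4))
# (2, 1): two training episodes followed by one test episode, repeated.
mixed = list(islice(episode_modes((2, 1)), 6))
print(test_only)
print(mixed)
```

Read this way, (0, 1) never schedules a training episode, which is why it suits a pure backtest over the test range.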
I am following the examples BTGym has, and I trained a simple model using the unreal_stacked_lstm example on EUR-USD data.
Now, for the purpose of learning, let's say the model looks good and I want to use it on new data and see how it interacts with the environment. How do I do that? That is, how do I load the model and ask for an action prediction?