- 17.02.18: First results on applying the guided policy search (GPS) idea to the btgym setup: https://github.com/Kismuz/btgym/blob/master/examples/guided_a3c.ipynb
Documentation on GPS API: https://kismuz.github.io/btgym/btgym.research.gps.html
- TensorBoard summaries now include additional renderings: action distribution, value function and LSTM_state; these are presented in the same notebook.
20.07.2018: major update to package:
enhancements to agent architecture:
strategy: new convention for naming get_state methods, see the BaseStrategy class for details;
trading with multiple data feeds and assets implemented in two flavors:
continuous action space via PortfolioEnv, which is closely related to the continuous portfolio optimisation problem setup;
description and docs:
examples:
beta and still needs some improvement, especially in the broker order execution logic and in the action sampling routine for continuous A3C (currently a Dirichlet process);
episode rendering modes are temporarily disabled;
11.12.2018: updates and fixes:
17.11.2018: updates and fixes:
18.01.2019: updates:
data model classes are under active development to power a model-based framework:
new data_feed iterator classes have been added to provide the training framework with synthetic data generated by the model mentioned above;
strategy_gen_6 data handling and pre-processing have been redesigned:
internal and external observation sub-spaces to be present and allows both to be one-level nested sub-spaces themselves (previously true only for external); all declared sub-spaces are encoded by separate convolution encoders;
syncro_runner; by default it is enabled for test episodes;
All of the above results in about a 2x training speedup in terms of train iterations: stacked LSTM agents take ~20M steps to converge on a 6-month real data set (was: ~40M).
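
The observation layout described in the 18.01.2019 notes (internal and external sub-spaces, each optionally nested one level deep, with one encoder per declared sub-space) can be pictured with a small sketch. This is not btgym code: the function name `leaf_subspaces`, the dictionary-of-shapes representation, and the sub-space names and shapes (`broker`, `asset_1`, `asset_2`) are all hypothetical, chosen only to show which leaves would each need a separate encoder under that reading of the notes.

```python
from typing import Dict, List, Tuple, Union

Shape = Tuple[int, ...]
# A sub-space is either a plain shape or a one-level dict of named shapes.
SubSpace = Union[Shape, Dict[str, Shape]]


def leaf_subspaces(observation_space: Dict[str, SubSpace]) -> List[Tuple[str, Shape]]:
    """Hypothetical helper: list every leaf sub-space that would get its own encoder."""
    # Both top-level sub-spaces are assumed to be mandatory.
    for required in ("internal", "external"):
        if required not in observation_space:
            raise KeyError(f"missing required sub-space: {required}")

    leaves: List[Tuple[str, Shape]] = []
    for name in ("internal", "external"):
        space = observation_space[name]
        if isinstance(space, dict):
            # One level of nesting is allowed; anything deeper is rejected.
            for child_name, child in space.items():
                if isinstance(child, dict):
                    raise ValueError(
                        f"{name}/{child_name}: only one level of nesting is allowed"
                    )
                leaves.append((f"{name}/{child_name}", tuple(child)))
        else:
            leaves.append((name, tuple(space)))
    return leaves


# Illustrative shapes only: (time steps, 1, features).
example_space = {
    "internal": {"broker": (30, 1, 5)},
    "external": {"asset_1": (30, 1, 4), "asset_2": (30, 1, 4)},
}
encoder_inputs = leaf_subspaces(example_space)
```

Each `(path, shape)` pair in `encoder_inputs` corresponds to one declared sub-space, so an agent built along these lines would instantiate one convolution encoder per entry.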