News and updates: Slack channel added

Kismuz commented 6 years ago

vvv Scroll down for latest updates vvv

6.02.18: Pushed a common update to all A3C agent architectures:
- all dense layers are now Noisy-Net ones, see the "Noisy Networks for Exploration" paper by Fortunato et al.;
  - note that entropy regularization is still in place, with the coefficient lowered to ~0.01 to ensure proper exploration; update your custom entropy_beta settings, if you have any;
- policy output distribution is 'centered' using the layer normalisation technique;
- added per-channel weight scaling of the obs state before it is passed to tanh, see #35 and https://github.com/Kismuz/btgym/blob/master/examples/state_signal_scaling.ipynb

All of the above results in about a 2x training speedup in terms of train iterations: stacked LSTM agents now take ~20M steps to converge on a 6-month real data set (was: ~40M).
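
For readers unfamiliar with the three ingredients listed above, here is a minimal NumPy sketch of how they fit together. It is not the btgym implementation: the names `scale_and_squash`, `NoisyDense` and `layer_norm` are hypothetical, the noisy layer uses the factorised Gaussian noise variant from the Noisy-Net paper, and the initialisation constants (`sigma0=0.5`, uniform bounds) are illustrative assumptions.

```python
import numpy as np


def scale_and_squash(obs, channel_scale):
    """Multiply each channel by its own weight, then squash with tanh.

    obs: array of shape [time, channels]; channel_scale: array of shape [channels].
    """
    return np.tanh(obs * channel_scale)


def _shape_noise(x):
    # Noise-shaping function for factorised noise: sign(x) * sqrt(|x|).
    return np.sign(x) * np.sqrt(np.abs(x))


class NoisyDense:
    """Dense layer whose weights are perturbed by factorised Gaussian noise.

    Hypothetical illustration only; in a real agent w_mu, b_mu, w_sigma and
    b_sigma would all be trainable parameters.
    """

    def __init__(self, n_in, n_out, sigma0=0.5, rng=None):
        self.rng = np.random.default_rng() if rng is None else rng
        bound = 1.0 / np.sqrt(n_in)
        self.w_mu = self.rng.uniform(-bound, bound, (n_in, n_out))
        self.b_mu = self.rng.uniform(-bound, bound, n_out)
        self.w_sigma = np.full((n_in, n_out), sigma0 / np.sqrt(n_in))
        self.b_sigma = np.full(n_out, sigma0 / np.sqrt(n_in))

    def __call__(self, x, noisy=True):
        if not noisy:
            # Deterministic pass: use mean weights only.
            return x @ self.w_mu + self.b_mu
        n_in, n_out = self.w_mu.shape
        eps_in = _shape_noise(self.rng.standard_normal(n_in))
        eps_out = _shape_noise(self.rng.standard_normal(n_out))
        # Factorised noise: one vector per input, one per output unit.
        w = self.w_mu + self.w_sigma * np.outer(eps_in, eps_out)
        b = self.b_mu + self.b_sigma * eps_out
        return x @ w + b


def layer_norm(x, eps=1e-5):
    """Normalise the last axis to zero mean and unit variance (no learned gain/bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```

A forward pass in this sketch would be `layer_norm(NoisyDense(3, 4)(scale_and_squash(obs, scale)))`. Each noisy call draws fresh noise, so repeated calls on the same input give different outputs; that perturbation is the source of exploration that complements the reduced entropy term.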

Kismuz commented 6 years ago

- 17.02.18: First results on applying the guided policy search (GPS) idea to the btgym setup: https://github.com/Kismuz/btgym/blob/master/examples/guided_a3c.ipynb

Documentation on GPS API: https://kismuz.github.io/btgym/btgym.research.gps.html

- TensorBoard summaries now include additional renderings: action distribution, value function and LSTM_state; these are shown in the same notebook.

Kismuz commented 6 years ago

20.07.2018: major update to the package:
- enhancements to agent architecture:
  - causal convolution state encoder with attention for the LSTM agent;
  - dropout regularization added;
- strategy: new naming convention for get_state methods, see the BaseStrategy class for details;
- multiple datafeeds and multi-asset trading implemented in two flavors:
  - discrete action space via the MultiDiscreteEnv class;
  - continuous action space via PortfolioEnv, which is closely related to the continuous portfolio optimisation problem setup;
    - description and docs:
      - MultiDataFeed: https://kismuz.github.io/btgym/btgym.datafeed.html#btgym.datafeed.multi.BTgymMultiData
      - ActionSpace: https://kismuz.github.io/btgym/btgym.html#btgym.spaces.ActionDictSpace
      - MultiDiscreteEnv: https://kismuz.github.io/btgym/btgym.envs.html#btgym.envs.multidiscrete.MultiDiscreteEnv
      - PortfolioEnv: https://kismuz.github.io/btgym/btgym.envs.html#btgym.envs.portfolio.PortfolioEnv
    - examples:
      - MultiDiscreteEnv: https://github.com/Kismuz/btgym/blob/master/examples/multi_discrete_setup_intro.ipynb
      - PortfolioEnv: https://github.com/Kismuz/btgym/blob/master/examples/portfolio_setup_BETA.ipynb
  - Notes on multi-asset setup:
    - adding these features forced a substantial package redesign; expect bugs, some backward incompatibility, broken examples etc. Please report them;
    - current algorithms and agent architectures are OK with multiple data lines but seem not to cope well with the multi-asset setup. This is especially evident in the case of continuous actions, where agents completely fail to converge on train data;
    - current reward function design seems inappropriate and needs reshaping;
    - continuous space is in beta and still needs some improvement, esp. the broker order execution logic and the action sampling routine for continuous A3C (which is a Dirichlet process for now);
    - multi-discrete space is more consistent but severely limited in the number of portfolio assets (but not data lines) due to the exponential growth of action space cardinality; one option is to use as many data lines as desired while limiting the portfolio to 1 - 4 assets;
    - no Guided Policy available for multi-asset setup yet - in progress;
    - all rendering modes except episode are temporarily disabled;
    - the whole thing is shamelessly resource-hungry.
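
Two of the notes above can be made concrete with a short NumPy sketch. It rests on assumptions not stated in this thread: that each asset has four base actions (`hold`, `buy`, `sell`, `close`), and that "Dirichlet" sampling means drawing portfolio weights from a Dirichlet distribution. The function names are hypothetical and do not mirror the btgym API.

```python
import numpy as np

# Assumed per-asset action set; the real environment may differ.
BASE_ACTIONS = ("hold", "buy", "sell", "close")


def joint_action_count(n_assets, n_base_actions=len(BASE_ACTIONS)):
    """Number of joint actions when every asset picks one base action per step."""
    return n_base_actions ** n_assets


def sample_portfolio_weights(concentration, rng=None):
    """Draw one vector of non-negative weights summing to 1 from a Dirichlet.

    concentration: positive array, one entry per asset (plus cash, if modelled).
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.dirichlet(concentration)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "assets ->", joint_action_count(n), "joint actions")
    print(sample_portfolio_weights(np.ones(4)))
```

Under these assumptions the joint discrete space grows as 4^n, which is why keeping the portfolio to 1 - 4 assets is the practical range, while a Dirichlet sample always yields a valid allocation regardless of the number of assets.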