...Minimizing the mean square error on future experience. - Richard S. Sutton
Scalable, event-driven, RL-friendly backtesting library, built on top of Backtrader with an OpenAI Gym environment API.
Backtrader is an open-source algorithmic trading library:
GitHub: http://github.com/mementum/backtrader
Documentation and community:
http://www.backtrader.com/
OpenAI Gym is...,
well, everyone knows Gym:
GitHub: http://github.com/openai/gym
Documentation and community:
https://gym.openai.com/
The general purpose of this project is to provide a gym-integrated framework for running reinforcement learning experiments in [close to] real-world algorithmic trading environments.
DISCLAIMER:
The code presented here is research/development grade. It can be unstable, buggy, poorly performing and is subject to change.
Note that this package is neither an out-of-the-box moneymaker nor a provider of ready-to-converge RL solutions. Think of it as a framework for setting up experiments with complex, non-stationary, stochastic environments.
As a research project, BTGym in its current stage can hardly deliver an easy end-user experience, in the sense that setting up meaningful experiments will require some practical programming experience as well as general knowledge of reinforcement learning theory.
It is highly recommended to run BTGym in a designated virtual environment.
Clone or copy the btgym repository to local disk, cd into it and run pip install -e . to install the package and all dependencies:
git clone https://github.com/Kismuz/btgym.git
cd btgym
pip install -e .
To update to the latest version:
cd btgym
git pull
pip install --upgrade -e .
BTGym requires Matplotlib version 2.0.2; downgrade your installation if you have version 2.1:
pip install matplotlib==2.0.2
The lsof utility should be installed in your OS; this may not be the case by default for some Linux distributions, see: https://en.wikipedia.org/wiki/Lsof
Making a gym environment with all parameters set to defaults is as simple as:
from btgym import BTgymEnv
MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',)
Adding more controls may look like:
from gym import spaces
from btgym import BTgymEnv
MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
episode_duration={'days': 2, 'hours': 23, 'minutes': 55},
drawdown_call=50,
state_shape=dict(raw=spaces.Box(low=0,high=1,shape=(30,4))),
port=5555,
verbose=1,
)
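Once constructed, the environment follows the standard Gym interaction cycle (reset, step, close). Below is a minimal sketch of a random-policy episode loop; the helper works with any Gym-style environment, and the commented-out usage at the bottom assumes btgym is installed and the example data file exists.

```python
def run_random_episode(env, max_steps=1000):
    """Roll one episode with uniformly sampled actions on any Gym-style env.

    Returns the cumulative reward and the number of steps taken.
    """
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        action = env.action_space.sample()               # random policy
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps

# In practice (assuming btgym is installed and the data file is present):
# from btgym import BTgymEnv
# env = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv')
# print(run_random_episode(env))
# env.close()  # shuts down the backtesting server process
```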
Discrete actions setup: consider a setup with one riskless asset acting as broker account cash and K (by default, one) risky assets. For every risky asset there exists a track of historic price records referred to as a data-line. Apart from the asset data-lines there [optionally] exists a number of exogenous data-lines holding some information and statistics, e.g. economic indexes, encoded news, macroeconomic indicators, weather forecasts etc., which are considered relevant to decision-making. It is supposed for this setup that broker actions are market orders (buy, sell, close) and that short selling is permitted.
The problem is modelled as a discrete-time finite-horizon partially observable Markov decision process for equity/currency trading. The action space is discrete: (0: hold [do nothing], 1: buy, 2: sell, 3: close [position]). At every step the agent observes the last m time-embedded preprocessed values for every data-line included and emits actions according to some stochastic policy.
Continuous actions setup [BETA]: this setup closely relates to the continuous portfolio optimisation problem definition; it differs from the setup above in:
a continuous action vector a[i] in [0,1], 0 <= i <= K, SUM{a[i]} = 1, for the K risky assets added;
each action is a market target order to adjust the portfolio so that the i-th asset gets the share a[i]*100%; actions are passed as a dictionary: {cash_name: a[0], asset_name_1: a[1], ..., asset_name_K: a[K]}.
For RL it implies having a continuous action space as a K+1 dim vector.
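In the continuous setup the K+1 weights must be non-negative and sum to one. A minimal sketch of turning arbitrary non-negative scores into such an action dictionary (the asset and cash names here are hypothetical, not btgym's actual defaults):

```python
def make_portfolio_action(scores, asset_names, cash_name='usd'):
    """Normalise non-negative scores onto the K+1 simplex and build the
    {cash_name: a[0], asset_name_1: a[1], ...} action dictionary."""
    if len(scores) != len(asset_names) + 1:
        raise ValueError('need one score for cash plus one per risky asset')
    if any(s < 0 for s in scores):
        raise ValueError('scores must be non-negative')
    total = float(sum(scores))
    if total == 0.0:
        raise ValueError('at least one score must be positive')
    weights = [s / total for s in scores]    # now SUM{a[i]} = 1 holds
    return dict(zip([cash_name] + list(asset_names), weights))

# 50% stays in cash, 25% goes to each hypothetical risky asset:
action = make_portfolio_action([2.0, 1.0, 1.0], ['eurusd', 'gbpusd'])
```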
Notice: data shaping approach is under development, expect some changes. [7.01.18]
Note: put the %matplotlib inline magic before the btgym import. It's recommended to import backtrader and btgym first to ensure proper backend choice.
9.02.2019: updates.
10.01.2019: Docker container implementation, contributed by Signalprime (https://github.com/signalprime); see btgym/docker/README.md for details.
25.01.2019: updates: both internal and external observation sub-spaces are now expected to be present, and both are allowed to be one-level nested sub-spaces themselves (was only true for external); all declared sub-spaces get encoded by separate convolution encoders; episodes are run via a syncro_runner, enabled by default for test episodes; pd.dataframes can now be used as historic data source via the dataframe kwarg (was: .csv files only).
18.01.2019: updates.
11.12.2018: updates and fixes to btgym.research.
17.11.2018: updates and fixes.
30.10.2018: updates and fixes; see btgym/datafeed/synthetic/ou.py and btgym/research/ou_params_space_eval for details.
14.10.2018: update.
20.07.2018: major update to package:
enhancements to agent architecture;
base strategy update: new convention for naming get_state methods, see the BaseStrategy class for details;
multiple datafeeds and assets trading implemented in two flavors, including a continuous actions space via PortfolioEnv, which is closely related to the continuous portfolio optimisation problem setup; this is beta and still needs some improvement, especially for broker order execution logic as well as the action sampling routine for continuous A3C (which is a Dirichlet process by now);
episode rendering modes are temporarily disabled.
17.02.18: First results on applying guided policy search ideas (GPS) to the btgym setup can be seen here.
6.02.18: Common update to all a3c agent architectures: policy output distribution is 'centered' using the layer normalisation technique.
20.01.18: Project Wiki pages added;
12.01.18: Minor fixes to logging, enabled BTgymDataset train/test data split. AAC framework train/test cycle enabled via the episode_train_test_cycle kwarg.
7.01.18: Update: Domain -> Trial -> Episode sampling routine implemented. For motivation and formal definitions refer to Section 1 (Data) of this DRAFT, the API Documentation and the Intro example. Changes should be backward compatible. In brief, it is the necessary framework for upcoming meta-learning algorithms.
Logging is now done via the logbook module; this should eliminate errors under Windows.
5.12.17: Inner btgym communication fixes >> speedup ~5%.
02.12.17: Basic sliding time-window train/test framework implemented via the BTgymSequentialTrial() class. UPD: replaced by the BTgymSequentialDataDomain class.
29.11.17: Basic meta-learning RL^2 functionality implemented.
24.11.17: A3C/UNREAL finally adapted to work with BTGym environments; see /research/DevStartegy_4_6.
14.11.17: BaseAAC framework refactoring; added per-worker batch-training option and LSTM time_flatten option; Atari examples updated; see Documentation for details.
30.10.17: Major update, some backward incompatibility:
27.09.17: A3C test_4.2 added.
22.09.17: A3C test_4 added.
20.09.17: A3C optimised sine-wave test added here.
31.08.17: Basic implementation of the A3C algorithm is done and moved inside the BTgym package; see the examples/a3c directory.
23.08.17: The filename arg in environment/dataset specification can now be a list of csv files.
21.08.17: UPDATE: BTgym now uses a multi-modal observation space: DictSpace(gym.Space), a dictionary (not nested yet) of core gym spaces, defined in btgym/spaces.py. raw_state is the default Box space of OHLC prices. Subclass BTgymStrategy and override the get_state() method to compute all parts of the env. observation.
07.08.17: BTgym is now optimized for asynchronous operation with multiple environment instances; see async_btgym_workers.ipynb in the examples directory.
15.07.17: UPDATE, BACKWARD INCOMPATIBILITY: state observation can now be a tensor of any rank; the rendered channel is selected via render_agent_channel=0.
11.07.17: Rendering battle continues: improved stability while low on memory; added environment kwarg render_enabled=True; when set to False, rendering is disabled.
5.07.17: Tensorboard monitoring wrapper added; pyplot memory leak fixed.
30.06.17: EXAMPLES updated with 'Setting up: full throttle' how-to.
29.06.17: UPGRADE: be sure to run pip install --upgrade -e .
Major rendering rebuild: supported modes are human, agent, episode; the render process is now performed by the server and returned to the environment as an rgb numpy array. Pictures can be shown either via matplotlib or as pillow.Image (preferred).
Observation space: raw_state - price data, and state - featurized representation; a get_raw_state() method was added to the strategy.
New dependencies: matplotlib and pillow.
25.06.17: Basic rendering implemented.
23.06.17: alpha 0.0.4: added skip-frame feature, redefined parameters inheritance logic, refined overall stability.
17.06.17: first working alpha v0.0.2.