Kismuz / btgym

Scalable, event-driven, deep-learning-friendly backtesting library
https://kismuz.github.io/btgym/
GNU Lesser General Public License v3.0

Important: BTGym_2.0 announce and discussion #98

Closed Kismuz closed 4 years ago

Kismuz commented 5 years ago

Dear colleagues, as the author and maintainer of BTGym, which was designed specifically to integrate machine learning models with trading strategies, I see a clear absence of algorithmic trading pipelines with proper support for modern artificial intelligence methods. While the research field is actively expanding, most promising results remain at the "proof of concept" stage, and live trading is still dominated by conventional models and hand-engineered strategies.

To the best of my knowledge, BTGym is one of the very few open-source backtesting systems providing parallelised event-driven execution and an out-of-the-box integrated training framework for modern reinforcement learning algorithms.

Still, it is evident that the conventional approach of parallelising single-threaded backtests poses major limitations on model training speed. Aside from that, integrated workflows from experimentation to robust live serving of trained decision-making models are yet to be implemented.

With this in mind, I have sketched out a view of a software ecosystem to facilitate the diffusion of artificial intelligence methods into the algorithmic trading space. After almost two years of developing and maintaining the current BTGym project, with all the ups and downs of solo open-source GitHub survival, I would like to gather as much expertise as possible before stepping into the next big round.
With this message I kindly ask you to join the discussion and express your professional judgment, expectations and proposals to shape a clear view of the next generation of BTGym. You will find links to related resources below. Your input is valuable and highly appreciated.

Yours, Andrew Muzikin, BTGym author

BTGym_2.0 WhitePaper Draft

Short Note: “Why we need new backtest frameworks…”

mysl commented 5 years ago

@Kismuz Thank you so much for sharing this, and I have learnt a lot from btgym. Just my two cents about 2.0:

  1. W.r.t. the decision-making agents part, porting it to run on top of the ray/rllib framework might provide a long-term advantage: btgym could inherit high-quality, modular SOTA RL algorithm implementations, scalable training from local machines to the cloud, and hyperparameter-tuning framework support. And I guess it is still flexible enough to extend for the needs of continuous research.
  2. W.r.t. the backtesting framework / live execution, I feel btgym should decouple TradeTask and TradeEnv. TradeEnv defines an interface to a low-level backtest/live trading environment, and TradeTask is the higher-level logic built on top of the TradeEnv interface, agnostic to the low-level implementation. In this way backtrader is decoupled, and plugging in a more efficient backtesting framework (e.g. one supporting order-book-level backtests, or parallel execution) becomes possible.
  3. Adding tools for fine-grained inspection, so that we could better understand and interpret the agent's state and behaviour (maybe in real time), e.g. a human could watch the market and judge the agent's decisions, much like a human can watch and inspect an agent playing an Atari game.

Kismuz commented 5 years ago

@mysl, thank you for the response,

  1. yes indeed, an integrated framework is a must for proper model development as well as for seamless future production usage (live trading), and is one driving reason for 2.0.

I'm currently considering two major alternatives: TensorFlow Extended and Ray/RLlib.

JaCoderX commented 5 years ago

@Kismuz I'm joining @mysl in support of this project. I think the work you are doing is absolutely phenomenal and very valuable for both the AI and economics communities. The white paper is a great road map of what has already been accomplished and the future possibilities this project can extend into.

I feel btgym should decouple TradeTask and TradeEnv.

The field of AI is advancing at an outstanding rate; new ideas and techniques are emerging fast. I share @mysl's thoughts, and moreover, to keep this awesome project a milestone in the field, modularity is very important. As more advances are made in the field, integration with existing libraries will become more important. This way the focus can remain on experimentation and less on implementation.

A bit off topic, but relevant to arbitrage in crypto: as for live trading in the crypto space, CCXT is the leading open project for unifying crypto exchanges. Some initial work has been done to create a bridge between Backtrader and CCXT. I extended it a bit to allow working with the multi-broker setup needed for arbitrage trading: https://community.backtrader.com/topic/1165/does-backtrader-support-multiple-brokers/13 Maybe it can be used for later live support.

ALevitskyy commented 5 years ago

Thanks for all the work you put into making such an awesome library; it introduced me to deep RL. The idea of using GANs or Hidden Markov Models as the model-based part sounds very interesting and promising, and may be useful in other domains such as financial risk management, which still relies strongly on outdated generative methodologies (like the assumption of Geometric Brownian motion, or a Gaussian copula on multivariate data).
The two things I would really want to see in the second generation:

mysl commented 5 years ago

@Kismuz another two comments:

  1. I find that the white paper doesn't mention hierarchical RL. IMHO, this approach seems a natural fit for the trading domain. At least one application is jointly optimising high-level trading decisions and low-level execution. And it might be useful for integrating information from different timescales.
  2. Since risk is such a central theme in finance, risk-sensitive/aware RL seems another direction that could provide extra benefit. I don't have much expertise in this area though; I'd appreciate your comments.

Kismuz commented 5 years ago

Based on an analysis of the issues, these are the top sources of confusion with the present version:

  1. Market data input pipeline:
    • requests for different input formats, specifications and parsing presets, and different data modalities (e.g. news). Should definitely be improved and unified with the upcoming live trading support.
    • no clear abstract separation has been made between incoming data as the environment's driving force (asset data used by the broker to estimate orders, account values etc.) and [partially the same] data as the source of environment observations (the information stream).
    • the observation data preprocessing module is not shaped. One can make all the preprocessing lifting part of the computational tf graph. On the other hand, prototyping preprocessing and featurisation inside the strategy is far more flexible, but somewhat unstructured. Some Feature Store module should be explicitly instantiated.
    • the same is true for reward function estimation, which is in some sense the true "feature engineering of RL".
  2. Clear specifications and action space presets. While any type of order (market orders, stop losses, synthetic asset orders like 'spread') can be straightforwardly implemented with current BTGym and backtrader functionality, it will either instantly move us to a high-dimensional or continuous action space, or just create structured confusion. Pre-defined abstraction containers like an 'Action Store' should be implemented.
  3. Live trading API requests. My hesitation to add simple live trading functionality is rooted in overall experience with deep learning models: their reproducibility and stability. Adding a live API is not enough; it is essential to provide a solution for what is referred to as 'model serving'. This block is entirely missing for now.
  4. Hyperparameter search issues. Simply missing. Required.
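For illustration, a pre-defined 'Action Store' preset from point 2 might look roughly like this (a sketch; all names are hypothetical, not current BTGym API):

```python
# Hypothetical sketch of an "Action Store": named presets that map a small
# discrete agent-facing action space onto richer order types, so adding
# e.g. stop losses does not force a continuous action space.
# All names are illustrative, not current BTGym API.


class ActionPreset:
    """Maps integer actions to broker order specifications."""

    def __init__(self, name, actions):
        self.name = name
        self.actions = list(actions)  # index -> order spec

    @property
    def n(self):
        """Size of the discrete agent-facing action space."""
        return len(self.actions)

    def decode(self, action_id):
        """Translate the agent's discrete choice into an order spec."""
        return self.actions[action_id]


# A basic market-order preset, as in current discrete setups:
BASIC = ActionPreset('basic', [
    {'op': 'hold'},
    {'op': 'buy',  'type': 'market'},
    {'op': 'sell', 'type': 'market'},
    {'op': 'close'},
])

# A preset bundling a protective stop with each entry, keeping the
# action space discrete instead of going fully continuous:
STOP_LOSS = ActionPreset('stop_loss', [
    {'op': 'hold'},
    {'op': 'buy',  'type': 'market', 'stop_loss_pct': 0.02},
    {'op': 'sell', 'type': 'market', 'stop_loss_pct': 0.02},
    {'op': 'close'},
])
```

The point of the container is that the strategy only sees `preset.decode(action_id)`, so swapping presets does not touch agent code.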

Kismuz commented 5 years ago

@Kismuz another two comments:

  1. I find that the white paper doesn't mention hierarchical RL. IMHO, this approach seems a natural fit for the trading domain. At least one application is jointly optimising high-level trading decisions and low-level execution. And it might be useful for integrating information from different timescales.
  2. Since risk is such a central theme in finance, risk-sensitive/aware RL seems another direction that could provide extra benefit. I don't have much expertise in this area though; I'd appreciate your comments.

@mysl, thanks, will include

Kismuz commented 5 years ago

TradeTask and TradeEnv

Do I understand correctly that we are talking about an intermediate abstraction layer to wrap any backtesting or live trading process as a reinforcement learning task?

mysl commented 5 years ago

Do I understand correctly that we are talking about an intermediate abstraction layer to wrap any backtesting or live trading process as a reinforcement learning task?

Sort of. I brought this up mainly from a software design perspective, which aims at separation of responsibility, reuse and extensibility.

W.r.t. TradeEnv, I was talking about abstracting out a minimal API interface that wraps the functionality needed from external dependencies (live or backtest libraries), i.e. a data and broker provider. It could be viewed as defining a spec.

W.r.t. TradeTask, it implements the interface needed for an RL environment, which interacts with the RL agent, provides observations and reward feedback, does observation preprocessing, executes the task etc. It calls the TradeEnv interface internally and should be agnostic to the TradeEnv implementation.

With this decoupling, we only need to implement a TradeEnvBt to plug backtrader into the pipeline, and we can plug in other backtest engines by implementing the TradeEnv interface.
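A minimal sketch of this decoupling (all class and method names are hypothetical, not existing btgym API):

```python
# Rough sketch of the proposed decoupling (names are hypothetical).
# TradeEnv is the minimal spec any backtest/live backend must satisfy;
# TradeTask is the RL-facing layer, agnostic to the backend.
from abc import ABC, abstractmethod


class TradeEnv(ABC):
    """Minimal interface over a data + broker provider (backtest or live)."""

    @abstractmethod
    def reset(self):
        """Start a new episode; return raw market/account state."""

    @abstractmethod
    def submit_order(self, order):
        """Forward an order to the underlying broker."""

    @abstractmethod
    def step(self):
        """Advance one tick; return raw state and fill/account info."""


class TradeTask:
    """RL environment built on top of any TradeEnv implementation."""

    def __init__(self, trade_env, featurize, reward_fn):
        self.env = trade_env          # e.g. a TradeEnvBt wrapping backtrader
        self.featurize = featurize    # raw state -> observation
        self.reward_fn = reward_fn    # account info -> scalar reward

    def reset(self):
        return self.featurize(self.env.reset())

    def step(self, action):
        self.env.submit_order(action)
        raw, info = self.env.step()
        return self.featurize(raw), self.reward_fn(info), info
```

With this layering, supporting a new backend is a matter of subclassing `TradeEnv`; the agent-facing `TradeTask` is untouched.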

Kismuz commented 5 years ago

With this decoupling, we only need to implement a TradeEnvBt to plug backtrader into the pipeline, and we can plug in other backtest engines by implementing the TradeEnv interface.

that's exactly what I'm thinking of

Kismuz commented 5 years ago

[diagram: current btgym structure]

This is how distributed training is currently organised. One of my goals is to modify it to fit a Ray setup natively. Everything inside the blue area fits almost seamlessly. Possibly removing the DataServer process (using the Ray Plasma object store instead?) and encapsulating the BTGym server functionality inside the worker processes. Or fully redefining BTgymServer as an abstract environment-driver process and letting Ray manage it. Any thoughts on this?
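To make the DataServer-removal idea concrete, here is a minimal plain-Python sketch of workers sampling episodes locally from a shared read-only dataset instead of querying a dedicated data server process; in a Ray setup the dataset would live in the object store (`ray.put`/`ray.get`). All names are illustrative.

```python
# Sketch of dropping the DataServer: each worker holds a handle to a
# read-only dataset and samples episode slices locally. In Ray, the
# dataset would be shared via the object store; plain Python stands in
# here. All names are illustrative.
import random


class EpisodicDataset:
    """Read-only time series; workers sample episode slices from it."""

    def __init__(self, series, episode_len):
        self.series = series
        self.episode_len = episode_len

    def sample_episode(self, rng):
        start = rng.randrange(len(self.series) - self.episode_len)
        return self.series[start:start + self.episode_len]


class Worker:
    """Encapsulates the former BTGym-server role: runs episodes locally."""

    def __init__(self, dataset, seed):
        self.dataset = dataset
        self.rng = random.Random(seed)  # per-worker seed for reproducibility

    def run_episode(self):
        episode = self.dataset.sample_episode(self.rng)
        # ... feed `episode` to the backtest engine / environment here ...
        return len(episode)
```

Since the dataset is read-only, no locking or server round-trips are needed; each worker only carries its own RNG state.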

mysl commented 5 years ago

my 2 cents:

  1. The actor (bt.strategy in the graph) and policy (tf.policy_local) should be coupled with a more generic interface instead of the gym API. Logically, it's just a simple Agent interface with two methods: a) the actor queries an action from the policy; b) the actor reports experience to the policy trainer. Besides, the interaction is more naturally driven by the actor/env, especially in the trading task, which could be viewed as push-based, while the gym API is pull-based. If we want to run a gym task, then the gym API could easily be adapted to this design. I think tensorforce uses this approach.

    Another benefit of this design is that once the actor and policy/trainer communicate through this interface, it can easily be configured as in-process or inter-process.

  2. Agree on removing the data server, since data access and sampling can easily be integrated into the actor.
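A minimal sketch of the push-based Agent interface from point 1 (a sketch under assumed names, not an existing API):

```python
# Sketch of the push-based Agent interface from point 1 (names are
# illustrative). The actor/env drives the interaction: it queries the
# policy for actions and pushes experience back to the trainer, instead
# of the trainer pulling steps gym-style.
from abc import ABC, abstractmethod


class Agent(ABC):
    @abstractmethod
    def act(self, observation):
        """Actor queries an action from the policy."""

    @abstractmethod
    def observe(self, transition):
        """Actor reports (obs, action, reward, next_obs) to the trainer."""


class BufferingAgent(Agent):
    """Trivial implementation, just to show the control flow."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.buffer = []  # a real trainer would consume this queue

    def act(self, observation):
        # stand-in policy: deterministic function of the observation
        return hash(str(observation)) % self.n_actions

    def observe(self, transition):
        self.buffer.append(transition)


def run_episode(rewards, agent):
    """Env-driven loop: the environment pushes experience to the agent."""
    obs = 0
    for reward in rewards:
        action = agent.act(obs)
        next_obs = obs + 1  # toy environment dynamics
        agent.observe((obs, action, reward, next_obs))
        obs = next_obs
    return len(agent.buffer)
```

Because both methods are plain calls, the same interface can be backed by an in-process object or an inter-process proxy without changing the actor loop.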

Kismuz commented 5 years ago

More detailed view of the current implementation of a worker process: [diagram: worker_process]

MorleyMinde commented 5 years ago

Just to get some clarification: is there a plan to move btgym to using a continuous action space, to accommodate stop losses and other dimensions?

I would really like to use these other dimensions and I can't seem to find this anywhere. It could even be provided as a configuration where someone can choose between the discrete action spaces and the continuous ones.

mysl commented 5 years ago

Google released its TF-Agents library for TF 2.0 recently; I personally think it's very well designed. It might still be at an early stage and lacking some features, but it has them on its roadmap, and since it's officially announced, it should get long-term support from the big guy.

https://github.com/tensorflow/agents https://www.youtube.com/watch?v=-TTziY7EmUA

ALevitskyy commented 5 years ago

I also heard on ods.ai about this library, https://github.com/catalyst-team/catalyst, which won 3rd place in a NeurIPS competition (https://arxiv.org/abs/1903.00027). It implements SAC, TD3 and DDPG, and allows training all three algorithms at the same time on the same reward experience.

Also, for newbies like me who are just starting with deep RL, I recently found this project by OpenAI quite fascinating: https://spinningup.openai.com/en/latest/, and am slowly working through the key papers. I liked the Rainbow and TD3 papers (hard algorithms with lots of tricks, but they seem to outperform all older algorithms on discrete and continuous actions respectively), and particularly this one: https://arxiv.org/abs/1809.01999. I think they should make a good repertoire in case I ever manage to get to a serious level.

The last paper is very interesting and may work quite well with the problem btgym is dealing with. It achieved SOTA results on car-driving and Doom environments, beating other algorithms by a good margin. In short: it uses a random agent to collect rollouts, then uses a variational autoencoder (VAE) to encode images into a low-dimensional space, and then trains an LSTM on the latent representation. None of these steps requires any rewards or interaction with the environment once the data is collected. After that, it takes the hidden layer of the LSTM and the latent features, stacks them, and trains a linear agent (almost like logistic regression) using an evolutionary algorithm, which only requires seeing the final reward. In this way very complex relationships between actions and reward may be discovered, easing the problem of attributing rewards to actions, which I think is quite acute in algorithmic trading. It also provides a good generative model (if you ignore the last evolutionary layer), which by itself would be a useful thing in financial research.

It may be an interesting algorithm to try out (replacing Conv2D with Conv1D in the original architecture), and it does not require any modification to btgym (in particular, GPU support is not that hard to implement for the VAE and LSTM), as most of the stages do not require interaction with the environment during training, while the evolutionary algorithm trains fast on CPU.
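For illustration, the final evolution-trained linear-controller stage can be sketched in plain Python with a toy episodic reward and a simple (1+1) evolution strategy; this is a stand-in, not the paper's code, and the fixed `features` vector here replaces the VAE+LSTM latents.

```python
# Toy sketch of the last stage described above: a linear controller
# trained with a (1+1) evolution strategy, needing only the final
# episode reward (no gradients, no per-step reward attribution).
# The "features" stand in for the VAE + LSTM latents of the paper.
import random


def episode_reward(weights, features):
    """Toy episodic reward: peaks (at 0) when controller output hits 1.0."""
    out = sum(w * f for w, f in zip(weights, features))
    return -(out - 1.0) ** 2


def evolve(features, steps=500, sigma=0.1, seed=0):
    """(1+1)-ES: mutate the current best, keep the mutant if it scores better."""
    rng = random.Random(seed)
    best = [0.0] * len(features)
    best_r = episode_reward(best, features)
    for _ in range(steps):
        cand = [w + rng.gauss(0.0, sigma) for w in best]
        r = episode_reward(cand, features)
        if r > best_r:
            best, best_r = cand, r
    return best, best_r
```

Because only the scalar episode reward is consumed, the same loop works for any black-box controller, which is what makes the approach attractive when credit assignment is hard.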

Kismuz commented 5 years ago

@MorleyMinde,

Just to get some clarification: is there a plan to move btgym to using a continuous action space, to accommodate stop losses and other dimensions?

I would really like to use these other dimensions and I can't seem to find this anywhere. It could even be provided as a configuration where someone can choose between the discrete action spaces and the continuous ones.

Yes, enabling limit orders as a preconfigured action space is one of the essential enhancements. See also: https://github.com/Kismuz/btgym/blob/master/examples/portfolio_setup_BETA.ipynb
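As a sketch of what such a configuration switch could look like (hypothetical names; a real implementation would return gym `Discrete`/`Box` spaces rather than dicts):

```python
# Hypothetical sketch of a configurable action-space choice, so a user
# can pick discrete or continuous actions per setup. A real version
# would build gym.spaces objects; plain dicts stand in here.


def make_action_space(kind, n_assets=1):
    if kind == 'discrete':
        # classic preset: hold / buy / sell / close per asset
        return {'type': 'discrete', 'n': 4 * n_assets}
    if kind == 'continuous':
        # e.g. target position in [-1, 1] plus a stop-loss distance
        # in [0, 0.1] for each asset
        return {'type': 'box',
                'low':  [-1.0, 0.0] * n_assets,
                'high': [1.0, 0.1] * n_assets}
    raise ValueError('kind must be "discrete" or "continuous"')
```

The environment would read this from its config, so switching between the two modes never touches strategy code.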

tagomatech commented 5 years ago

Hello. Thank you for BTGym.

Best

Kismuz commented 5 years ago

@tagomatech,

BTGym 2.0 is? will be? a brand new project, or some refactoring of..

it will be a new programmatic implementation of the same idea of casting trading tasks as MDP environments;

Compatibility between the 2 environment?

I don't think it can be fully backward compatible with btgym1; yes at the level of the API;

use tf-agents as the new engine to BTGym 2.0?

the focus of version 2.0 will shift towards the specification of the environment itself, not towards developing agents for it; proper support for and connection with existing agent and framework implementations like rllib, dopamine etc. is also a priority.

developeralgo8888 commented 5 years ago

Will BTGym 2.0 correspond to TensorFlow 2.0? The stable 2.0 release should be out soon. Also, I have been doing some AI research on my own, and have looked at your work in detail; I think, if you continue on the same path, then:

  1. Use BTGym + Stable-Baselines (Baselines) + Horovod (easier to integrate with TF), or Ray with a TensorFlow backend.

  2. If you are going to extend it to cover other backends like MXNet plus TF, then a good example is Intel (Nervana) RL Coach. The layout and implementation are very solid, but I am not sure how easy it would be to integrate into BTGym. It has all the latest state-of-the-art agents and algorithms, so you won't have to rewrite most of them.

NOTE: The beauty of Ray is that it comes with libraries that accelerate deep learning and reinforcement learning development:

  • Tune: a hyperparameter optimisation framework
  • RLlib: scalable reinforcement learning and distributed training

But I don't have much experience using RLlib, and I don't know whether it has all the latest state-of-the-art RL algorithms and agents, since RL research is moving at an extremely fast pace.

NOTE: Reading your BTGym 2.0 proposal, I see that MOST of the algorithms, agents and networks/architectures are already implemented in either Stable-Baselines, Baselines, Ray (RLlib), Intel Coach, Dopamine, TF-Agents, Garage (formerly OpenAI rllab) or the old DeepMind TRFL. All of these have TensorFlow 1.1x as the backend. There is going to be a major design shift with TF 2.0, so we should work with our eyes towards that goal; we don't want to duplicate the work.


Kismuz commented 5 years ago

@developeralgo8888, the deeper I get into the current industry and production state of the algorithmic trading domain, the more evident it becomes to me that the focus should shift toward a clear, modular and abstract specification of trading tasks as environments, and correct processing of the underlying streams of financial data. While everyone is building algorithms and agents, correct (from an ML point of view) handling of financial data, especially high-dimensional Level II data, is still an issue that IMHO really blocks correct implementation of state-of-the-art ML and RL algorithms.

developeralgo8888 commented 5 years ago

Hey Kismuz, I agree with you. It has to be modular. Ray, and perhaps Intel Coach, are very modular libraries that could be used. What is needed are the financial data ingestion and processing modules for Level 1 and Level 2 stock data, forex and Bitcoin, plus live trading modules that work. It has to be a complete end-to-end RL trading pipeline. Most of the other state-of-the-art modules or pieces, like agents, algorithms and architectures, are already available in various libraries and GitHub repos as a starting point.

developeralgo8888 commented 5 years ago

[diagram: current btgym structure]

This is how distributed training is currently organised. One of my goals is to modify it to fit a Ray setup natively. Everything inside the blue area fits almost seamlessly. Possibly removing the DataServer process (using the Ray Plasma object store instead?) and encapsulating the BTGym server functionality inside the worker processes. Or fully redefining BTgymServer as an abstract environment-driver process and letting Ray manage it. Any thoughts on this?

IMHO, Ray is an excellent RL library. If you can natively and seamlessly let Ray handle most of the nuts and bolts that it does so well, then we can focus on live data-stream ingestion; processing of live, historical and synthetic data; plus live trading modules that actually work.

developeralgo8888 commented 5 years ago

Hey Kismuz, have you already written code for BTGym 2.0 that you can publish? I am thinking of making a modular, improved version of BTGym, but completely based on TF, Ray and Modin, with a few sprinkles of stable-baselines code, following the modular architecture patterns of Ray and Intel Coach.

Thanks for great work

Ray-0403 commented 5 years ago

waiting for the release of this great project🍾

Kismuz commented 5 years ago

@developeralgo8888, @Ray-0403 and everybody: first, I want to apologize for the absence of proper BTGym issue support for the past three months. Currently I'm involved in full-time ML research in the algo-trading domain. This leaves virtually no time for timely project support and development. On the other hand, thanks to my current job, my vision of a future RL trading framework has undergone a substantial change toward a more realistic production-level system. As it turned out, when the financial domain is considered, the transfer to the production phase (live trading) is quite complex (to say the least), even for relatively simple models (like classifiers). It involves proper historical and live data handling and feature extraction, continuous model retraining and deployment control, etc. So no, there is no new code, since I only recently started to figure out principled approaches to handling such complicated dataflows.

kivo360 commented 5 years ago

Does this mean you'll stop completely? It was gaining steam.

JaCoderX commented 5 years ago

@Kismuz, Is there a way to help out/contribute on the new project?

Kismuz commented 4 years ago

Closed due to long inactivity period.