
arl22-rl-stock-trading

System Requirements

This project requires the Windows Operating System.

Installation

Step 1: Install Poetry

This project uses poetry as its package manager. Install it, for example, with pip:

pip install poetry

Step 2: Install dependencies with poetry

The install command reads the pyproject.toml file from the current project, resolves the dependencies, and installs them.

poetry install

Optionally, build the project:

poetry build

Step 3: Getting the data

There are two ways to get the data for training and testing the agent. The first method is very simple and fast, but it only allows one to train and test on the AAPL dataset. The second method requires more setup to fetch the data yourself, but it allows you to use different kinds of data and symbols.

1. Example data

The simplest way to test the project is to download the example data from Google Drive:

The important files here are train.csv and test.csv; the original dataset and the scalers are not needed to start an experiment. Move test.csv and train.csv into /experiments/data/minmax or an arbitrary folder, but then the train and test paths in the .yaml files have to be changed accordingly, as shown in the sketch below.
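
As a minimal sketch of what that change looks like (assuming the files were moved to a hypothetical folder ./experiments/data/custom), only the data paths in the experiment's .yaml need to be adjusted:

gym_environment:
  data:
    train_path: ./experiments/data/custom/train.csv
    test_path: ./experiments/data/custom/test.csv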

2. Fetch data yourself

  1. Install and run MetaTrader5:

    • Install MetaTrader5 from here. In order to fetch data, please make sure your MetaTrader5 terminal is running and has a registered and activated account with your broker. If everything worked out, clicking on the top-right user icon should show your user logged in like this:

  2. Log in or register at Admiral Markets

    • The package was tested with an Admiral Markets Investment Demo Account (sign up with Admirals, then go to the Dashboard and use ADD ACCOUNT for the Invest option).

  3. Register for Finnhub and generate an API key

    • You will also need an account for the Finnhub API. Add the Finnhub API key to the .env file at /experiments/.env (see the sketch after this list).
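
If you want to sanity-check the key outside of the project, a minimal sketch using the finnhub and python-dotenv packages could look like this (the variable name FINNHUB_API_KEY is an assumption; use whatever name experiments/fetch_data.py actually expects):

import os
import finnhub                      # official Finnhub client; may need to be installed separately
from dotenv import load_dotenv      # python-dotenv; may need to be installed separately

load_dotenv("./experiments/.env")           # read key-value pairs from the .env file
api_key = os.environ["FINNHUB_API_KEY"]     # hypothetical variable name

client = finnhub.Client(api_key=api_key)
print(client.quote("AAPL"))                 # a simple request to confirm the key works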

After all those steps are done, go to the root directory of the project and execute the script:

poetry run python ./experiments/fetch_data.py

This script will generate two files called train.csv and test.csv, whose paths then have to be specified in the .yaml config files.
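
As a quick sanity check before wiring the files into a config, you can verify that they contain the attribute columns the configs below expect (a sketch assuming pandas is available; the paths are placeholders, adjust them to wherever the script wrote the files):

import pandas as pd

expected = {"time", "open", "close", "low", "high"}
for path in ("./train.csv", "./test.csv"):   # hypothetical output locations
    df = pd.read_csv(path)
    missing = expected - set(df.columns)
    print(f"{path}: {len(df)} rows, missing columns: {missing or 'none'}")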

Usage

Start an Experiment

To start an experiment, choose a configuration from /experiments/config and specify the .yaml file with the --conf parameter when calling /experiments/main.py, like this:

poetry run python ./experiments/main.py --conf <path to config>

For example, to choose the experiment that only uses OHLC data in /experiments/config/ohlc, use:

poetry run python ./experiments/main.py --conf ./experiments/config/ohlc/dqn/ex1.yaml

There are two major modes for running the project. The first is training mode, in which the project trains a new model according to the given specification. The second is evaluation mode, in which an existing model is applied to a new test dataset.

Training Experiment

mode: train # (train / eval)
logger:
  level: 20 # CRITICAL = 50, ERROR = 40, WARNING = 30, INFO = 20, DEBUG = 10, NOTSET = 0
  format: "%(asctime)s - %(levelname)s - %(module)s - %(message)s"
experiment_path: ./results
test_policy: true # whether to test / apply the policy after training or not

# define gym environment that will be injected to the agent
gym_environment:
  enable_render: false # whether to render the policy application or not
  window_size: 30
  scale_reward: 1
  data:
    train_path: ./experiments/data/minmax/train.csv
    test_path: ./experiments/data/minmax/test.csv
    attributes: [ "time", "open", "close", "low", "high" ]

# define the rl agent (using the above gym environment)
agent:
  episodes: 100
  log_interval: 5
  sb_logger: ["stdout", "csv", "tensorboard"] # format options are: "stdout", "csv", "log", "tensorboard", "json"

  # define model with its specific parameters
  model:
    name: DQN # compare https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html
    pretrained_path: # empty if start from scratch TODO implement
    policy: MlpPolicy
    device: cuda # (cuda / cpu / auto)
    verbose: 1 # 0 none, 1 training information, 2 debug
    learning_rate: 0.0001
    gamma: 0.99
    seed: null
    buffer_size: 1000000
    learning_starts: 50000
    batch_size: 32
    tau: 1.0
    train_freq: 4
    gradient_steps: 1
    exploration_fraction: 0.1
    exploration_initial_eps: 1.0
    exploration_final_eps: 0.05
    predict_deterministic: true # only for testing
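
For orientation, the model parameters above map directly onto the constructor arguments of stable_baselines3.DQN (the config comment already points to its documentation). A minimal sketch of that mapping, not the project's actual training loop, assuming the classic Gym API used by stable-baselines3 1.x:

import gym
from stable_baselines3 import DQN

# Placeholder environment so the sketch runs on its own; the project injects
# its own trading gym environment built from the `gym_environment` section.
env = gym.make("CartPole-v1")

model = DQN(
    policy="MlpPolicy",
    env=env,
    learning_rate=0.0001,
    gamma=0.99,
    buffer_size=1_000_000,
    learning_starts=50_000,
    batch_size=32,
    tau=1.0,
    train_freq=4,
    gradient_steps=1,
    exploration_fraction=0.1,
    exploration_initial_eps=1.0,
    exploration_final_eps=0.05,
    seed=None,
    device="cuda",
    verbose=1,
)
model.learn(total_timesteps=100_000)   # illustrative; the project drives training via `episodes`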

Evaluation

mode: eval # (train / eval)
logger:
  level: 20 # CRITICAL = 50, ERROR = 40, WARNING = 30, INFO = 20, DEBUG = 10, NOTSET = 0
  format: "%(asctime)s - %(levelname)s - %(module)s - %(message)s"
experiment_path: ./results

# define gym environment that will be injected to the agent
gym_environment:
  enable_render: false # whether to render the policy application or not
  window_size: 30
  scale_reward: 1
  data:
    train_path: ./data/minmax/train.csv # to evaluate also on training set (required)
    test_path: ./data/minmax/test.csv
    attributes: [ "time", "open", "close", "low", "high" ]

# define the rl agent (using the above gym environment)
agent:
  # define model
  model:
    name: DQN
    pretrained_path: ./results/train/template-dqn/2022-09-09-15-04-41/model/episode-1.zip # empty if start from scratch
    device: auto
    predict_deterministic: true # only for testing
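
The pretrained_path points to a checkpoint saved by stable-baselines3, and such a .zip file can also be loaded manually. A sketch of that, assuming a compatible trading environment env exists and the classic Gym step API of stable-baselines3 1.x:

from stable_baselines3 import DQN

model = DQN.load(
    "./results/train/template-dqn/2022-09-09-15-04-41/model/episode-1.zip",
    env=env,        # a compatible trading environment (assumed to exist)
    device="auto",
)

obs = env.reset()
done = False
while not done:
    action, _states = model.predict(obs, deterministic=True)   # predict_deterministic
    obs, reward, done, info = env.step(action)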

The remaining parameters are specific to their respective algorithms; to find out more, look them up in the Stable Baselines3 documentation: DQN, A2C, PPO

Results

The results will be written to the folder specified in the .yaml file.

experiments
|
└─── results/ohlc/dqn/train/ex1/2022-01-01-00-00-00
     |   ex1.yaml
     |   out.log
     |
     └─── model
     |    |   evaluations.npz
     |    |
     |    └─── best
     |         |   best_model.zip
     |
     └─── stats
          |   progress.csv
          |   events.out.tfevents...
          |
          └─── test-env
          |    |   result.csv
          |    |   result_graph.png
          |
          └─── train-env
               |   result.csv
               |   result_graph.png
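
To inspect these artifacts programmatically, here is a sketch assuming evaluations.npz follows the usual stable-baselines3 EvalCallback layout and progress.csv is the CSV logger output selected via sb_logger (adjust run_dir to your actual run folder):

import numpy as np
import pandas as pd

run_dir = "./experiments/results/ohlc/dqn/train/ex1/2022-01-01-00-00-00"  # adjust to your run

# evaluations.npz (stable-baselines3 EvalCallback format, assumed): timesteps and episode returns
evals = np.load(f"{run_dir}/model/evaluations.npz")
print(evals["timesteps"], evals["results"].mean(axis=1))

# progress.csv: training statistics written by the CSV logger
progress = pd.read_csv(f"{run_dir}/stats/progress.csv")
print(progress.columns.tolist())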