mjuchli / ctc-executioner

Master Thesis: Limit order placement with Reinforcement Learning

[RL] Extend feature set #4

Closed: mjuchli closed this issue 6 years ago

mjuchli commented 6 years ago

As a first step towards extending the feature set used during the learning process, incorporate:

In a subsequent step, in combination with #5, the aim is to train on:

mjuchli commented 6 years ago

Representing the features mentioned above has become a challenge: all of the underlying values are derived from a continuous time series and will therefore most likely not appear in the same constellation more than once within a relatively small training set. This exceeds the capabilities of value iteration. As a result, we either have to find an appropriate approximation (discretization) of the feature values, or represent states with a simple function of those features (function approximation) [1, 2]; sketches of both options follow the list below:

Volume:
- Input: sum(bid.vol + ask.vol) for the previous n states
- Approximation: n * sum(bid.vol + ask.vol) over the entire training set
- Output: [0.1, 0.2, ..., 1.0]

Fluctuation:
- Input: bestAsk for the previous n states
- Approximation: bestAsk_s - bestAsk_{s-1}
- Output: [-1, 0, 1]

Bids/Asks:
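
A minimal sketch of how these discretizations could look. The helper names, the data layout, and the decile reading of the volume approximation are assumptions for illustration, not the repository's actual code:

```python
import numpy as np

def volume_feature(bid_vols, ask_vols, total_volume, buckets=10):
    """Relative volume of the previous n states, discretized into
    deciles [0.1, 0.2, ..., 1.0].

    bid_vols/ask_vols: volumes of the previous n states (arrays);
    total_volume: sum(bid.vol + ask.vol) over the entire training set.
    """
    ratio = (np.sum(bid_vols) + np.sum(ask_vols)) / total_volume
    ratio = min(max(ratio, 1e-9), 1.0)           # clip into (0, 1]
    return np.ceil(ratio * buckets) / buckets    # round up to a decile

def fluctuation_feature(best_asks):
    """Direction of the best-ask move between the two most recent
    states: -1 (down), 0 (unchanged), +1 (up)."""
    return int(np.sign(best_asks[-1] - best_asks[-2]))
```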


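For the second option, here is a minimal sketch of semi-gradient Q-learning with a linear function approximator in the spirit of [1, 2], where the state is represented by a feature vector phi(s) built from the values above. Class, method, and parameter names are illustrative assumptions:

```python
import numpy as np

class LinearQ:
    """Semi-gradient Q-learning with a linear function approximator:
    Q(s, a) = w[a] . phi(s). phi(s) could stack the (discretized)
    features above plus a bias term."""

    def __init__(self, n_features, n_actions, alpha=0.01, gamma=0.99):
        self.w = np.zeros((n_actions, n_features))
        self.alpha = alpha   # learning rate
        self.gamma = gamma   # discount factor

    def q_values(self, phi):
        return self.w @ phi  # one Q-value per action

    def update(self, phi, action, reward, phi_next, done):
        # TD target bootstraps from the greedy action in the next state.
        target = reward if done else reward + self.gamma * np.max(self.q_values(phi_next))
        td_error = target - self.q_values(phi)[action]
        # The gradient of Q(s, a) with respect to w[a] is just phi(s).
        self.w[action] += self.alpha * td_error * phi

# Example state: [volume decile, fluctuation sign, bias]
agent = LinearQ(n_features=3, n_actions=5)
phi = np.array([0.3, -1.0, 1.0])
print(agent.q_values(phi))
```

With the features discretized as above, phi(s) stays low-dimensional, so the weight matrix remains small and each update is a cheap vector operation.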
[1] https://danieltakeshi.github.io/2016/10/31/going-deeper-into-reinforcement-learning-understanding-q-learning-and-linear-function-approximation/
[2] http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/FA.pdf
mjuchli commented 6 years ago

The features are described in this commit: https://github.com/backender/ctc-executioner/commit/d3440b0bb81183b610811bc81e4d4e27d7ce771b