Kismuz / btgym

Scalable, event-driven, deep-learning-friendly backtesting library
https://kismuz.github.io/btgym/
GNU Lesser General Public License v3.0

tick level data feed #112

Closed cainiaocome closed 4 years ago

cainiaocome commented 5 years ago

Is it possible to feed tick-level data (and features other than OHLCV) to btgym? If not, what should be modified to support tick-level data?

Tick data example:

timestamp,open,high,low,close,volume
2019-03-19 21:00:00.403000+08:00,2610.0,2610.0,2610.0,2610.0,326
2019-03-19 21:00:00.902000+08:00,2610.0,2610.0,2610.0,2610.0,406
2019-03-19 21:00:01.400000+08:00,2609.0,2609.0,2609.0,2609.0,54
2019-03-19 21:00:01.876000+08:00,2608.0,2608.0,2608.0,2608.0,22
2019-03-19 21:00:02.403000+08:00,2608.0,2608.0,2608.0,2608.0,154
2019-03-19 21:00:02.882000+08:00,2609.0,2609.0,2609.0,2609.0,442
2019-03-19 21:00:03.390000+08:00,2608.0,2608.0,2608.0,2608.0,42
2019-03-19 21:00:03.894000+08:00,2609.0,2609.0,2609.0,2609.0,26
2019-03-19 21:00:04.399000+08:00,2607.0,2607.0,2607.0,2607.0,232
2019-03-19 21:00:04.887000+08:00,2606.0,2606.0,2606.0,2606.0,32

In the example data above, I put the same value in open, high, low, and close, which is the price at that tick. The tick interval is about 500 ms.

There may be other feature columns as well, such as ask price, bid price, ask volume, bid volume, and MA.

The code here and here does some timeframe-related calculation and seems to support only minute timeframes.

Kismuz commented 5 years ago

@cainiaocome, indeed, the RL algorithm itself is agnostic to timeframes; the latter is used only for correct episode duration handling and proper backtrader strategy iteration. See the discussion at #54. Personally, from my experience, using tick data is ineffective, as it results in too much computation on too noisy data. Some kind of windowed aggregation should be done as a preprocessing step. It can be aggregation by elapsed wall-clock time (usual time bars), by traded volume threshold (volume bars), or by number of events elapsed, say ticks or LOB events (event bars). For example, for a high-liquidity crypto asset (BTC/USD at BitMEX), ~10 sec. time bars are OK to learn from.
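
For illustration, here is a minimal sketch of the three kinds of aggregation mentioned above, done with pandas as a standalone preprocessing step. It is not part of btgym: the helper names (`load_ticks`, `time_bars`, `event_bars`, `volume_bars`), the 2-second bar size, and the thresholds are assumptions made for this example. It uses the sample rows from the first post, where the tick price is repeated in all four OHLC columns.

```python
import io

import numpy as np
import pandas as pd

# Sample ticks copied from the issue; the price is repeated in O/H/L/C.
TICKS_CSV = """timestamp,open,high,low,close,volume
2019-03-19 21:00:00.403000+08:00,2610.0,2610.0,2610.0,2610.0,326
2019-03-19 21:00:00.902000+08:00,2610.0,2610.0,2610.0,2610.0,406
2019-03-19 21:00:01.400000+08:00,2609.0,2609.0,2609.0,2609.0,54
2019-03-19 21:00:01.876000+08:00,2608.0,2608.0,2608.0,2608.0,22
2019-03-19 21:00:02.403000+08:00,2608.0,2608.0,2608.0,2608.0,154
2019-03-19 21:00:02.882000+08:00,2609.0,2609.0,2609.0,2609.0,442
2019-03-19 21:00:03.390000+08:00,2608.0,2608.0,2608.0,2608.0,42
2019-03-19 21:00:03.894000+08:00,2609.0,2609.0,2609.0,2609.0,26
2019-03-19 21:00:04.399000+08:00,2607.0,2607.0,2607.0,2607.0,232
2019-03-19 21:00:04.887000+08:00,2606.0,2606.0,2606.0,2606.0,32
"""


def load_ticks(csv_text):
    """Parse the tick CSV into a (price, volume) frame indexed by UTC time."""
    df = pd.read_csv(io.StringIO(csv_text))
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
    df = df.set_index("timestamp").sort_index()
    return df[["close", "volume"]].rename(columns={"close": "price"})


def time_bars(ticks, seconds=2):
    """Time bars: OHLCV over fixed wall-clock windows."""
    window = pd.Timedelta(seconds=seconds)
    bars = ticks["price"].resample(window).ohlc()
    bars["volume"] = ticks["volume"].resample(window).sum()
    return bars.dropna(subset=["close"])  # drop windows without any tick


def _aggregate(ticks, bar_id):
    """Build OHLCV bars from ticks that share the same integer bar id."""
    grouped = ticks.groupby(bar_id)
    bars = grouped["price"].agg(["first", "max", "min", "last"])
    bars.columns = ["open", "high", "low", "close"]
    bars["volume"] = grouped["volume"].sum()
    # Timestamp of the last tick that went into each bar.
    bars["end_time"] = ticks.index.to_series().groupby(bar_id).last()
    return bars


def event_bars(ticks, n=5):
    """Event bars: one bar per n consecutive ticks."""
    return _aggregate(ticks, np.arange(len(ticks)) // n)


def volume_bars(ticks, threshold=500):
    """Volume bars: close a bar once its traded volume reaches threshold."""
    ids, bar, acc = [], 0, 0
    for v in ticks["volume"]:
        ids.append(bar)
        acc += v
        if acc >= threshold:  # bar is full, the next tick opens a new one
            bar, acc = bar + 1, 0
    return _aggregate(ticks, np.array(ids))


if __name__ == "__main__":
    ticks = load_ticks(TICKS_CSV)
    print(time_bars(ticks, seconds=2))
    print(event_bars(ticks, n=5))
    print(volume_bars(ticks, threshold=500))
```

Ticks are never split between bars in this sketch, so a volume bar can overshoot the threshold, and the last event or volume bar may be incomplete. Extra columns such as bid/ask price would need their own aggregation rule (for example, last value per bar).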

cainiaocome commented 5 years ago

I can't agree that tick data is ineffective. Anyway, I am trying deep reinforcement learning for high-frequency trading. I have already managed to feed tick data to btgym. I am out right now and will post the code here later.

Ray-0403 commented 5 years ago

@cainiaocome, hey, how did your test of btgym go? Does it work fine with other RL frameworks?

Kismuz commented 4 years ago

Closed due to long inactivity period.