Kismuz / btgym

Scalable, event-driven, deep-learning-friendly backtesting library
https://kismuz.github.io/btgym/
GNU Lesser General Public License v3.0

How to separate the whole dataset into train and test datasets and perform back-test #72

Closed by shisi-cc 5 years ago

shisi-cc commented 5 years ago

Hi, I appreciate your trading platform and am currently running my own reinforcement learning experiments in btgym. It seems that training is performed over the whole dataset. How can I split the dataset into train and test sets and run a backtest?

Kismuz commented 5 years ago

@shisi-cc, the BTgymDataset class has a built-in method for a fixed train/test split, enabled by setting the target_period kwarg:

from btgym import BTgymDataset

# `filename` and `parsing_params` are assumed to be defined earlier.
domain = BTgymDataset(
    filename=filename,
    episode_duration={'days': 0, 'hours': 22, 'minutes': 0},
    time_gap={'days': 0, 'hours': 12},  # episode duration tolerance
    start_00=False,
    start_weekdays={0, 1, 2, 3, 4, 5, 6},  # episodes may start on any weekday
    parsing_params=parsing_params,
    target_period={'days': 1, 'hours': 0, 'minutes': 0},  # reserve 1 final day as test set
)
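To illustrate the idea of reserving the final period as a test set, here is a minimal standalone sketch. It does not use btgym and is not the library's internal implementation: the helper name split_by_target_period is hypothetical, and the choice to put the boundary sample in the train set is an assumption made for the example.

```python
from datetime import datetime, timedelta


def split_by_target_period(timestamps, target_period):
    """Split sorted timestamps into (train, test).

    The test part is the final `target_period` of the data; everything
    at or before the cutoff goes to train (an assumption of this sketch).
    """
    if not timestamps:
        return [], []
    # Reuse the same dict-style duration as in the snippet above.
    cutoff = timestamps[-1] - timedelta(**target_period)
    train = [t for t in timestamps if t <= cutoff]
    test = [t for t in timestamps if t > cutoff]
    return train, test


# Hourly timestamps covering three days (72 samples), standing in for real data.
start = datetime(2017, 1, 1)
stamps = [start + timedelta(hours=h) for h in range(72)]

train, test = split_by_target_period(
    stamps, {'days': 1, 'hours': 0, 'minutes': 0}
)
print(len(train), len(test))
```

With hourly data, the last 24 samples end up in the test part and the preceding 48 in the train part.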

If you use a built-in trainer framework such as A3C, you can then set the trainer class kwarg episode_train_test_cycle to, say, episode_train_test_cycle=(10, 1), which tells each runner to perform 1 test episode after every 10 train episodes (the default, episode_train_test_cycle=(1, 0), means 'no tests').

If you use your own training framework, you should implement the train/test sampling routine yourself; see #54 (especially the lower part, which also covers the rolling_split feature).
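For a custom training loop, the train/test cycle described above could be scheduled along these lines. This is only a sketch of the (train, test) cycle semantics: the generator name episode_modes is hypothetical, and how the chosen mode is actually passed to the environment or dataset depends on your framework and is not shown here.

```python
from itertools import islice


def episode_modes(train_test_cycle=(1, 0)):
    """Yield 'train' / 'test' labels forever, following a (n_train, n_test) cycle.

    (10, 1) gives 10 train episodes followed by 1 test episode, repeated;
    (1, 0) gives train episodes only.
    """
    n_train, n_test = train_test_cycle
    if n_train < 0 or n_test < 0 or n_train + n_test == 0:
        raise ValueError('cycle must contain at least one episode')
    while True:
        for _ in range(n_train):
            yield 'train'
        for _ in range(n_test):
            yield 'test'


# Take the first 22 episodes of a (10, 1) cycle: two full cycles.
schedule = list(islice(episode_modes((10, 1)), 22))
print(schedule.count('train'), schedule.count('test'))
```

In a real loop you would draw the next label before each episode and sample the episode from the train or test part of the data accordingly, skipping the policy update on test episodes.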

Doc: https://kismuz.github.io/btgym/btgym.datafeed.html#btgym.datafeed.derivative.BTgymDataset

shisi-cc commented 5 years ago

Oh, I see. I will read the issue. Thank you very much!