FX31337 / FX-BT-Scripts

Useful scripts for backtesting.
MIT License

Add model generation for Control points and Open prices #89

Closed kenorb closed 5 years ago

kenorb commented 7 years ago

Extend convert_csv_to_mt.py script to have 3 modes of FXT conversion:

Specifying multiple modes at once, e.g. -m 0,1,2, should be supported as well, so that three files are generated in one run. The -m parameter applies only to the FXT format; for HST it should be ignored.
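A minimal sketch of how a comma-separated -m value could be parsed with argparse. The function names, the long option names and the default format are assumptions made for illustration, not the script's actual code; the mapping of 2 to open prices is inferred from the issue title.

```python
import argparse

# Model numbers as used in this issue (0 = every tick, 1 = control points).
# Treating 2 as "open prices" is an assumption inferred from the issue title.
VALID_MODELS = (0, 1, 2)


def parse_models(value):
    """Turn a string such as "0,1,2" into a sorted list of unique model numbers."""
    try:
        models = sorted({int(part) for part in value.split(",")})
    except ValueError:
        raise argparse.ArgumentTypeError(f"invalid model list: {value!r}")
    for model in models:
        if model not in VALID_MODELS:
            raise argparse.ArgumentTypeError(f"unknown model: {model}")
    return models


def build_parser():
    """Hypothetical parser covering only the two options discussed here."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--format", dest="output_format", default="fxt4")
    # With no -m given, fall back to the every tick model (0).
    parser.add_argument("-m", "--model", type=parse_models, default=[0])
    return parser


def effective_models(args):
    """Return the models to generate, or None when -m does not apply (HST)."""
    if not args.output_format.startswith("fxt"):
        return None
    return args.model
```

With this shape, the caller would loop over the returned list and write one FXT file per model, and skip the loop entirely for HST output.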

Filename

The generated FXT filename must follow the pattern SSSSSSPP_M.fxt where:

Currently it is generated correctly for the Every tick model (0); we need to add support for 1 and 2. If no model is specified, generate the Every tick model (0) by default.
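As a rough illustration of the naming scheme, the sketch below builds names such as EURUSD30_1.fxt. Reading the pattern as symbol, then period in minutes, then model number is an assumption based on the file names that appear in this issue; the helper name and the timeframe table are hypothetical.

```python
# Timeframes mentioned in this issue, mapped to their period in minutes.
TIMEFRAME_MINUTES = {"M1": 1, "M5": 5, "M30": 30}


def fxt_filename(symbol, timeframe, model=0):
    """Build a name like EURUSD30_1.fxt (assumed: symbol + period + "_" + model)."""
    period = TIMEFRAME_MINUTES[timeframe]
    return f"{symbol}{period}_{model}.fxt"


def fxt_filenames(symbol, timeframes, models):
    """List every file a single run would produce, one per timeframe and model."""
    return [fxt_filename(symbol, tf, m) for tf in timeframes for m in models]
```

For example, `fxt_filenames("EURUSD", ["M1", "M5", "M30"], [1])` yields the three file names expected from the -m 1 run in the steps.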

Resources

For more explanation of the modelling modes, see:

Steps

  1. In some folder, clone repo with CSV files:

    git clone --branch EURUSD-2014 --single-branch https://github.com/FX31337/FX-BT-Data-EURUSD-DS
  2. Combine CSV data from 2 days into single file:

    find FX*  \( -name "2014-02-02*" -o -name "2014-02-03*" \) -exec cat {} ';' | sort > ticks.csv

    or 4 days:

    find FX* \( -name "2014-02-02*" -o -name "2014-02-03*" -o -name "2014-02-04*" -o -name "2014-02-05*" \) -exec cat {} ';' | sort > ticks.csv
  3. Clone this repo:

    git clone https://github.com/FX31337/FX-BT-Scripts
  4. Convert the CSV file into FXT for the M1, M5 and M30 timeframes (this can also be done using Docker):

    ./convert_csv_to_mt.py -v -i ticks.csv -f fxt4 -t M1,M5,M30

    The hst4 format works similarly if you need it for testing.

  5. Read the generated files by:

    ./convert_mt_to_csv.py -i EURUSD1_0.fxt -f fxt4 | less
  6. Now generate the files with control points (-m 1, to be implemented):

    ./convert_csv_to_mt.py -v -i ticks.csv -f fxt4 -t M1,M5,M30 -m 1

    This should generate three files (EURUSD1_1.fxt, EURUSD5_1.fxt, EURUSD30_1.fxt) containing only control point prices, as described above.

Sample 1

M30

$ ./FX-BT-Scripts/convert_mt_to_csv.py -i EURUSD30_0.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34842,1.34842,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34837,1.34837,1.34837,1.34837,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34828,1.34828,1.34828,1.34828,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34828,1.34828,1.34828,1.34828,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34832,1.34832,1.34832,1.34832,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34832,1.34832,1.34832,1.34832,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34827,1.34827,1.34827,1.34827,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34829,1.34829,1.34829,1.34829,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34835,1.34835,1.34835,1.34835,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34837,1.34837,1.34837,1.34837,1,2014.02.02 22:00:00,4
$ ./FX-BT-Scripts/convert_mt_to_csv.py -i EURUSD30_1.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34871,1.34820,1.34832,717,2014.02.02 22:29:59,0
2014.02.02 22:30:00,1.34843,1.34905,1.34818,1.34863,2299,2014.02.02 22:59:59,0
2014.02.02 23:00:00,1.34865,1.34912,1.34851,1.34871,2650,2014.02.02 23:29:59,0
2014.02.02 23:30:00,1.34867,1.34892,1.34843,1.34873,2124,2014.02.02 23:59:59,0
2014.02.03 00:00:00,1.34873,1.34878,1.34800,1.34865,3445,2014.02.03 00:29:59,0
2014.02.03 00:30:00,1.34866,1.34916,1.34836,1.34865,4396,2014.02.03 00:59:59,0
2014.02.03 01:00:00,1.34864,1.34874,1.34801,1.34855,4995,2014.02.03 01:29:59,0
2014.02.03 01:30:00,1.34855,1.34865,1.34820,1.34850,3583,2014.02.03 01:59:59,0
2014.02.03 02:00:00,1.34852,1.34853,1.34798,1.34830,3988,2014.02.03 02:29:59,0
2014.02.03 02:30:00,1.34830,1.34853,1.34809,1.34849,3820,2014.02.03 02:59:59,0
$ ./FX-BT-Scripts/convert_mt_to_csv.py -i EURUSD30_2.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34871,1.34820,1.34832,717,2014.02.02 22:29:59,0
2014.02.02 22:30:00,1.34843,1.34905,1.34818,1.34863,2299,2014.02.02 22:59:59,0
2014.02.02 23:00:00,1.34865,1.34912,1.34851,1.34871,2650,2014.02.02 23:29:59,0
2014.02.02 23:30:00,1.34867,1.34892,1.34843,1.34873,2124,2014.02.02 23:59:59,0
2014.02.03 00:00:00,1.34873,1.34878,1.34800,1.34865,3445,2014.02.03 00:29:59,0
2014.02.03 00:30:00,1.34866,1.34916,1.34836,1.34865,4396,2014.02.03 00:59:59,0
2014.02.03 01:00:00,1.34864,1.34874,1.34801,1.34855,4995,2014.02.03 01:29:59,0
2014.02.03 01:30:00,1.34855,1.34865,1.34820,1.34850,3583,2014.02.03 01:59:59,0
2014.02.03 02:00:00,1.34852,1.34853,1.34798,1.34830,3988,2014.02.03 02:29:59,0
2014.02.03 02:30:00,1.34830,1.34853,1.34809,1.34849,3820,2014.02.03 02:59:59,0

M5

$ ./convert_mt_to_csv.py -i EURUSD5_0.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34842,1.34842,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34837,1.34837,1.34837,1.34837,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34828,1.34828,1.34828,1.34828,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34828,1.34828,1.34828,1.34828,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34832,1.34832,1.34832,1.34832,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34832,1.34832,1.34832,1.34832,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34827,1.34827,1.34827,1.34827,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34829,1.34829,1.34829,1.34829,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34835,1.34835,1.34835,1.34835,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34837,1.34837,1.34837,1.34837,1,2014.02.02 22:00:00,4
$ ./convert_mt_to_csv.py -i EURUSD5_1.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34821,1.34821,60,2014.02.02 22:04:59,0
2014.02.02 22:05:00,1.34823,1.34845,1.34823,1.34833,65,2014.02.02 22:09:59,0
2014.02.02 22:10:00,1.34833,1.34855,1.34831,1.34850,142,2014.02.02 22:14:59,0
2014.02.02 22:15:00,1.34850,1.34865,1.34850,1.34865,90,2014.02.02 22:19:59,0
2014.02.02 22:20:00,1.34864,1.34866,1.34857,1.34864,67,2014.02.02 22:24:59,0
2014.02.02 22:25:00,1.34862,1.34871,1.34820,1.34832,291,2014.02.02 22:29:59,0
2014.02.02 22:30:00,1.34843,1.34863,1.34821,1.34860,595,2014.02.02 22:34:59,0
2014.02.02 22:35:00,1.34860,1.34867,1.34857,1.34862,276,2014.02.02 22:39:59,0
2014.02.02 22:40:00,1.34861,1.34867,1.34859,1.34862,270,2014.02.02 22:44:59,0
2014.02.02 22:45:00,1.34865,1.34895,1.34865,1.34889,234,2014.02.02 22:49:59,0
$ ./convert_mt_to_csv.py -i EURUSD5_2.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34821,1.34821,60,2014.02.02 22:04:59,0
2014.02.02 22:05:00,1.34823,1.34845,1.34823,1.34833,65,2014.02.02 22:09:59,0
2014.02.02 22:10:00,1.34833,1.34855,1.34831,1.34850,142,2014.02.02 22:14:59,0
2014.02.02 22:15:00,1.34850,1.34865,1.34850,1.34865,90,2014.02.02 22:19:59,0
2014.02.02 22:20:00,1.34864,1.34866,1.34857,1.34864,67,2014.02.02 22:24:59,0
2014.02.02 22:25:00,1.34862,1.34871,1.34820,1.34832,291,2014.02.02 22:29:59,0
2014.02.02 22:30:00,1.34843,1.34863,1.34821,1.34860,595,2014.02.02 22:34:59,0
2014.02.02 22:35:00,1.34860,1.34867,1.34857,1.34862,276,2014.02.02 22:39:59,0
2014.02.02 22:40:00,1.34861,1.34867,1.34859,1.34862,270,2014.02.02 22:44:59,0
2014.02.02 22:45:00,1.34865,1.34895,1.34865,1.34889,234,2014.02.02 22:49:59,0

M1

$ ./convert_mt_to_csv.py -i EURUSD1_0.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34842,1.34842,1,2014.02.02 22:00:00,4
2014.02.02 22:00:00,1.34837,1.34837,1.34837,1.34837,1,2014.02.02 22:00:03,4
2014.02.02 22:00:00,1.34828,1.34828,1.34828,1.34828,1,2014.02.02 22:00:04,4
2014.02.02 22:00:00,1.34828,1.34828,1.34828,1.34828,1,2014.02.02 22:00:09,4
2014.02.02 22:00:00,1.34832,1.34832,1.34832,1.34832,1,2014.02.02 22:00:09,4
2014.02.02 22:00:00,1.34832,1.34832,1.34832,1.34832,1,2014.02.02 22:00:10,4
2014.02.02 22:00:00,1.34827,1.34827,1.34827,1.34827,1,2014.02.02 22:00:10,4
2014.02.02 22:00:00,1.34829,1.34829,1.34829,1.34829,1,2014.02.02 22:00:12,4
2014.02.02 22:00:00,1.34835,1.34835,1.34835,1.34835,1,2014.02.02 22:00:14,4
2014.02.02 22:00:00,1.34837,1.34837,1.34837,1.34837,1,2014.02.02 22:00:15,4
$ ./convert_mt_to_csv.py -i EURUSD1_1.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34827,1.34833,19,2014.02.02 22:00:59,0
2014.02.02 22:01:11,1.34837,1.34842,1.34834,1.34839,23,2014.02.02 22:02:04,0
2014.02.02 22:02:05,1.34840,1.34841,1.34825,1.34827,11,2014.02.02 22:03:04,0
2014.02.02 22:03:11,1.34827,1.34832,1.34827,1.34827,2,2014.02.02 22:04:04,0
2014.02.02 22:04:05,1.34821,1.34821,1.34821,1.34821,3,2014.02.02 22:04:59,0
2014.02.02 22:05:00,1.34823,1.34845,1.34823,1.34845,16,2014.02.02 22:05:59,0
2014.02.02 22:06:26,1.34839,1.34840,1.34839,1.34840,6,2014.02.02 22:07:25,0
2014.02.02 22:07:27,1.34837,1.34841,1.34837,1.34841,8,2014.02.02 22:08:05,0
2014.02.02 22:08:06,1.34835,1.34835,1.34827,1.34827,26,2014.02.02 22:09:01,0
2014.02.02 22:09:02,1.34830,1.34833,1.34830,1.34833,8,2014.02.02 22:10:01,0
$ ./convert_mt_to_csv.py -i EURUSD1_2.fxt -f fxt4 | head
2014.02.02 22:00:00,1.34842,1.34842,1.34827,1.34833,19,2014.02.02 22:00:59,0
2014.02.02 22:01:11,1.34837,1.34842,1.34834,1.34839,23,2014.02.02 22:02:10,0
2014.02.02 22:02:05,1.34840,1.34841,1.34825,1.34827,11,2014.02.02 22:03:04,0
2014.02.02 22:03:11,1.34827,1.34832,1.34827,1.34827,2,2014.02.02 22:04:10,0
2014.02.02 22:04:05,1.34821,1.34821,1.34821,1.34821,3,2014.02.02 22:05:04,0
2014.02.02 22:05:00,1.34823,1.34845,1.34823,1.34845,16,2014.02.02 22:05:59,0
2014.02.02 22:06:26,1.34839,1.34840,1.34839,1.34840,6,2014.02.02 22:07:25,0
2014.02.02 22:07:27,1.34837,1.34841,1.34837,1.34841,8,2014.02.02 22:08:26,0
2014.02.02 22:08:06,1.34835,1.34835,1.34827,1.34827,26,2014.02.02 22:09:05,0
2014.02.02 22:09:02,1.34830,1.34833,1.34830,1.34833,8,2014.02.02 22:10:01,0

Check files from EURUSD-2014/EURUSD/2014/02 for more accurate CSV data to compare with.

The above FXT files have been uploaded below.

The rows above list, in order: bar timestamp, open, high, low, close, volume, timestamp. The trailing eighth value (4 or 0 in the samples) is not named here.
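To compare outputs programmatically, a row of this CSV output could be parsed along the following lines. This is only a sketch: the field names are hypothetical, and calling the unnamed eighth column "flag" is an assumption.

```python
from collections import namedtuple
from datetime import datetime

# Field names are illustrative; "flag" for the last column is an assumption.
FxtRow = namedtuple(
    "FxtRow", "bar_time open high low close volume tick_time flag"
)

TIME_FORMAT = "%Y.%m.%d %H:%M:%S"


def parse_row(line):
    """Parse one comma-separated row as printed by convert_mt_to_csv.py."""
    bar_time, open_, high, low, close, volume, tick_time, flag = (
        line.strip().split(",")
    )
    return FxtRow(
        datetime.strptime(bar_time, TIME_FORMAT),
        float(open_),
        float(high),
        float(low),
        float(close),
        int(volume),
        datetime.strptime(tick_time, TIME_FORMAT),
        int(flag),
    )
```

Parsing both a generated file and a reference file this way makes it straightforward to diff them field by field instead of comparing raw text.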

Sample files

  1. EURUSD_FXT_samples.zip
  2. EURUSD.ecn30_fxt_files.zip

These files can be read with the convert_mt_to_csv.py script as shown above. Alternatively, you can view them in the MT4 platform via File, Open Offline (the files need to be placed in the tester/history folder of the platform directory).


See also related issue: #86


Est. 16h

kenorb commented 6 years ago

Basically, it should work the same as the MT4 platform does it by default when backtesting. It's got its own built-in converter, so we can compare the result data.

ghost commented 5 years ago

Hello again. I have been going through the data to understand how to solve this issue, but I have some questions.

First, what exactly is a price bar? From what I can see, it appears to be a collection of ticks for a given timeframe. The examples you provided appear to just take all the available ticks and group them within every 1, 5, or 30 minute timeframes that occur in the input ticks.

Second, do you have any examples of what the other models would generate for a given set of ticks? I am having some difficulty understanding the control points model in particular, but the open price one appears to be very simple. It sounds like it uses only the opening price to generate the tick for a given price bar.

Thanks.

Edit: I also notice that you say the "every tick" model is already implemented, but it does not appear that any interpolation is already performed. Is interpolation necessary for any of this?

kenorb commented 5 years ago

"every tick" model is already implemented

That means every tick in the CSV is present in the FXT (the downside is slowness and a big file). When you convert to control points, you filter out the ticks which are less useful. So instead of having 100-200 ticks per minute, you have only 4 major ticks per minute (the OHLC values). In open price mode, you have only the open value (a single tick) per timeframe (e.g. M1).
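The three models as described in this comment can be sketched as a reduction over ticks grouped into bars. This is an illustration under stated assumptions, not MT4's algorithm and not the script's code: ticks are represented as (timestamp, price) tuples, the function names are hypothetical, and the bar flooring only handles timeframes that divide an hour.

```python
from datetime import datetime, timedelta
from itertools import groupby


def bar_start(timestamp, minutes):
    """Floor a timestamp to the start of its bar (minutes must divide 60)."""
    floored = timestamp.minute - timestamp.minute % minutes
    return timestamp.replace(minute=floored, second=0, microsecond=0)


def reduce_ticks(ticks, minutes, model):
    """Reduce time-sorted (timestamp, price) ticks according to the model number."""
    if model == 0:  # every tick: keep the input unchanged
        return list(ticks)
    result = []
    for _, group in groupby(ticks, key=lambda tick: bar_start(tick[0], minutes)):
        bar = list(group)
        if model == 2:  # open prices: only the first tick of each bar
            result.append(bar[0])
            continue
        # model 1 (control points): open, high, low and close ticks, in time order
        indexes = range(len(bar))
        high = max(indexes, key=lambda i: bar[i][1])
        low = min(indexes, key=lambda i: bar[i][1])
        keep = sorted({0, high, low, len(bar) - 1})
        result.extend(bar[i] for i in keep)
    return result


# Illustrative ticks only (not taken from the real data set).
start = datetime(2014, 2, 2, 22, 0, 0)
prices = [1.0, 1.3, 1.1, 0.9, 1.2, 1.5, 1.4]
sample = [(start + timedelta(seconds=10 * i), p) for i, p in enumerate(prices)]
```

Using a set of indexes means a bar contributes fewer than four ticks when, for instance, the high is also the close. Volume handling is left out of this sketch.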

kenorb commented 5 years ago

Let me know if anything needs further clarification.

ghost commented 5 years ago

I've been reviewing the open prices model, and it appears MT4 does some weird things. For some timeframes it outputs one tick; for others it outputs two. When generating two ticks, it appears to generate one for the start and another for the end of the timeframe.

Any ideas how I should handle this? I can't seem to find any correlations to explain why and when it chooses to act in this manner. I had thought the open price model might be as simple as just keeping the first tick of a timeframe and discarding the rest, but there's evidently more to it. In particular, it appears I am expected to modify the tick's volume to be the sum of all the tick volumes for the timeframe.

kenorb commented 5 years ago

Maybe MT4 is using data from the nearest smaller timeframe ('the nearest less timeframe must be available').

Please check this Support point section. But I'm not sure whether it describes the logic of Control points mode. If it doesn't, or if it's not clear, try to make it more logical, as long as MT4 doesn't generate any data validation errors during testing.

In my understanding, Control mode should keep only the ticks which are either higher highs (keep the next tick if it's higher than the previous high) or lower lows (keep the next tick only if it's lower than the previous low). This way we can filter out useless ticks which don't provide any new lows or highs. The main goal of Control mode is to filter out the useless ticks while keeping the major changes of the price. So if that makes sense, you can implement it this way, even if the method from MT4 differs. We can always compare the results later and find which method is more reliable.
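The higher-high / lower-low filter proposed here could look roughly like the sketch below. It works on a plain sequence of prices and assumes the caller passes the ticks of one bar at a time, so the running high and low restart with each bar; the issue does not specify this, and the function name is hypothetical.

```python
def filter_extremes(prices):
    """Keep the first price, then only prices that set a new high or a new low."""
    kept = []
    high = low = None
    for price in prices:
        if high is None:
            # The first tick of the bar seeds both the running high and low.
            high = low = price
            kept.append(price)
        elif price > high:
            high = price
            kept.append(price)
        elif price < low:
            low = price
            kept.append(price)
    return kept


# The ten prices from the first EURUSD30_0.fxt rows shown in Sample 1.
first_ticks = [1.34842, 1.34837, 1.34828, 1.34828, 1.34832,
               1.34832, 1.34827, 1.34829, 1.34835, 1.34837]
print(filter_extremes(first_ticks))  # [1.34842, 1.34837, 1.34828, 1.34827]
```

Note that this filter does not necessarily keep the closing tick of a bar, so a real implementation might need to append it separately to keep the bar's close consistent.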

And Open mode can be implemented by having only 4 ticks per timeframe (OHLC data), unless they're all the same. The volume should still match to avoid any data validation errors.

Let me know if that makes sense.