Thanks for the report. The data is downloaded as-is from the Dukascopy endpoints using a Python script, without changing the original values, e.g.:
./dl_bt_dukascopy.py -c -p EURUSD
Possibly they've changed the volume format in the last year; maybe they decided that providing full volume values makes more sense.
If there is an inconsistency in volumes, it needs to be fixed manually or in the script.
Which one do you suggest as the base format: volumes in millions (with a decimal fraction) or full values?
Yes, I think the change in question is this one: https://github.com/FX31337/FX-BT-Scripts/commit/d3e46bbc35beb16ed7222f8fef10b7cda10c7c0c#diff-e39272740e5f05eca41cf05889724b89R357-R358
I would keep the original data as is and not round it.
OK, thanks for identifying the issue. I'll try to regenerate the files soon using the original values.
The script has been fixed.
I've fixed data files in this repo manually.
Command for reference:
find . -name "*.csv" -exec bash -c 'awk -F, '\''{print $1","$2","$3","($4/1000000)","($5/1000000)}'\'' "$1" > "$1.new" && mv "$1.new" "$1"' _ {} ';'
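For anyone who prefers Python, an equivalent conversion might look like the sketch below. It assumes the 5-column timestamp,bid,ask,bid-volume,ask-volume tick layout, and its float formatting may differ slightly from awk's default OFMT:

import csv
from pathlib import Path

def convert_file(path: Path) -> None:
    # Divide the two volume columns by 1,000,000, rewriting the file in place.
    tmp = Path(str(path) + ".new")
    with path.open(newline="") as src, tmp.open("w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            row[3] = str(float(row[3]) / 1_000_000)  # bid volume
            row[4] = str(float(row[4]) / 1_000_000)  # ask volume
            writer.writerow(row)
    tmp.replace(path)  # mirrors `mv {}.new {}`

for p in Path(".").rglob("*.csv"):
    convert_file(p)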
Binary files for the MT platform have been re-generated (build: 698734474).
Tested with this build.
Thank you for creating this repository.
https://github.com/FX-Data/FX-Data-EURUSD-DS/blob/EURUSD-2008/EURUSD/2008/01/2008-01-01--01h_ticks.csv shows volumes in what look like millions.
The 2019 data, however, appears to be in full units: https://github.com/FX-Data/FX-Data-EURUSD-DS/blob/EURUSD-2019/EURUSD/2019/01/2019-01-01--22h_ticks.csv
Are there any other inconsistencies to be aware of when combining multiple years into a single, consistent dataset?
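In the meantime, a rough way to merge mixed years could be a threshold heuristic. This is only my own sketch, not anything from this repo: it assumes per-tick volumes expressed in millions stay well below 1000, while full-unit volumes run into the thousands or more.

import csv
from pathlib import Path

# Assumption: volumes already in millions never reach this value,
# while full-unit volumes always exceed it.
FULL_UNIT_THRESHOLD = 1000.0

def normalize_row(row: list) -> list:
    # Rescale the bid/ask volume columns (indices 3 and 4) to millions
    # when they look like full-unit values.
    for i in (3, 4):
        value = float(row[i])
        if value >= FULL_UNIT_THRESHOLD:
            value /= 1_000_000
        row[i] = str(value)
    return row

def normalize_file(path: Path) -> list:
    # Return the normalized rows for one tick CSV.
    with path.open(newline="") as src:
        return [normalize_row(row) for row in csv.reader(src)]

Spot-checking a few files from each year against the raw Dukascopy feed would be safer than trusting the threshold alone.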