FX-Data / FX-Data-EURUSD-DS

Forex Historical Data for EURUSD

2019 uses different units for volume than other years/branches #5

Closed. fearofcode closed 4 years ago

fearofcode commented 4 years ago

Thank you for creating this repository.

https://github.com/FX-Data/FX-Data-EURUSD-DS/blob/EURUSD-2008/EURUSD/2008/01/2008-01-01--01h_ticks.csv shows volume in what looks like millions.

The 2019 data, however, appears to be in flat units. https://github.com/FX-Data/FX-Data-EURUSD-DS/blob/EURUSD-2019/EURUSD/2019/01/2019-01-01--22h_ticks.csv

Are there any other inconsistencies to be aware of if you wanted to combine multiple years into a single, consistent dataset?
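One rough way to check a file for this kind of unit mismatch before merging years is to look at the typical size of its volume values. The sketch below is only an illustration: the function name `guess_volume_unit` and the threshold of 1000 are assumptions, not part of the repository or its scripts.

```python
import statistics


def guess_volume_unit(volumes, threshold=1000.0):
    """Guess whether volumes are expressed in millions or in flat units.

    Hypothetical heuristic: a median at or above `threshold` is treated
    as flat units, anything smaller as millions. The threshold is an
    assumption and should be checked against the actual data.
    """
    typical = statistics.median(volumes)
    return "units" if typical >= threshold else "millions"
```

Running this over the volume columns of one file per year would flag years whose scale differs from the rest; it does not detect other kinds of inconsistency.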

kenorb commented 4 years ago

Thanks for the report. The data is downloaded as is from the Dukascopy endpoints using a Python script, without changing the original values, e.g.

./dl_bt_dukascopy.py -c -p EURUSD

Possibly they changed the volume format recently. Maybe they decided that providing full volume values makes more sense.

If there is some inconsistency in volumes, it needs to be fixed manually or in the script.

kenorb commented 4 years ago

Which one do you suggest should be the base format: millions (with comma) or full values?

fearofcode commented 4 years ago

Yes, I think the change in question is this one: https://github.com/FX31337/FX-BT-Scripts/commit/d3e46bbc35beb16ed7222f8fef10b7cda10c7c0c#diff-e39272740e5f05eca41cf05889724b89R357-R358

I would keep the original data as is and not round it.

kenorb commented 4 years ago

Ok, thanks for identifying the issue. I'll try to regenerate the files soon using original values.

kenorb commented 4 years ago

The scripts have been fixed.

I've fixed the data files in this repo manually.

Command for reference:

find . -name "*.csv" -exec bash -c 'awk -F, '\''{print $1","$2","$3","($4/1000000)","($5/1000000)}'\'' {} > {}.new && mv {}.new {}' ';'
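For readers who prefer Python, the per-row transformation in the command above can be sketched roughly as follows. This is not the code used in the repository; `rescale_volumes` is a hypothetical helper that assumes the fourth and fifth comma-separated fields are the volume values, as the awk expression does.

```python
import csv
import io

SCALE = 1_000_000  # same divisor as in the awk command


def rescale_volumes(text, scale=SCALE):
    """Divide the 4th and 5th CSV fields of each row by `scale`.

    Assumption: rows whose volume fields are not numeric (for example a
    header row) are passed through unchanged, which differs from awk.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 5:
            try:
                scaled = [float(row[3]) / scale, float(row[4]) / scale]
            except ValueError:
                scaled = None  # non-numeric fields: keep the row as is
            if scaled is not None:
                # "g" gives up to 6 significant digits, like awk's default
                row[3] = format(scaled[0], "g")
                row[4] = format(scaled[1], "g")
        writer.writerow(row)
    return out.getvalue()
```

Unlike the awk one-liner, this sketch keeps any fields beyond the fifth and works on text in memory, so reading and rewriting each file is left to the caller.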

Binary files for the MT platform have been regenerated (build: 698734474).

Tested with this build.