FX31337 / FX-BT-Scripts

Useful scripts for backtesting.
MIT License

convert_csv_to_mt.py: TypeError: 'NoneType' object is not subscriptable [$20] #25

Closed kenorb closed 8 years ago

kenorb commented 8 years ago

E.g. this one works fine:

find . -name "*.csv" -print0 | sort -z | xargs -r0 cat | ./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t M1 -d foo

However sometimes conversion fails when using multiple streams with the following error:

Traceback (most recent call last):
  File "./scripts/convert_csv_to_mt.py", line 417, in <module>
    FXT(CSV(args.inputFile), outputPath, timeframe, server, symbol, spread)
  File "./scripts/convert_csv_to_mt.py", line 256, in __init__
    header += pack('<i', int(firstUniBar['barTimestamp']))                          # FromDate - Date of first tick
TypeError: 'NoneType' object is not subscriptable

Possible failing scenario, given a folder with backtest data, DS-EURUSD-2014 (change the branch to see the CSV files):

bt_size=$(find "DS-EURUSD-2014" -name '*.csv' -print0 | du -bc --files0-from=- | tail -n1 | cut -f1)
find DS-EURUSD-2014/ -name "*.csv" -print0 | sort -z | xargs -r0 cat | pv -s $bt_size | tee &>/dev/null   \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t M1 -d output/)  \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t M5 -d output/)  \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t M15 -d output/) \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t M30 -d output/) \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t H1 -d output/)  \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t H4 -d output/)  \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t D1 -d output/)  \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t W1 -d output/)  \
  >(./scripts/convert_csv_to_mt.py -i /dev/stdin -f hst4 -v -S default -s EURUSD -p 10 -t MN -d output/)

Or use the following script (get_bt_data.sh, run after install_mt4.sh):

./get_bt_data.sh EURUSD 2014 DS

This happens because we're reading data from /dev/stdin, and the data is not yet available at the time of reading (because of streaming). So should Python support stdin streaming somehow, or wait for the data? I think it can get half of a line instead.

The above syntax can be found in this commit, so to reproduce the problem you need to clone that repo, reset to that revision (git reset 75191cae6f1f58c656373bf3c845eb66e8af4991), and run get_bt_data.sh EURUSD 2014 DS.

Sometimes it fails on:

Traceback (most recent call last):
  File "/Users/kenorb/.wine/drive_c/Program Files/FXCM MetaTrader 4/history/d
    FXT(CSV(args.inputFile), outputPath, timeframe, server, symbol, spread)
  File "/Users/kenorb/.wine/drive_c/Program Files/FXCM MetaTrader 4/history/d
    for tick in ticks:
  File "/Users/kenorb/.wine/drive_c/Program Files/FXCM MetaTrader 4/history/d
    return self._parseLine(line)
  File "/Users/kenorb/.wine/drive_c/Program Files/FXCM MetaTrader 4/history/d
    'timestamp': time.mktime(datetime.datetime.strptime(tick[0], '%Y.%m.%d %H
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/
    tt, fraction = _strptime(data_string, format)
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/
    (data_string, format))
ValueError: time data '2014.0' does not match format '%Y.%m.%d %H:%M:%S.%f'

See syntax in this commit.
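
The ValueError above is what a truncated line looks like to the parser: '2014.0' is only the start of a timestamp. Below is a minimal sketch of a reader that skips such lines instead of crashing. It assumes only the timestamp format shown in the traceback. The names parse_timestamp and iter_valid_ticks are hypothetical and are not taken from convert_csv_to_mt.py; the comma-separated layout with the timestamp in the first column and the sample values are assumptions for illustration.

```python
import datetime
import io

# Format string copied from the traceback above.
TIMESTAMP_FORMAT = '%Y.%m.%d %H:%M:%S.%f'


def parse_timestamp(field):
    """Return a datetime, or None if the field is truncated or malformed."""
    try:
        return datetime.datetime.strptime(field, TIMESTAMP_FORMAT)
    except ValueError:
        return None


def iter_valid_ticks(stream):
    """Yield (timestamp, remaining_fields) for each well-formed line.

    Assumes comma-separated lines with the timestamp in the first column.
    """
    for line in stream:
        fields = line.rstrip('\n').split(',')
        timestamp = parse_timestamp(fields[0])
        if timestamp is None:
            continue  # skip partial lines such as '2014.0'
        yield timestamp, fields[1:]


# Illustrative input only: the middle line imitates the truncated record.
sample = io.StringIO(
    '2014.01.01 22:00:12.123,1.37553,1.37696\n'
    '2014.0\n'
    '2014.01.01 22:00:13.456,1.37554,1.37697\n'
)
ticks = list(iter_valid_ticks(sample))
```

Skipping silently hides the fact that data was lost, so counting or logging the skipped lines is probably a better fit for a real conversion run.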

Here is simple syntax which works:

cat /etc/hosts | tee >(wc -l) >(wc -l) >(wc -l) >(wc -l)

But it seems the current code of convert_csv_to_mt.py doesn't like it when we're trying to send the same data into multiple processes. See: http://stackoverflow.com/a/60955

kenorb commented 8 years ago

Actually, this is currently happening on master. See #119215322; it's probably related.

kenorb commented 8 years ago

Update: this is no longer reproducible on master and I don't know the steps to reproduce it, so I'm closing this for now.

LemonBoy commented 8 years ago

The new code that reads the CSV can't read from stdin anymore (due to the use of mmap), so readline no longer fails and the error is gone :)
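
The reason mmap rules out stdin can be checked directly: memory mapping needs a file-backed descriptor, and a pipe is not one. A minimal sketch, assuming a POSIX system; the helper name can_mmap is hypothetical and is not part of convert_csv_to_mt.py.

```python
import mmap
import os
import tempfile


def can_mmap(fd, length=1):
    """Return True if the descriptor can be memory-mapped for reading."""
    try:
        mapping = mmap.mmap(fd, length, access=mmap.ACCESS_READ)
    except (OSError, ValueError):
        return False
    mapping.close()
    return True


# A pipe behaves like piped stdin: there is no file behind it to map.
read_end, write_end = os.pipe()
try:
    pipe_result = can_mmap(read_end)
finally:
    os.close(read_end)
    os.close(write_end)

# A regular, non-empty file can be mapped.
with tempfile.TemporaryFile() as handle:
    handle.write(b'abc\n')
    handle.flush()
    file_result = can_mmap(handle.fileno())
```

Here pipe_result should come out False and file_result True, which matches the observation that the mmap-based reader only works on real files.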

kenorb commented 8 years ago

OK, I don't know whether we need stdin, but it would be useful to have syntax like:

cat something | convert_csv_to_mt.py -i -

or:

convert_csv_to_mt.py -i <(cat -)

Let me know if that's achievable. You can claim the bounty for this one as well then.
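
One way the `-i -` form could be handled is to treat a lone dash as standard input, which is a common command-line convention. The sketch below is an assumption about how this might look, not the script's actual code: open_input and build_parser are hypothetical helpers, and the long option name --input-file is invented for illustration. Only the -i flag and the args.inputFile attribute appear in the commands and traceback above.

```python
import argparse
import sys


def open_input(path):
    """Return a readable text stream; a lone '-' means standard input."""
    if path == '-':
        return sys.stdin
    return open(path, 'r')


def build_parser():
    """Build a parser exposing only the input option (hypothetical subset)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--input-file', dest='inputFile', required=True)
    return parser


# Equivalent to running: cat something | convert_csv_to_mt.py -i -
args = build_parser().parse_args(['-i', '-'])
stream = open_input(args.inputFile)
```

The `<(cat -)` form needs no special handling in the script, because the shell passes a path to it; whether the reader can then consume that path still depends on how the file is read.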

LemonBoy commented 8 years ago

It sure can be done by using sys.stdin, which should be unbuffered, but you'd lose a chunk of the improved speed.