scrtlabs / catalyst

An Algorithmic Trading Library for Crypto-Assets in Python
http://enigma.co
Apache License 2.0

failed catalyst run leaves empty file in /var/tmp/catalyst/data/poloniex/crypto_prices-USDT_BTC.csv #16

Closed rterbush closed 7 years ago

rterbush commented 7 years ago

Dear Catalyst Maintainers,

Before I tell you about my issue, let me describe my environment:

Environment

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

Running the following command fails to download data (backtrace below):

catalyst run -f buy_and_hodl.py --start 2015-3-1 --end 2017-6-28 --capital-base 100000 -o bah.pickle

I was following the getting started guide.

[2017-07-28 17:59:24.317488] INFO: Loader: Downloading benchmark data for 'USDT_BTC' from 1989-12-31 00:00:00+00:00 to 2017-07-26 00:00:00+00:00
Traceback (most recent call last):
  File "/Users/randy/.virtualenvs/catalyst2/bin/catalyst", line 11, in <module>
    sys.exit(main())
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/__main__.py", line 97, in _
    return f(*args, **kwargs)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/__main__.py", line 240, in run
    environ=os.environ,
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/utils/run_algo.py", line 196, in _run
    env, data = get_trading_env_and_data(bundles)
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/utils/run_algo.py", line 156, in get_trading_env_and_data
    environ=environ,
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/finance/trading.py", line 99, in __init__
    self.bm_symbol,
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/data/loader.py", line 135, in load_crypto_market_data
    environ,
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/data/loader.py", line 367, in ensure_crypto_benchmark_data
    if not has_data_for_dates(daily_close, first_date, last_date):
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/catalyst/data/loader.py", line 96, in has_data_for_dates
    first, last = dts[[0, -1]]
  File "/Users/randy/.virtualenvs/catalyst2/lib/python2.7/site-packages/pandas/tseries/base.py", line 276, in __getitem__
    result = getitem(key)
IndexError: index 0 is out of bounds for axis 1 with size 0



Here is how you can reproduce this issue on your machine:

## Reproduction Steps

1. Follow the getting started guide on these platforms.
2. The download of crypto_prices-USDT_BTC.csv fails, for any number of reasons.
3. Every subsequent run will fail.
...

## What steps have you taken to resolve this already?
I have confirmed that this works as expected on Debian Linux, and that it fails on these two platforms for both py2 and py3. Other users have confirmed the same behavior on macOS in Slack #dev.

lacabra commented 7 years ago

Adding more information about this issue: the file that holds the benchmark (~/.catalyst/data/USDT_BTC_benchmark.csv) is empty

ScottStevenson commented 7 years ago

Exact same issue here. I believe Zipline (which this is based on) had a similar issue when I tried to use it - due to either Google Finance or Yahoo Finance shutting down an API?

rterbush commented 7 years ago

This issue appears to be caused by failure of previous data handling ops. It can be worked around with the following procedure.

  1. Use Python 2.7.x; there appear to be compatibility issues with how files are opened in curate/poloniex.py.
  2. rm -rf /var/tmp/catalyst
  3. rm -rf ~/.catalyst
  4. catalyst ingest
  5. catalyst run ...
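
A narrower alternative to steps 2 and 3 might be to delete only the zero-byte files instead of the whole directories. The sketch below assumes that empty .csv files are the only damaged ones, which I have not verified; `remove_empty_files` is a hypothetical helper, not part of Catalyst.

```python
import os


def remove_empty_files(root, suffix=".csv"):
    """Delete zero-byte files ending in `suffix` under `root`.

    Returns the sorted list of paths that were removed. A missing
    `root` is not an error: os.walk simply yields nothing.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only touch files that match the suffix and are truly empty.
            if name.endswith(suffix) and os.path.getsize(path) == 0:
                os.remove(path)
                removed.append(path)
    return sorted(removed)


# Example call (not executed here), using the two directories from this issue:
#   remove_empty_files("/var/tmp/catalyst")
#   remove_empty_files(os.path.expanduser("~/.catalyst"))
```

After that, `catalyst ingest` and `catalyst run ...` would be repeated as in steps 4 and 5.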

After chasing this around a bit, it seems that we only check whether the files exist, not whether they have valid content, before moving on to the next step of data curation.
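
To illustrate the difference between an existence check and a content check, here is a minimal sketch. It is not the Catalyst code: `has_valid_content` is a hypothetical helper, and it assumes the cached file is a CSV with one header row followed by data rows (Python 3 syntax).

```python
import csv
import os


def has_valid_content(path, min_rows=1):
    """Return True if `path` is a CSV with a header and >= `min_rows` data rows.

    Assumption: the first row is a header. `min_rows` should be at least 1.
    """
    # An existence check alone would accept a zero-byte file.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        rows = 0
        for row in reader:
            if row:  # ignore blank lines
                rows += 1
                if rows >= min_rows:
                    return True
    return False
```

A loader could call something like this instead of `os.path.exists` and download the data again whenever it returns False.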

rterbush commented 7 years ago

Changed the title of this issue since I have now recreated it on Linux.

If a catalyst run fails, it may leave an empty file in /var/tmp/catalyst/data/poloniex/crypto_prices-USDT_BTC.csv.

This empty file will prevent any future runs from succeeding. This appears to be because we do not handle the case where the file already exists but is empty.
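
One way to avoid leaving the empty file behind in the first place would be to write the download to a temporary file and rename it only on success. This is a sketch under assumptions, not what Catalyst does: `write_atomically` and the `fetch` callable are hypothetical names, and `os.replace` requires Python 3.3 or later.

```python
import os
import tempfile


def write_atomically(path, fetch):
    """Write the bytes returned by `fetch()` to `path`.

    If `fetch` raises or returns no data, `path` is left untouched and
    the temporary file is removed, so no empty file is left behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as handle:
            data = fetch()  # hypothetical download callable returning bytes
            if not data:
                raise ValueError("download returned no data")
            handle.write(data)
        os.replace(tmp_path, path)  # publish only a complete file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern a failed run leaves either the previous good file or no file at all, so the next run starts from a clean state.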

lacabra commented 7 years ago

Addressed in commit a01bcd538a31058d1c87c146d6137c495ae2da2e, which is part of 0.1.dev7 release: 20a98a32ca8ba60f295b05423e2afb93812e84bc in the master branch.