scrtlabs / catalyst

An Algorithmic Trading Library for Crypto-Assets in Python
http://enigma.co
Apache License 2.0

fail to export minute data as a csv #101

Open apastore opened 6 years ago

apastore commented 6 years ago

Dear Catalyst Maintainers,

Before I tell you about my issue, let me describe my environment:

Environment

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

Unable to export minute data to a CSV file.

I got this error; the full traceback is reproduced in the Error message section below.


Here is how you can reproduce this issue on your machine:

Reproduction Steps

1) catalyst ingest-exchange -x poloniex -f minute -i btc_usdt
2) python get.coin.py poloniex btc_usdt

$ cat get.coin.py

import os
import csv
import pytz
from datetime import datetime
import sys

from catalyst.api import record, symbol, symbols
from catalyst.utils.run_algo import run_algorithm

print sys.argv[1]
print sys.argv[2]
pair = sys.argv[2]
exchange_ID = sys.argv[1]

def initialize(context):
    # Portfolio assets list
    context.asset = symbol(pair)    # Bitcoin on Poloniex
    context.csvfile = open(os.path.splitext(
        os.path.basename(pair))[0] + '_' + exchange_ID + '.csv', 'w+')
    context.csvwriter = csv.writer(context.csvfile)

def handle_data(context, data):
    # Variables to record for a given asset: price and volume
    # Other options include 'open', 'high', 'low', 'close'
    # Please note that 'price' equals 'close'
    date   = context.blotter.current_dt     # current time in each iteration
    price  = data.current(context.asset, 'price')
    volume = data.current(context.asset, 'volume')

    # Writes one line to CSV on each iteration with the chosen variables
    context.csvwriter.writerow([date, price, volume])

def analyze(context=None, results=None):
    # Close open file properly at the end
    context.csvfile.close()

# Bitcoin data is available from 2015-3-2. Dates vary for other tokens.
start = datetime(2017, 12, 1, 0, 0, 0, 0, pytz.utc)
end = datetime.now(pytz.utc)
print start
print end
results = run_algorithm(
    initialize=initialize,
    handle_data=handle_data,
    analyze=analyze,
    start=start,
    end=end,
    exchange_name=exchange_ID,
    data_frequency='minute',
    base_currency='usdt',
    capital_base=10000
)

Thanks a lot for your help, alessandro

Error message

[2017-12-15 16:42:37.395114] INFO: run_algo: running algo in paper-trading mode
[2017-12-15 16:42:38.619044] WARNING: Loader: Refusing to download new treasury data because a download succeeded at 2017-12-15 16:39:31+00:00.
[2017-12-15 16:42:38.620775] INFO: exchange_algorithm: initialized trading algorithm in backtest mode
Traceback (most recent call last):
  File "get.coin.py", line 50, in <module>
    capital_base=10000 )
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/utils/run_algo.py", line 563, in run_algorithm
    stats_output=stats_output
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/utils/run_algo.py", line 346, in _run
    overwrite_sim_params=False,
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/exchange/exchange_algorithm.py", line 318, in run
    data, overwrite_sim_params
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/algorithm.py", line 724, in run
    for perf in self.get_generator():
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/gens/tradesimulation.py", line 243, in transform
    self._get_minute_message(dt, algo, algo.perf_tracker)
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/gens/tradesimulation.py", line 303, in _get_minute_message
    dt, self.data_portal,
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/finance/performance/tracker.py", line 357, in handle_minute_close
    account.leverage)
  File "/Users/pastorea/miniconda3/envs/catalyst/lib/python2.7/site-packages/catalyst/finance/risk/cumulative.py", line 212, in update
    self.mean_benchmark_returns_cont[dt_loc] * 252
RuntimeWarning: overflow encountered in double_scalars
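
For reference, the record() helper that the script imports (but never calls) gives a shorter export path: recorded values surface as columns of the performance DataFrame that run_algorithm returns and that analyze receives, so the CSV can be written once at the end with pandas. This is only a minimal sketch under that assumption, and it still runs through the same performance-tracking code, so it does not avoid the overflow above:

import pytz
from datetime import datetime

from catalyst.api import record, symbol
from catalyst.utils.run_algo import run_algorithm

def initialize(context):
    context.asset = symbol('btc_usdt')    # Bitcoin on Poloniex

def handle_data(context, data):
    # Recorded values become columns of the results DataFrame.
    record(price=data.current(context.asset, 'price'),
           volume=data.current(context.asset, 'volume'))

def analyze(context=None, results=None):
    # Dump the recorded columns to CSV in one call at the end of the run.
    results[['price', 'volume']].to_csv('btc_usdt_poloniex.csv')

results = run_algorithm(
    initialize=initialize,
    handle_data=handle_data,
    analyze=analyze,
    start=datetime(2017, 12, 1, 0, 0, 0, 0, pytz.utc),
    end=datetime(2017, 12, 2, 0, 0, 0, 0, pytz.utc),
    exchange_name='poloniex',
    data_frequency='minute',
    base_currency='usdt',
    capital_base=10000
)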

fredfortier commented 6 years ago

It's having an issue with the benchmark data. We will review further shortly.

andresespinosapc commented 6 years ago

How is the fix for this issue going? I can't use the library because of this :(

jonathan-s commented 6 years ago

Also having this issue. For btc_usdt it throws the error on the 7th of December at 20:49.

fredfortier commented 6 years ago

It looks like one of those issues with empyrical. I thought we had worked around most of them, but it's not clean. We'll need to rebase to the latest zipline, which includes a more recent version of the empyrical library. Can you explain a bit more what you are trying to accomplish, though? I'm not familiar with csvexport. Do you get this issue when running a regular algo or when exporting something?
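
As a quick sanity check against the rebase mentioned above, the installed versions of the three libraries in the traceback can be printed from the same interpreter. A minimal sketch using pkg_resources; the PyPI distribution names are assumed and may differ for a source checkout:

import pkg_resources

# Assumed PyPI distribution names; adjust for a development install.
for name in ('enigma-catalyst', 'empyrical', 'zipline'):
    try:
        print('%s %s' % (name, pkg_resources.get_distribution(name).version))
    except pkg_resources.DistributionNotFound:
        print('%s not installed' % name)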


jonathan-s commented 6 years ago
import os
import csv
import pytz
from datetime import datetime

from catalyst.api import record, symbol, symbols
from catalyst.utils.run_algo import run_algorithm

def initialize(context):
    # Portfolio assets list
    context.asset = symbol('btc_usdt') # Bitcoin on Poloniex

    # Creates a .CSV file with the same name as this script to store results
    context.csvfile = open(os.path.splitext(os.path.basename('btc_usdt_error'))[0]+'.csv', 'w+')
    context.csvwriter = csv.writer(context.csvfile)
    context.csvwriter.writerow(['date', 'open', 'close', 'high', 'low', 'volume'])

def handle_data(context, data):
    # Variables to record for a given asset: price and volume
    # Other options include 'open', 'high', 'low', 'close'
    # Please note that 'price' equals 'close'
    date   = context.blotter.current_dt     # current time in each iteration
    opened = data.current(context.asset, 'open')
    close = data.current(context.asset, 'close')
    high = data.current(context.asset, 'high')
    low = data.current(context.asset, 'low')
    volume = data.current(context.asset, 'volume')

    # Writes one line to CSV on each iteration with the chosen variables
    context.csvwriter.writerow([date, opened, close, high, low, volume])

def analyze(context=None, results=None):
    # Close open file properly at the end
    context.csvfile.close()

# Bitcoin data is available from 2015-3-2. Dates vary for other tokens.
start = datetime(2017, 12, 6, 0, 0, 0, 0, pytz.utc)
end = datetime(2017, 12, 8, 0, 0, 0, 0, pytz.utc)
results = run_algorithm(
    initialize=initialize,
    handle_data=handle_data,
    analyze=analyze,
    start=start,
    end=end,
    exchange_name='poloniex',
    data_frequency='minute',
    base_currency='usdt',
    capital_base=10000
)

Yes, effectively exporting the data to a csv.

You can run the above script and it will throw the error. I don't really know what happens in cumulative.py or what mean_benchmark_returns does, but suffice it to say, it generates a very big number while the above script runs, about 1.5 * 10^305, and when you multiply that by 252 you get the overflow.
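
For context, a 64-bit float tops out near 1.8 * 10^308, so once the cumulative benchmark series reaches magnitudes in that neighborhood, the * 252 annualization on line 212 of cumulative.py is what tips it over. A short numpy illustration of the same class of warning, using a made-up value rather than Catalyst data:

import numpy as np

print(np.finfo(np.float64).max)   # ~1.797e308, the largest representable double

x = np.float64(1e307)             # a benchmark value that has already blown up
print(x * 252)                    # inf, with a RuntimeWarning about overflow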

fredfortier commented 6 years ago

OK, thanks. This helps us troubleshoot further. We'll get back to you.


zackgow commented 6 years ago

I had the same error earlier today while running a different script. When I changed my start_date, the issue was resolved. My theory was that the start date had some missing data, and that led to a divide by zero error in a pct_change calculation or something similar.
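
That mechanism is easy to illustrate outside Catalyst: if a missing bar were stored as a zero price, the next bar's percentage change divides by zero and comes out as inf, which then poisons any cumulative statistic built on top of it. A pandas-only sketch with hypothetical prices, not ingested data:

import pandas as pd

# Hypothetical minute closes in which one missing bar was stored as 0.0.
prices = pd.Series([16000.0, 0.0, 16150.0, 16200.0])

returns = prices.pct_change()
print(returns)                  # the bar right after the zero comes out as +inf
print(returns.mean() * 252)     # the inf propagates into any annualized statistic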

andresespinosapc commented 6 years ago

@zackgow It's not due to missing data. I do not get the error from 2017-12-02 to 2017-12-04, nor from 2017-12-03 to 2017-12-05, but I do get it from 2017-12-01 to 2017-12-05.

zackgow commented 6 years ago

@andresespinosapc 2017-12-01 was the date that gave me the error too. Changing that fixed my issues.

fredfortier commented 6 years ago

Thanks guys, this helps narrow down the root cause. Looking into it at the moment.


salihkilic commented 6 years ago

Not sure if this issue is still open, but I'm getting the same problem at 2017-12-08 07:45 when I do a backtest from 2017-12-03 to 2017-12-09. Hope this helps.

yeshymanoharan commented 6 years ago

I am also getting a similar error which doesn't seem connected to a specific date, as I have other strategies that work with that date range, and the error always pops up three days after the start date. Is there a solution to this, maybe a way to prevent calculations like mean_benchmark_returns_cont from being made that would cause the overflow issue?
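
No Catalyst option for skipping that calculation comes up in this thread. The only generic knob is numpy's own floating-point error handling, which changes how the overflow is reported but does not repair the runaway benchmark values behind it; a hedged sketch, to be run before run_algorithm:

import numpy as np

# Only changes how numpy reports the overflow; it does not fix the benchmark data.
np.seterr(over='ignore')    # or over='raise' to turn the warning into a hard error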