quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

Error retrieving H15 interest rates - ValueError: 'Time Period' is not in list #1817

Closed rudolf-bauer closed 7 years ago

rudolf-bauer commented 7 years ago

Dear Zipline Maintainers,

Environment

Error retrieving H15 interest rates - ValueError: 'Time Period' is not in list

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

...
env = TradingEnvironment(bm_symbol=self.benchmark, exchange_tz=self.exchange_tz,
--> 539                                  trading_calendar=cal)
...

/opt/conda/envs/develop/lib/python3.5/site-packages/zipline/finance/trading.py in __init__(self, load, bm_symbol, exchange_tz, trading_calendar, asset_db_path)
     94             trading_calendar.day,
     95             trading_calendar.schedule.index,
---> 96             self.bm_symbol,
     97         )
     98 

/opt/conda/envs/develop/lib/python3.5/site-packages/zipline/data/loader.py in load_market_data(trading_day, trading_days, bm_symbol)
    169         first_date,
    170         last_date,
--> 171         now,
    172     )
    173     benchmark_returns = br[br.index.slice_indexer(first_date, last_date)]

/opt/conda/envs/develop/lib/python3.5/site-packages/zipline/data/loader.py in ensure_treasury_data(bm_symbol, first_date, last_date, now)
    317 
    318     try:
--> 319         data = loader_module.get_treasury_data(first_date, last_date)
    320         data.to_csv(path)
    321     except (OSError, IOError, HTTPError):

/opt/conda/envs/develop/lib/python3.5/site-packages/zipline/data/treasuries.py in get_treasury_data(start_date, end_date)
     74         parse_dates=['Time Period'],
     75         na_values=['ND'],  # Presumably this stands for "No Data".
---> 76         index_col=0,
     77     ).loc[
     78         start_date:end_date

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    644                     skip_blank_lines=skip_blank_lines)
    645 
--> 646         return _read(filepath_or_buffer, kwds)
    647 
    648     parser_f.__name__ = name

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    387 
    388     # Create the parser.
--> 389     parser = TextFileReader(filepath_or_buffer, **kwds)
    390 
    391     if (nrows is not None) and (chunksize is not None):

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    728             self.options['has_index_names'] = kwds['has_index_names']
    729 
--> 730         self._make_engine(self.engine)
    731 
    732     def close(self):

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
    921     def _make_engine(self, engine='c'):
    922         if engine == 'c':
--> 923             self._engine = CParserWrapper(self.f, **self.options)
    924         else:
    925             if engine == 'python':

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1434                 raise ValueError("Usecols do not match names.")
   1435 
-> 1436         self._set_noconvert_columns()
   1437 
   1438         self.orig_names = self.names

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in _set_noconvert_columns(self)
   1484                         _set(k)
   1485                 else:
-> 1486                     _set(val)
   1487 
   1488         elif isinstance(self.parse_dates, dict):

/opt/conda/envs/develop/lib/python3.5/site-packages/pandas/io/parsers.py in _set(x)
   1474 
   1475             if not is_integer(x):
-> 1476                 x = names.index(x)
   1477 
   1478             self._reader.set_noconvert(x)

ValueError: 'Time Period' is not in list

What steps have you taken to resolve this already?

Sincerely,

Rudi

ivanlen commented 7 years ago

I'm having the same problem, and I didn't change anything in the zipline code.

utsavkesharwani commented 7 years ago

Same issue here as well

sjhddh commented 7 years ago

same here

pbharrin commented 7 years ago

+1

justinlent commented 7 years ago

Me too… perhaps the source data URL changed its format? -Justin

On Fri, May 26, 2017 at 6:34 AM, Peter Harrington notifications@github.com wrote:

+1


freddiev4 commented 7 years ago

I've just merged #1818 and also ran get_treasury_data() in master and didn't have any issues. If nobody else is seeing this I'll go ahead and close this issue 😃 (feel free to re-open if you're still seeing problems)

Also https://github.com/quantopian/zipline/pull/1812

rudolf-bauer commented 7 years ago

Hi @FreddieV4, thx a lot, this solves my problems.

nmurray941 commented 6 years ago

I'm having the same problem, but in the treasury script it's already set to "label=include". What else can I do to resolve this issue?

nicolasferrari commented 6 years ago

Hello, I'm having the same issue. When I run a backtest, I get the ValueError: 'Time Period' is not in list.

The code worked before, but now it has this problem. Any clue how to solve it?

pbharrin commented 6 years ago

I just got this error and the treasury data did not get updated today. I copied the data from yesterday in the file ~/.zipline/data/treasury_curves.csv and it started working. (I made t-1's data exactly the same as t-2's) This isn't the best solution, but it isn't the worst thing in the world. Need to find the root cause.
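The copy-forward workaround described above could be scripted roughly as follows. This is a minimal sketch, not part of zipline: `pad_last_row` is a hypothetical helper, and it assumes the cache file has one header line and that each row starts with an ISO date (YYYY-MM-DD), which may not match every version of the file.

```python
import csv
from datetime import date, timedelta


def pad_last_row(path, target_date):
    """Repeat the last cached row for each missing weekday up to target_date.

    Hypothetical helper. Assumes the first line is a header and the first
    column of every data row starts with an ISO date (YYYY-MM-DD).
    Returns the number of rows appended.
    """
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    if not body:
        raise ValueError("cache file has no data rows to copy")

    last = body[-1]
    day = date.fromisoformat(last[0][:10])
    added = 0
    while day < target_date:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday to Friday only
            body.append([day.isoformat()] + last[1:])
            added += 1

    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([header] + body)
    return added
```

It would be pointed at the cache file mentioned above (~/.zipline/data/treasury_curves.csv). The copied values are stale by construction, so this only unblocks a backtest and does not fix the download.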

nicolasferrari commented 6 years ago

Thanks for the response, pbharrin. Yes, it was surprising, because a few days ago I could run the strategy and get the desired output. But today I get that error every time.

I will try your advice!! Nicolás

nicolasferrari commented 6 years ago

Just out of curiosity: why does the zipline source (the treasury function) need to retrieve that data (the H15 interest rates) in order to run a zipline strategy?

pbharrin commented 6 years ago

The calculation of Sharpe ratio is (your mean return - risk-free rate) / std, so they use the treasury data to get the risk-free rate. The risk-free rate has been effectively 0 for so long that many calculations leave it out. Many of the Zipline problems are caused by this and by the SPY data, which is only used to compare your algorithm to a benchmark. However, SPY may not be the right benchmark. What if you are trading Japanese stocks, or small caps? If you are trading a long/short strategy, then SPY isn't a good benchmark, as it assumes long only.
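To make the formula above concrete, here is a minimal sketch of a Sharpe ratio calculation. It is not zipline's own implementation: the function name, the per-period risk-free rate argument, and the annualization by 252 trading days are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of per-period returns.

    risk_free_rate is per period, in the same units as returns.
    periods_per_year=252 assumes daily returns (an assumption).
    """
    excess = [r - risk_free_rate for r in returns]
    sd = stdev(excess)  # sample standard deviation
    if sd == 0:
        return float("nan")  # undefined when returns never vary
    return mean(excess) / sd * math.sqrt(periods_per_year)
```

With `risk_free_rate=0.0` the treasury data drops out entirely, which is why leaving it out changes little while rates are near zero.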

nicolasferrari commented 6 years ago

Hello pbharrin, thanks for the explanation. I now suspect that I have problems with the SPY data. That is my sense after reading the traceback of the error I'm getting now. The error is: "IndexError: index 0 is out of bounds for axis 0 with size 0". It seems that the SPY data is not loaded properly.

The traceback of the error is:

IndexError Traceback (most recent call last)

in ()
     74     capital_base = 100000,
     75     handle_data = handle_data,
---> 76     data = panel)
     77

c:\users\nicolas\lib\site-packages\zipline\utils\run_algo.py in run_algorithm(start, end, initialize, capital_base, handle_data, before_trading_start, analyze, data_frequency, data, bundle, bundle_timestamp, trading_calendar, metrics_set, default_extension, extensions, strict_extensions, environ)
    396         metrics_set=metrics_set,
    397         local_namespace=False,
--> 398         environ=environ,
    399     )

c:\users\nicolas\lib\site-packages\zipline\utils\run_algo.py in _run(handle_data, initialize, before_trading_start, analyze, algofile, algotext, defines, data_frequency, capital_base, data, bundle, bundle_timestamp, start, end, output, trading_calendar, print_algo, metrics_set, local_namespace, environ)
    173         )
    174     else:
--> 175         env = TradingEnvironment(environ=environ)
    176         choose_loader = None
    177

c:\users\nicolas\lib\site-packages\zipline\finance\trading.py in __init__(self, load, bm_symbol, exchange_tz, trading_calendar, asset_db_path, future_chain_predicates, environ)
     97             trading_calendar.day,
     98             trading_calendar.schedule.index,
---> 99             self.bm_symbol,
    100         )
    101

c:\users\nicolas\lib\site-packages\zipline\data\loader.py in load_market_data(trading_day, trading_days, bm_symbol, environ)
    154         last_date,
    155         now,
--> 156         environ,
    157     )
    158

c:\users\nicolas\lib\site-packages\zipline\data\loader.py in ensure_treasury_data(symbol, first_date, last_date, now, environ)
    263
    264     data = _load_cached_data(filename, first_date, last_date, now, 'treasury',
--> 265                              environ)
    266     if data is not None:
    267         return data

c:\users\nicolas\lib\site-packages\zipline\data\loader.py in _load_cached_data(filename, first_date, last_date, now, resource_name, environ)
    309         data = from_csv(path)
    310         data.index = data.index.to_datetime().tz_localize('UTC')
--> 311         if has_data_for_dates(data, first_date, last_date):
    312             return data
    313

c:\users\nicolas\lib\site-packages\zipline\data\loader.py in has_data_for_dates(series_or_df, first_date, last_date)
     84     if not isinstance(dts, pd.DatetimeIndex):
     85         raise TypeError("Expected a DatetimeIndex, but got %s." % type(dts))
---> 86     first, last = dts[[0, -1]]
     87     return (first <= first_date) and (last >= last_date)
     88

c:\users\nicolas\lib\site-packages\pandas\tseries\base.py in __getitem__(self, key)
    173             attribs['freq'] = freq
    174
--> 175         result = getitem(key)
    176         if result.ndim > 1:
    177             return result

IndexError: index 0 is out of bounds for axis 0 with size 0

nicolasferrari commented 6 years ago

Any help with this?

Thanks!

pbharrin commented 6 years ago

It still looks like an issue with the Treasury data, as you have this line in the traceback: c:\users\nicolas\lib\site-packages\zipline\data\loader.py in ensure_treasury_data(symbol, first_date, last_date, now, environ).

Do you really need the treasury or SPY data? There are quick hacks that you can do to get rid of these errors.

pbharrin commented 6 years ago

Also I heard that the latest Zipline build no longer uses treasury data. Are you using the latest build?

nicolasferrari commented 6 years ago

My version of Zipline is 1.2, I think. Yes, it is an issue with the treasury data, as the treasury_curves.csv file is empty after I run the algorithm.

I don't need the treasury data. How can I get rid of this?

kanatm287 commented 5 years ago

Looks like the Fed changed the request params and the CSV file again.

def get_treasury_data(start_date, end_date):
    return pd.read_csv(
        "https://www.federalreserve.gov/datadownload/Output.aspx"
        "?rel=H15"
        "&series=bf17364827e38702b42a58cf8eaa3f78"
        "&lastObs="
        "&from="  # An unbounded query is ~2x faster than specifying dates.
        "&to="
        "&filetype=csv"
        "&label=omit"
        "&layout=seriescolumn"
        "&type=package",
        skiprows=1,  # First row are useless headers.
        parse_dates=['Time Period'],
        na_values=['ND'],  # Presumably this stands for "No Data".
        index_col=0,
    ).loc[
        start_date:end_date
    ].dropna(
        how='all'
    ).rename(
        columns=parse_treasury_csv_column
    ).tz_localize('UTC') * 0.01  # Convert from 2.57% to 0.0257.

It works with the request params above, as usual.
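Since the number of header rows to skip has changed more than once in this thread, one option is to locate the 'Time Period' row rather than hard-coding `skiprows`. The following is a minimal sketch under that assumption; `find_header_row` is a hypothetical helper, not part of zipline or pandas.

```python
import csv
import io


def find_header_row(csv_text, index_label="Time Period"):
    """Return the 0-based row index whose first field equals index_label.

    Hypothetical helper. Raises ValueError with a readable message when the
    label is missing, e.g. when the server returned an error page
    instead of a CSV file.
    """
    for i, row in enumerate(csv.reader(io.StringIO(csv_text))):
        if row and row[0].strip() == index_label:
            return i
    raise ValueError("%r not found in downloaded file" % index_label)
```

Assuming the file has no quoted fields spanning several lines, the returned index could be passed as `skiprows` to `pd.read_csv` on the same text, so a change in the number of preamble rows would no longer break parsing.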

Galsor commented 5 years ago

Thanks kanatm287. Seems to work for me!

pbharrin commented 5 years ago

Thanks kanatm287 you are a hero to a generation. It looks like the param skiprows was changed from 5 to 1, did I miss anything else?

redallica commented 5 years ago

@pbharrin Has label been changed to omit again? ("&label=omit")

Just changed skiprows to 1 and label to omit, and it is still not working for me.

Edit: actually, I just changed skiprows = 1 and kept label = include, and it is still not working. I don't know what else to do. Any help would be greatly appreciated.

nateGeorge commented 5 years ago

Currently I'm getting this error, I think because the website isn't working properly right now. Shouldn't we also be catching the ValueError in the get_treasury_data() function in treasuries.py? Maybe the data should be cached elsewhere? Or a copy updated with each zipline release? We don't seem to be able to run zipline backtesting at all if we can't download this file at least once.

The URL seems to be

https://www.federalreserve.gov/datadownload/Output.aspx?rel=H15&series=bf17364827e38702b42a58cf8eaa3f78&lastObs=&from=&to=&filetype=csv&label=include&layout=seriescolumn&type=package

screenshot from 2019-02-13 19-38-19
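The suggestion in this comment (catch the ValueError and fall back to a cached copy) could look roughly like the sketch below. It is not zipline code: `fetch_or_cached` and its `fetch` callable are hypothetical, and the real loader caches a DataFrame rather than raw text.

```python
import os


def fetch_or_cached(fetch, cache_path):
    """Return (csv_text, is_fresh), preferring a fresh download.

    fetch is a hypothetical zero-argument callable that returns CSV text.
    On OSError (which covers network errors in Python 3) or ValueError
    (e.g. the 'Time Period' parse failure), the cached file is used
    instead, provided it exists and is not empty.
    """
    try:
        text = fetch()
    except (OSError, ValueError) as exc:
        if os.path.exists(cache_path) and os.path.getsize(cache_path) > 0:
            with open(cache_path) as f:
                return f.read(), False
        raise RuntimeError(
            "download failed and no usable cache at %s" % cache_path
        ) from exc

    # Only overwrite the cache after a successful download.
    with open(cache_path, "w") as f:
        f.write(text)
    return text, True
```

Writing the cache only after a successful fetch is meant to avoid the empty treasury_curves.csv reported earlier in the thread. The first-ever run still needs one working download or a bundled copy.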

imhgchoi commented 5 years ago

Same issue here... Does this happen periodically?

nateGeorge commented 5 years ago

Actually, this seems to be happening on another machine I have, with custom CSV data, where it was working before. Hopefully the Fed site is fixed tomorrow, but it would be good to have a better solution for handling this problem in the future.

Backtesting seems to work for the quandl dataset, but not for a custom CSV dataset on a computer where zipline was installed and where I ran the last backtest sometime last October.

imhgchoi commented 5 years ago

That's exactly what is going on for me too. I'm also using a custom CSV file for the data bundle, and it worked just fine until yesterday. I hope this isn't a long-term issue.

By the way, does anybody know if it's necessary to download data from the Fed site? I'm using a dataset totally unrelated to the US Fed, so maybe I can somehow skip the request to the site? I'm not sure what kind of data the zipline code was trying to import from the Fed, since I haven't looked into it previously, but I'm wondering if there are any workarounds.

ksyme99 commented 5 years ago

I have just this morning started getting this error also - not sure what started causing it since one minute it worked and the next it didn't. I'm using a custom data bundle and have no need to actually use the treasury data. Going to try and hack it out for now but having this fail more gracefully would be good.

praghavan039 commented 5 years ago

I am facing the same issue: 'Time Period' is not in list

pbharrin commented 5 years ago

Just a reminder the treasury data gives us the "risk free rate" which is used in the Sharpe calculation.

nateGeorge commented 5 years ago

Fix for the future times when the Fed's site is down: https://github.com/nateGeorge/treasury_data_backup