quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

Using pd.Panel as a data source does not work as expected. #926

Closed elmq0022 closed 8 years ago

elmq0022 commented 8 years ago

I am on zipline version 0.8.3, working on Windows with Python 2.7. The items in Requirements.txt were installed via conda install and zipline was installed via pip.

I have a pd.Panel of ~500 assets that is not loading correctly into the algorithm.run(pd.Panel) routine.

Basically I was expecting the asset name (string) on the major_index to be converted to an appropriate object and mapped to a sid. This seems to only work about halfway.

The sid and symbol functions from zipline.api do seem to work after the data is loaded, but when I grab prices from the history function the resulting data frame has column names that are a list of integers (not assets?). So obviously the following does not work:

prices = history(100, '1d', 'price').dropna()

for stock, wt in zip(prices.columns, wts):
    order_target_percent(stock, wt)

I tried to work around this by casting the integer to its asset using sid(stock). This got me past the initial error, but it still fails because the cached price data only has integers as keys (instead of assets?). So when the order method tries to look up the price, it throws a KeyError (all the keys are ints and I'm passing equity assets). I believe the stack trace shows the crash in protocol.py.
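The key mismatch described above can be illustrated with a toy model. This is only a sketch: ToyAsset and the cache dict are hypothetical stand-ins, not zipline's Equity class or its price cache, and the sketch does not establish how zipline's real objects hash or compare. It only shows how a mapping keyed by integer sids can raise a KeyError when it is indexed with an object that wraps the same sid.

```python
class ToyAsset(object):
    """Hypothetical stand-in for an asset object; NOT zipline's Equity class."""

    def __init__(self, sid, symbol):
        self.sid = sid
        self.symbol = symbol

    def __repr__(self):
        return "ToyAsset(%d, symbol=%r)" % (self.sid, self.symbol)


# Hypothetical price cache keyed by plain integer sids.
cache = {0: 101.5, 1: 88.2}
aapl = ToyAsset(0, "AAPL")

try:
    cache[aapl]
    lookup_failed = False
except KeyError:
    # ToyAsset uses default identity-based equality, so it never equals the key 0.
    lookup_failed = True

# Looking up by the wrapped integer sid succeeds.
price_by_sid = cache[aapl.sid]
```

In this toy setup the lookup only works when the key type matches what the mapping was built with, which is the symptom reported here.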

So, is this just user error on my part? Or, is something else off?

Any help is greatly appreciated. Thanks!

EDIT: I went back and tried the routine with data created from the load_bars_from_yahoo function and I get the same error. I feel like something odd is happening here.

llllllllll commented 8 years ago

Hi, thanks for the bug report. I think this might be related to an issue I found in TradingAlgorithm.run where sids were getting added to the asset finder twice. I will try to repro this and get back to you with what I find.

elmq0022 commented 8 years ago

Thank you for following up.

elmq0022 commented 8 years ago

I did some additional debugging. Basically, I see bugs stemming from the transition from the sid design to the generic asset design. Is this correct?

The order function validates that it received an asset object (which is incorrect?) in the validate_order_params function.

So, I was able to backtest with a monkey-patched TradingAlgorithm by setting validate_order_params to a function that just passes. With the asset validation skipped, the order goes through based on just the provided sid.
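For readers unfamiliar with this kind of patch, here is a minimal sketch of the mechanism in plain Python. ToyAlgorithm and skip_validation are hypothetical names and do not reflect zipline's actual TradingAlgorithm internals. The sketch shows that a function assigned on an instance shadows the class method for that instance only, and that it is not bound, so it takes no self parameter.

```python
class ToyAlgorithm(object):
    """Hypothetical stand-in; NOT zipline's TradingAlgorithm."""

    def validate_order_params(self, asset, amount):
        # Toy check: reject plain integers, loosely mimicking an Asset-only rule.
        if isinstance(asset, int):
            raise ValueError("non-Asset argument is not supported")

    def order(self, asset, amount):
        self.validate_order_params(asset, amount)
        return (asset, amount)


def skip_validation(asset, amount):
    # Assigned on the instance, so Python does not bind it: no `self` here.
    pass


algo = ToyAlgorithm()
# Shadows the class method on this one instance only.
algo.validate_order_params = skip_validation

patched_result = algo.order(0, 10)

# A fresh, unpatched instance still validates.
try:
    ToyAlgorithm().order(0, 10)
    unpatched_raised = False
except ValueError:
    unpatched_raised = True
```

The trade-off is that every check in the replaced method is disabled for that instance, not just the one that was getting in the way.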

I'm sure this is not the best way to fix this, but it works (for now). I guess there are more general design issues to resolve, which are complicated by the several potential data sources.

Perhaps the construction of the data sources is the correct place to address this? Would passing a list of assets as the pd.Panel.index resolve this? I haven't tried it yet.

Please let me know if I'm on the right track.

Thanks!

llllllllll commented 8 years ago

Hey, I merged a change around asset handling for panel sources, could you retry your code with the code on master?

related pr: https://github.com/quantopian/zipline/pull/942

elmq0022 commented 8 years ago

I'll take a look as soon as I can, thanks.

On Tue, Jan 12, 2016 at 4:56 PM, Joe Jevnik notifications@github.com wrote:

Hey, I merged a change around asset handling for panel sources, could you retry your code with the code on master?

related pr: #942 https://github.com/quantopian/zipline/pull/942


AwesomeExcel commented 8 years ago

Sorry, that does not resolve the issue. As I said above, I believe this has to do with a transition from order methods requiring a sid to requiring an Asset. I have provided snippets of my code and the full stack traces as well.

The data should be correct as I am using

data = zipline.data.load_bars_from_yahoo(stocks = ['AAPL', 'CVX'], start='01-01-2010', end='01-01-2015' )

FUNCTION AND STACK TRACE WITH AN ASSET PASSED

def buy_positions(context, data):
    if context.day < context.lookback:
        return

    price = history(context.lookback, context.freq, field='price')

    # drop the symbols no longer traded
    # drop NA
    price.drop([s for s in price.columns if s not in data], axis = 1, inplace = True)
    price.dropna(axis=1, inplace=True)

    returns = np.log(price).diff(1)[1:].T

    # if returns.shape[1] >= 25:
    wt=opt_max_return(returns)
    print(wt[:10])
    for symbol, amount in zip(price.columns, wt):
        # set_trace()
        order_target_percent(sid(symbol), amount)

Traceback (most recent call last):
  File "c:\Users\X\Desktop\Project\myrepo\local_mvo.py", line 178, in <module>
    algo_obj.run(data)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 592, in run
    for perf in self.gen:
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\gens\tradesimulation.py", line 121, in transform
    self.algo.instant_fill,
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\gens\tradesimulation.py", line 298, in _process_snapshot
    new_orders = self._call_handle_data()
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\gens\tradesimulation.py", line 327, in _call_handle_data
    self.simulation_dt,
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\utils\events.py", line 209, in handle_data
    context.trading_environment,
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\utils\events.py", line 228, in handle_data
    self.callback(context, data)
  File "c:\Users\X\Desktop\Project\myrepo\local_mvo.py", line 126, in buy_positions
    order_target_percent(sid(symbol), amount)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\utils\api_support.py", line 51, in wrapped
    return getattr(get_algo_instance(), f.__name__)(*args, **kwargs)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 1186, in order_target_percent
    style=style)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 1164, in order_target_value
    target_amount = self._calculate_order_value_amount(sid, target)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 850, in _calculate_order_value_amount
    last_price = self.trading_client.current_data[asset].price
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\protocol.py", line 528, in __getitem__
    return self._data[name]
KeyError: Equity(0, symbol=u'AAPL', asset_name=None, exchange=None, start_date=Timestamp('1970-01-01 00:00:00+0000', tz='UTC'), end_date=Timestamp('2116-02-20 23:53:38.427387903+00:00', tz='UTC'), first_traded=None)
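
As a side note on the function above, the line `np.log(price).diff(1)[1:].T` computes log returns. Below is a small sketch of what that expression produces, using made-up prices and integer column labels in place of the real history() output (the values and dates are illustrative assumptions only).

```python
import numpy as np
import pandas as pd

# Illustrative prices only: rows are dates, columns are integer sids.
price = pd.DataFrame(
    {0: [100.0, 110.0, 121.0], 1: [50.0, 50.0, 25.0]},
    index=pd.date_range("2015-01-01", periods=3),
)

# Differences of log prices are log returns. The first row is NaN
# (no previous price), so it is sliced off before transposing.
returns = np.log(price).diff(1)[1:].T

# After the transpose, each row is one asset and each column is one date.
print(returns)
```

The transpose means the optimizer receives one row per asset, which is why the weights are later zipped against price.columns.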

FUNCTION AND STACK TRACE WITH A SID PASSED

def buy_positions(context, data):
    if context.day < context.lookback:
        return

    price = history(context.lookback, context.freq, field='price')

    # drop the symbols no longer traded
    # drop NA
    price.drop([s for s in price.columns if s not in data], axis = 1, inplace = True)
    price.dropna(axis=1, inplace=True)

    returns = np.log(price).diff(1)[1:].T

    # if returns.shape[1] >= 25:
    wt=opt_max_return(returns)
    print(wt[:10])
    for symbol, amount in zip(price.columns, wt):
        # set_trace()
        order_target_percent(symbol, amount)

Traceback (most recent call last):
  File "c:\Users\X\Desktop\Project\myrepo\local_mvo.py", line 178, in <module>
    algo_obj.run(data)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 592, in run
    for perf in self.gen:
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\gens\tradesimulation.py", line 121, in transform
    self.algo.instant_fill,
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\gens\tradesimulation.py", line 298, in _process_snapshot
    new_orders = self._call_handle_data()
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\gens\tradesimulation.py", line 327, in _call_handle_data
    self.simulation_dt,
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\utils\events.py", line 209, in handle_data
    context.trading_environment,
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\utils\events.py", line 228, in handle_data
    self.callback(context, data)
  File "c:\Users\X\Desktop\Project\myrepo\local_mvo.py", line 126, in buy_positions
    order_target_percent(symbol, amount)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\utils\api_support.py", line 51, in wrapped
    return getattr(get_algo_instance(), f.__name__)(*args, **kwargs)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 1186, in order_target_percent
    style=style)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 1168, in order_target_value
    style=style)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 1150, in order_target
    style=style)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 897, in order
    style)
  File "C:\Miniconda3\envs\zip_test\lib\site-packages\zipline-0.8.3+232.g0dac2e0-py2.7-win-amd64.egg\zipline\algorithm.py", line 936, in validate_order_params
    msg="Passing non-Asset argument to 'order()' is not supported."
zipline.errors.UnsupportedOrderParameters: Passing non-Asset argument to 'order()' is not supported. Use 'sid()' or 'symbol()' methods to look up an Asset.

MONKEY PATCH TO WORK AROUND THE ISSUE: PASS A SID TO ORDER_TARGET_PERCENT

def pass_validate_order_params(asset, amount, limit_price, stop_price, style):
    # No-op replacement: skips the Asset type check entirely.
    pass

algo_obj = TradingAlgorithm(initialize=initialize, handle_data=handle_data, analyze=analyze, instant_fill=True)
algo_obj.validate_order_params = pass_validate_order_params

llllllllll commented 8 years ago

Hey, we will try to take a look later this week. Sorry for the delay.

richafrank commented 8 years ago

I can reproduce this now - will look into it...

elmq0022 commented 8 years ago

Sounds great. Thanks. On Jan 19, 2016 10:00 PM, "Richard Frank" notifications@github.com wrote:

I can reproduce this now - will look into it...


richafrank commented 8 years ago

This should be fixed by #959, which is now on master.

sleepykid commented 8 years ago

I have the same problem when using a custom data source. My data source object takes in a DataFrame with a custom layout and implements mapping, raw_data_gen, and raw_data. It worked perfectly with zipline 0.7.0.

After finding this issue, I copied AwesomeExcel's monkey patch, and now I can run on zipline 0.8.4.

Is it possible to get this fixed for custom data sources too? Thanks.