quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

Initialize algorithm start/end dates before running the initialize method #418

Open hahnicity opened 9 years ago

hahnicity commented 9 years ago

When I run the zipline CLI, the context in my initialize function looks like this:

TradingAlgorithm(
    capital_base=10000000.0
    sim_params=
SimulationParameters(
    period_start=2006-01-01 00:00:00+00:00,
    period_end=2006-12-31 00:00:00+00:00,
    capital_base=10000000.0,
    data_frequency=daily,
    emission_rate=daily,
    first_open=2006-01-03 14:31:00+00:00,
    last_close=2006-12-29 21:00:00+00:00),

These must be the default values. In handle_data, however, the context matches my actual simulation:

TradingAlgorithm(
    capital_base=10000000.0
    sim_params=
SimulationParameters(
    period_start=2010-01-04 00:00:00+00:00,
    period_end=2012-12-31 00:00:00+00:00,
    capital_base=10000000.0,
    data_frequency=daily,
    emission_rate=daily,
    first_open=2010-01-04 14:31:00+00:00,
    last_close=2012-12-31 21:00:00+00:00),

Note that the start and end dates differ: the second set is the dates I am actually running my simulation under. That is fine if we are only consuming price data from Yahoo, but I would like to pull in additional data from outside sources that zipline does not provide natively. Naturally, I would like to do this in the initialize step, but I can't, because initialize doesn't see dates that reflect my actual simulation parameters. I can do something hacky in handle_data, like:

def handle_data(context, data):
    # Hacky: fetch the outside data only on the bar that matches the simulation start.
    if data[<STOCK>][dt] == context.period_start:
        context.new_data = gather_data_from_outside_sources(context.period_start, context.period_end)

But this feels like a step that belongs in initialize. Alternatively, you could add a pre_run step, but that feels like much the same thing as calling the initialize function.
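For reference, the one-time load in handle_data can be sketched without zipline at all. This is only an illustration under stated assumptions: the context here is a plain namespace rather than a real TradingAlgorithm, the dates are assumed to be reachable as context.sim_params.period_start and context.sim_params.period_end (as the repr output suggests), and gather_data_from_outside_sources and fetch_calls are hypothetical stand-ins. A guard on context.new_data is used in place of a timestamp comparison.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

fetch_calls = []  # records each external fetch so the one-time behavior is visible


def gather_data_from_outside_sources(start, end):
    # Hypothetical stand-in for the real external data fetch.
    fetch_calls.append((start, end))
    return {"start": start, "end": end}


def handle_data(context, data):
    # Fetch once, on the first bar, when the real simulation dates are available.
    if getattr(context, "new_data", None) is None:
        params = context.sim_params
        context.new_data = gather_data_from_outside_sources(
            params.period_start, params.period_end
        )


# Toy context standing in for the algorithm object (not a real TradingAlgorithm).
context = SimpleNamespace(
    sim_params=SimpleNamespace(
        period_start=datetime(2010, 1, 4, tzinfo=timezone.utc),
        period_end=datetime(2012, 12, 31, tzinfo=timezone.utc),
    )
)

for bar in range(3):  # three simulated bars
    handle_data(context, data={})
```

The guard avoids depending on a bar timestamp being exactly equal to period_start, but it still runs setup logic inside the per-bar callback, which is the part that feels out of place.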

twiecki commented 9 years ago

This seems related to #421.

We have thought about calling initialize not in __init__ but in run() just before you actually start the simulation. Would this help you here?
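A rough sketch of what deferring initialize to run() could look like, using a toy class rather than zipline's actual code. ToyAlgorithm, its run(bars, start, end) signature, and the DEFAULT_START / DEFAULT_END constants are assumptions made up for this illustration; the point is only that initialize would see the real simulation dates if it were called after they are set.

```python
from datetime import datetime, timezone

# Defaults mirroring the dates seen in initialize in the report above.
DEFAULT_START = datetime(2006, 1, 1, tzinfo=timezone.utc)
DEFAULT_END = datetime(2006, 12, 31, tzinfo=timezone.utc)


class ToyAlgorithm:
    """Toy stand-in for an algorithm class; not zipline's TradingAlgorithm."""

    def __init__(self, initialize, handle_data):
        self._initialize = initialize
        self._handle_data = handle_data
        self.period_start = DEFAULT_START
        self.period_end = DEFAULT_END
        # initialize is deliberately NOT called here.

    def run(self, bars, start, end):
        # Apply the real simulation dates first, then call the user's initialize.
        self.period_start, self.period_end = start, end
        self._initialize(self)
        for bar in bars:
            self._handle_data(self, bar)


def initialize(context):
    # Record the dates visible at initialize time.
    context.seen_dates = (context.period_start, context.period_end)


def handle_data(context, bar):
    pass


algo = ToyAlgorithm(initialize, handle_data)
run_start = datetime(2010, 1, 4, tzinfo=timezone.utc)
run_end = datetime(2012, 12, 31, tzinfo=timezone.utc)
algo.run(bars=[], start=run_start, end=run_end)
```

With this ordering, an external data fetch placed in initialize would receive the 2010 to 2012 range rather than the 2006 defaults.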