quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

In zipline.run_algorithm(...) start and end date doesn't consider the time #2202

Open jjaviergalvez opened 6 years ago

jjaviergalvez commented 6 years ago

Dear Zipline Maintainers,

Before I tell you about my issue, let me describe my environment: name: myenv channels:

Environment

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

I'm trying to use local files with zipline. My local file contains minute-by-minute data, but when I run the algorithm with zipline.run_algorithm(...) I get an error saying that some dates are not in my data.

Here is how you can reproduce this issue on your machine:

Reproduction Steps

  1. Install my environment in conda
  2. Download the data from https://drive.google.com/file/d/1eGc5VaChqaPwLBQr_JZuQ_kb5YIa5NTR/view?usp=sharing
  3. Run the following code
    
    import pandas as pd
    from collections import OrderedDict
    import pytz

    data = OrderedDict()
    data['SPY'] = pd.read_csv('AAPL.csv', index_col=0, parse_dates=[['Date', 'Timestamp']])
    data['SPY'] = data['SPY'][['OpenPrice', 'HighPrice', 'LowPrice', 'ClosePrice', 'TotalVolume']]
    panel = pd.Panel(data)
    panel.minor_axis = ['open', 'high', 'low', 'close', 'volume']
    panel.major_axis = panel.major_axis.tz_localize(pytz.utc)

    from zipline.api import order, record, symbol, set_benchmark
    import zipline
    from datetime import datetime

    def initialize(context):
        set_benchmark(symbol("SPY"))

    def handle_data(context, data):
        order(symbol("SPY"), 10)
        record(SPY=data.current(symbol('SPY'), 'price'))

    # use start=datetime(2018, 1, 2, 9, 33, 0, 0, pytz.utc) to see that the time doesn't matter
    perf = zipline.run_algorithm(start=datetime(2018, 1, 3, 9, 33, 0, 0, pytz.utc),
                                 end=datetime(2018, 2, 4, 9, 40, 0, 0, pytz.utc),
                                 initialize=initialize,
                                 capital_base=100000,
                                 handle_data=handle_data,
                                 data_frequency='minute',
                                 data=panel)


...

What steps have you taken to resolve this already?

I tried to install zipline via conda

...

Anything else?

Thanks in advance
...

Sincerely,
Javier Gálvez

jjaviergalvez commented 6 years ago

I tried with the following custom calendar:

from datetime import time
from zipline.utils.memoize import lazyval
from pandas.tseries.offsets import CustomBusinessDay
from pytz import timezone
from .trading_calendar import TradingCalendar

class TwentyFourHR(TradingCalendar):
    """
    Exchange calendar for 24/7 trading.

    Open Time: 12am, UTC
    Close Time: 11:59pm, UTC

    """
    @property
    def name(self):
        return "twentyfourhr"

    @property
    def tz(self):
        return timezone("UTC")

    @property
    def open_time(self):
        return time(9, 31)

    @property
    def close_time(self):
        return time(16)

    @lazyval
    def day(self):
        return CustomBusinessDay(
            weekmask='Mon Tue Wed Thu Fri',
        )

and it seems to work. So I suspect that part of the problem comes from the NYSEExchangeCalendar. I don't see why, since that class declares close_time as

@property
def close_time(self):
    return time(16)

So why is the error raised at 16:01?

I changed it to return time(16, 0) and even time(15), but the same error pops up.
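One way to narrow down the 16:01 error might be to list the rows in the data whose time of day falls outside the window the calendar declares. The sketch below uses only pandas and does not call zipline. The helper name `rows_outside_session` and the sample index are hypothetical, and it assumes the index is expressed in the same timezone as the open and close times.

```python
from datetime import time

import numpy as np
import pandas as pd


def rows_outside_session(frame, open_time=time(9, 31), close_time=time(16, 0)):
    """Return the rows whose time of day lies outside [open_time, close_time]."""
    # DatetimeIndex.time yields one datetime.time object per row.
    mask = np.array(
        [(t < open_time) or (t > close_time) for t in frame.index.time],
        dtype=bool,
    )
    return frame.loc[mask]


# Hypothetical sample: two bars inside the window, one a minute after the close.
index = pd.to_datetime(
    ["2018-01-03 09:31", "2018-01-03 16:00", "2018-01-03 16:01"]
)
sample = pd.DataFrame({"close": [1.0, 2.0, 3.0]}, index=index)
stray = rows_outside_session(sample)
print(stray)
```

If this returns any rows for the real file, those bars are candidates for what the calendar rejects.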

The other issue is that the start parameter of run_algorithm(...) ignores the time of day.
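A possible explanation, offered as an assumption rather than confirmed behaviour: if zipline treats start and end as session labels, it would normalize them to midnight, which discards the time of day. The pandas-only sketch below shows what such a normalization does to the start value from the reproduction; it does not call zipline.

```python
import pandas as pd

# The start value from the reproduction script, as a pandas Timestamp.
start = pd.Timestamp("2018-01-03 09:33", tz="UTC")

# normalize() resets the time of day to 00:00 and keeps the date and timezone.
session_label = start.normalize()
print(session_label)
```

Under that assumption, 09:33 and any other time on the same day would map to the same session.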

freddiev4 commented 6 years ago

Hi @jjaviergalvez, sorry for the delayed response. I haven't yet had time to look into this in depth, but if you already have data in a .csv file, you can ingest it using the csvdir bundle, which we have docs for here.

Let me know if any of those docs are unclear.
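As a rough sketch of what preparing the file for the csvdir bundle might involve: the bundle reads one CSV per symbol with the columns date, open, high, low, close, volume, dividend and split. The helper below, whose name `to_csvdir_frame` is hypothetical, renames the columns used in the reproduction script and fills in neutral dividend and split values. The sample row and the Date/Timestamp string formats are assumptions, since the original file is not shown in this thread.

```python
import io

import pandas as pd

# Column names taken from the reproduction script, mapped to csvdir names.
COLUMN_MAP = {
    "OpenPrice": "open",
    "HighPrice": "high",
    "LowPrice": "low",
    "ClosePrice": "close",
    "TotalVolume": "volume",
}
CSVDIR_COLUMNS = ["date", "open", "high", "low", "close", "volume", "dividend", "split"]


def to_csvdir_frame(raw):
    """Reshape a raw minute-bar frame into the csvdir column layout."""
    out = raw.rename(columns=COLUMN_MAP)
    # Assumes Date and Timestamp are strings such as "2018-01-03" and "09:31:00".
    out["date"] = pd.to_datetime(raw["Date"] + " " + raw["Timestamp"])
    out["dividend"] = 0.0  # no dividend information in the source file
    out["split"] = 1.0     # 1.0 means no split
    return out[CSVDIR_COLUMNS]


# Hypothetical one-row sample standing in for the real file.
sample_csv = io.StringIO(
    "Date,Timestamp,OpenPrice,HighPrice,LowPrice,ClosePrice,TotalVolume\n"
    "2018-01-03,09:31:00,172.5,172.6,172.4,172.55,1000\n"
)
raw = pd.read_csv(sample_csv)
frame = to_csvdir_frame(raw)
print(frame.to_csv(index=False))
```

The resulting frame could then be written with `frame.to_csv(..., index=False)`; the csvdir docs describe where the bundle expects the file to live.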