quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

Dividends and splits in a minute-level CSV data bundle #2453

Open 11k opened 5 years ago

11k commented 5 years ago

Dear Zipline Maintainers,

Before I tell you about my issue, let me describe my environment:

Environment

* Operating System: `macOS High Sierra 10.13.6`
* Python Version: `3.5.7`
* Python Bitness: `64`
* How did you install Zipline: `pip install zipline`
* Python packages:

```
alembic==1.0.8
bcolz==0.12.1
Bottleneck==1.2.1
certifi==2019.3.9
chardet==3.0.4
Click==7.0
contextlib2==0.5.5
cyordereddict==1.0.0
Cython==0.29.6
decorator==4.4.0
empyrical==0.5.0
idna==2.8
intervaltree==3.0.2
Logbook==1.4.3
lru-dict==1.1.6
lxml==4.3.3
Mako==1.0.8
MarkupSafe==1.1.1
mock==2.0.0
multipledispatch==0.6.0
networkx==1.11
numexpr==2.6.9
numpy==1.16.2
pandas==0.22.0
pandas-datareader==0.7.0
patsy==0.5.1
pbr==5.1.3
python-dateutil==2.8.0
python-editor==1.0.4
pytz==2018.9
requests==2.21.0
requests-file==1.4.3
scipy==1.2.1
six==1.12.0
sortedcontainers==2.1.0
SQLAlchemy==1.3.2
statsmodels==0.9.0
tables==3.5.1
toolz==0.9.0
trading-calendars==1.7.0
urllib3==1.24.1
wrapt==1.11.1
zipline==1.3.0
```

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

I have some minute-level CSV data I'd like to import into zipline as a data bundle. The prices in each row are unadjusted for dividends and splits. I'm looking for some guidance on how to properly set the values in the dividend and split columns. For instance, if a dividend has a date of 3/1/19 for $0.29, would I simply put 0.29 for the first minute of that day?

date,open,high,low,close,volume,dividend,split
...
2019-02-28 20:55:00,11.065,11.07,11.05,11.065,82716,0.0,1.0
2019-02-28 20:56:00,11.065,11.07,11.06,11.06,70202,0.0,1.0
2019-02-28 20:57:00,11.06,11.07,11.05,11.065,63063,0.0,1.0
2019-02-28 20:58:00,11.065,11.07,11.06,11.07,48585,0.0,1.0
2019-02-28 20:59:00,11.07,11.09,11.0601,11.08,219572,0.0,1.0
2019-02-28 21:00:00,11.15,11.15,11.07,11.1,115408,0.0,1.0
2019-03-01 14:31:00,11.0905,11.1,11.03,11.03,53544,0.29,1.0 # Dividend
2019-03-01 14:32:00,11.04,11.06,11.03,11.05,39980,0.0,1.0
2019-03-01 14:33:00,11.06,11.11,11.06,11.11,184266,0.0,1.0
2019-03-01 14:34:00,11.11,11.145,11.1,11.11,70435,0.0,1.0
...
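
For reference, the layout asked about above (a non-zero dividend only on the first bar of the event date, `0.0` everywhere else, and a neutral split of `1.0`) can be produced with a short script. This is only a sketch of that convention: `add_event_columns` is a hypothetical helper, not part of zipline, and whether zipline reads the first-minute row as the dividend date is exactly the open question here.

```python
import csv
import io

# Four bars taken from the sample above, without the event columns.
RAW = """date,open,high,low,close,volume
2019-02-28 20:59:00,11.07,11.09,11.0601,11.08,219572
2019-02-28 21:00:00,11.15,11.15,11.07,11.1,115408
2019-03-01 14:31:00,11.0905,11.1,11.03,11.03,53544
2019-03-01 14:32:00,11.04,11.06,11.03,11.05,39980
"""

FIELDS = ["date", "open", "high", "low", "close", "volume", "dividend", "split"]


def add_event_columns(rows, event_date, dividend=0.0, split=1.0):
    """Hypothetical helper: add dividend/split columns to minute bars.

    Every row gets the neutral values (dividend 0.0, split 1.0) except the
    first bar whose timestamp starts with `event_date` ('YYYY-MM-DD').
    `rows` must already be sorted by timestamp.
    """
    out = []
    placed = False
    for row in rows:
        row = dict(row, dividend="0.0", split="1.0")
        if not placed and row["date"].startswith(event_date):
            row["dividend"] = str(dividend)
            row["split"] = str(split)
            placed = True
        out.append(row)
    if not placed:
        raise ValueError("no bars found on " + event_date)
    return out


rows = list(csv.DictReader(io.StringIO(RAW)))
marked = add_event_columns(rows, "2019-03-01", dividend=0.29)

# Write the result back out in the column order used in this issue.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
writer.writeheader()
writer.writerows(marked)
print(buf.getvalue())
```

The same helper covers the split case by passing `split=2.0` instead of `dividend=0.29`.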

And how would I set the split value? Say a stock has a 2-for-1 split (two shares given for each share owned) dated 3/1/19. Is this the proper way to convey that?

date,open,high,low,close,volume,dividend,split
...
2019-02-28 20:55:00,11.065,11.07,11.05,11.065,82716,0.0,1.0
2019-02-28 20:56:00,11.065,11.07,11.06,11.06,70202,0.0,1.0
2019-02-28 20:57:00,11.06,11.07,11.05,11.065,63063,0.0,1.0
2019-02-28 20:58:00,11.065,11.07,11.06,11.07,48585,0.0,1.0
2019-02-28 20:59:00,11.07,11.09,11.0601,11.08,219572,0.0,1.0
2019-02-28 21:00:00,11.15,11.15,11.07,11.1,115408,0.0,1.0
2019-03-01 14:31:00,11.0905,11.1,11.03,11.03,53544,0.0,2.0 # Split
2019-03-01 14:32:00,11.04,11.06,11.03,11.05,39980,0.0,2.0
2019-03-01 14:33:00,11.06,11.11,11.06,11.11,184266,0.0,2.0
2019-03-01 14:34:00,11.11,11.145,11.1,11.11,70435,0.0,2.0
...
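
One thing worth checking in the table above is that `2.0` is repeated on every bar after the split. If a loader treats each row whose split value differs from `1.0` as its own event, those four rows would count as four separate 2-for-1 splits. Whether zipline's CSV bundle loader reads the column that way is an assumption here, so it should be confirmed against the bundle source of the installed version. The sketch below only counts the rows under that assumption; `split_event_rows` and `compounded_ratio` are hypothetical helpers.

```python
import csv
import io

# Split value repeated on every bar after the event, as in the table above.
REPEATED = """date,close,split
2019-02-28 21:00:00,11.1,1.0
2019-03-01 14:31:00,11.03,2.0
2019-03-01 14:32:00,11.05,2.0
2019-03-01 14:33:00,11.11,2.0
2019-03-01 14:34:00,11.11,2.0
"""

# Split value set on a single bar only.
SINGLE = """date,close,split
2019-02-28 21:00:00,11.1,1.0
2019-03-01 14:31:00,11.03,2.0
2019-03-01 14:32:00,11.05,1.0
2019-03-01 14:33:00,11.11,1.0
2019-03-01 14:34:00,11.11,1.0
"""


def split_event_rows(text):
    """Return (timestamp, split) for every row whose split is not 1.0.

    ASSUMPTION: each such row is read as a separate split event.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [(r["date"], float(r["split"]))
            for r in reader
            if float(r["split"]) != 1.0]


def compounded_ratio(events):
    """Multiply the split values of all events together."""
    ratio = 1.0
    for _, value in events:
        ratio *= value
    return ratio


repeated_events = split_event_rows(REPEATED)
single_events = split_event_rows(SINGLE)

print(len(repeated_events), compounded_ratio(repeated_events))
print(len(single_events), compounded_ratio(single_events))
```

Under that reading, the repeated layout compounds to a 16-for-1 split, while flagging a single bar gives the intended 2-for-1.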

Anything else?

Additionally, on which "date" of the dividend/split should the values be set? The payable date, "ex" date, or record date?

Sincerely, 11k

HamedShafiee commented 5 years ago

And how can you ingest minute data with splits/dividends? I get an error whenever the dividend column has a value other than zero!

angelorodem commented 4 years ago

It's a pity that no one has answered this. There is no documentation on it.