Error: No bundle registered - even though it actually is. #2615

Open jaepil-choi opened 4 years ago

jaepil-choi commented 4 years ago

Dear Zipline Maintainers,

Before I tell you about my issue, let me describe my environment:


* Operating System: (Windows Version or `$ uname --all`): **Windows-10-10.0.18362-SP0**
* Python Version: `$ python --version`: **Python 3.5.6 :: Anaconda, Inc.** : Running on Jupyter Lab.
* Python Bitness: `$ python -c 'import math, sys;print(int(math.log(sys.maxsize + 1, 2) + 1))'` **64**
* How did you install Zipline: (`pip`, `conda`, or `other (please explain)`) **conda clean install within env python=3.5**

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

Here is how you can reproduce this issue on your machine:

Reproduction Steps

  1. Get AAPL data from Yahoo using pandas_datareader
import pandas as pd
import pandas_datareader.data as web
from zipline.api import order, symbol, record
from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities
import datetime
import matplotlib.pyplot as plt
import os, sys, platform

start = datetime.datetime(2015, 1, 1)
end = datetime.datetime(2020, 1, 10)

data = web.DataReader('AAPL', 'yahoo', start, end)
  1. Change column names to fit zipline's data format (OHLCV + dividend, split)
data = data[['Open', 'High', 'Low', 'Adj Close', 'Volume']] # To OHLCV format. 
data = data.rename(columns={'Open':'open', 'High':'high', 'Low':'low', 'Adj Close':'close', 'Volume':'volume'})
data.index.names = ['date']
data['dividend'] = 0
data['split'] = 1


date              open        high         low       close      volume  dividend  split
2020-01-06  293.790009  299.959991  292.750000  299.799988  29596800.0         0      1
2020-01-07  299.839996  300.899994  297.480011  298.390015  27218000.0         0      1
2020-01-08  297.160004  304.440002  297.160004  303.190002  33019800.0         0      1
2020-01-09  307.239990  310.429993  306.200012  309.630005  42527100.0         0      1
2020-01-10  310.600006  312.670013  308.250000  310.329987  35161200.0         0      1


date              open        high         low       close      volume  dividend  split
2014-12-31  112.820000  113.129997  110.209999  101.419060  41403400.0         0      1
2015-01-02  111.389999  111.440002  107.349998  100.454300  53204600.0         0      1
2015-01-05  108.290001  108.650002  105.410004   97.624336  64285500.0         0      1
2015-01-06  106.540001  107.430000  104.629997   97.633545  65797100.0         0      1
2015-01-07  107.199997  108.199997  106.699997   99.002556  40105900.0         0      1
  1. Export it to CSV.
  2. Follow the custom CSV ingestion tutorial in the documentation; the ingest step then fails with an error.
# We’ll then want to specify the start and end sessions of our bundle data:
start_session = pd.Timestamp('2014-12-31', tz='utc')
end_session = pd.Timestamp('2020-01-10', tz='utc')

AAPL_path = os.path.join(os.getcwd(), 'AAPL.csv')  # avoids the invalid '\A' escape in a plain string literal

# And then we can register() our bundle, and pass the location of the directory in which our .csv files exist:
    calendar_name='NYSE', # US equities

C:\Users\Jaepil\Anaconda3\envs\finance35\lib\site-packages\ipykernel_launcher.py:11: UserWarning: Overwriting bundle with name 'custom-csvdir-bundle'
  #This is added back by InteractiveShellApp.init_path()

! zipline ingest -b custom-csvdir-bundle 
# Which is equivalent to: $ zipline ingest -b ... 

Error: No bundle registered with the name 'custom-csvdir-bundle'
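
On the CSV export step above: the csvdir bundle reads its files from a `daily` (or `minute`) subfolder of the directory passed to `csvdir_equities`, one file per symbol, so the path given at registration is a directory and not the CSV file itself. A minimal sketch of that layout, where the helper name `daily_csv_path` is hypothetical and not part of zipline:

```python
import os


def daily_csv_path(csv_root, symbol):
    """Return <csv_root>/daily/<symbol>.csv, creating the folder if needed.

    Hypothetical helper: it only builds the directory layout that the
    csvdir bundle looks for when registered with ['daily'].
    """
    daily_dir = os.path.join(csv_root, 'daily')
    os.makedirs(daily_dir, exist_ok=True)
    return os.path.join(daily_dir, symbol + '.csv')
```

With the DataFrame from the earlier steps, this could be used as `data.to_csv(daily_csv_path(os.getcwd(), 'AAPL'))`, and `os.getcwd()` would then be the directory handed to `csvdir_equities(['daily'], ...)`.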


What steps have you taken to resolve this already?

I can't understand how my custom bundle could be overwritten (which indicates that it already existed) and not be registered at the same time.

I searched existing issues and tried a suggested change to C:\Users\Jaepil\Anaconda3\envs\finance35\Lib\site-packages\zipline\data\bundles\csvdir.py

I changed the line `from . import core as bundles` to `from zipline.data.bundles import core as bundles`; the extension.py doesn't know about the `.` that the quandl bundle is referencing, because the extension code lives in .zipline/.

However, it didn't work. @freddiev4 seems to have made the 'No bundle' issue go away, but I didn't even get that far.


Anything else?

On zipline's official documentation, it says:

Once you have your data in the correct format, you can edit your extension.py file in ~/.zipline/extension.py and import the csvdir bundle, along with pandas.

However, it is extremely unclear how I should edit my extension.py. I really could use some help.
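
A minimal sketch of what such an extension.py could contain, following the csvdir example in the documentation. The `zipline` command runs in its own process and loads `~/.zipline/extension.py` at startup, so a `register()` call made only inside a notebook is not visible to `zipline ingest`. The helper names `render_extension` and `write_extension` are hypothetical conveniences for writing the file from Python, and the bundle name, directory, and dates are placeholders to adapt:

```python
import os

# Body of the extension file. The register() arguments mirror the
# documentation's csvdir example; the values are filled in below.
EXTENSION_TEMPLATE = '''\
import pandas as pd

from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities

register(
    {bundle_name!r},
    csvdir_equities(['daily'], {csv_root!r}),
    calendar_name='NYSE',  # US equities
    start_session=pd.Timestamp({start!r}, tz='utc'),
    end_session=pd.Timestamp({end!r}, tz='utc'),
)
'''


def render_extension(bundle_name, csv_root, start, end):
    """Return the source text of an extension.py that registers one bundle."""
    # !r quotes the values, so Windows backslashes are escaped correctly.
    return EXTENSION_TEMPLATE.format(
        bundle_name=bundle_name, csv_root=csv_root, start=start, end=end)


def write_extension(path, bundle_name, csv_root, start, end):
    """Write the rendered file to `path`, REPLACING any existing content."""
    os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
    with open(path, 'w') as handle:
        handle.write(render_extension(bundle_name, csv_root, start, end))
    return path
```

For instance, `write_extension(os.path.expanduser('~/.zipline/extension.py'), 'custom-csvdir-bundle', os.getcwd(), '2014-12-31', '2020-01-10')` would write the file, after which `zipline ingest -b custom-csvdir-bundle` from a shell should be able to find the bundle. Since the call overwrites the file, anything already in extension.py would need to be merged by hand.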


gpeevans commented 3 years ago

Hi, here are a few comments as I faced a similar issue.

Following the example for loading the custom bundle, first run "zipline ingest" from the CLI. This should download the quantopian-quandl bundle to the zipline root folder. On Windows this was "C:\Users\user\.zipline", and an "extension.py" file appeared in this folder. Update the code below, as advised by the how-to guide, in that file, and from the CLI run "python extension.py" to ingest the CSV data that you prepared.

Note that it is a bit tricky working between Jupyter and the CLI: "zipline bundles" from the CLI will show the registered bundles, whereas "bundle.bundles" called from Jupyter will not show the same bundles as registered, even when working in the same conda env. Stepping through the My First Algorithm part helped clear this up a little; accessing the custom CSV bundle via cell magic worked a treat.
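
One way to see why the two views can differ: a registry filled by `register()` lives in the memory of the Python process that made the call, and a command such as `! zipline ...` in a notebook starts a new process with a fresh registry. The sketch below illustrates this with a toy module; `toy_bundles` and its `REGISTRY` are hypothetical stand-ins, not zipline's code.

```python
import importlib
import os
import subprocess
import sys
import tempfile

# A toy module with an in-memory registry, standing in for a bundle registry.
TOY_MODULE = (
    "REGISTRY = {}\n"
    "\n"
    "def register(name):\n"
    "    REGISTRY[name] = True\n"
)


def registry_seen_by_new_process(module_dir):
    """Import the toy module in a fresh interpreter and print its registry."""
    env = dict(os.environ, PYTHONPATH=module_dir)
    result = subprocess.run(
        [sys.executable, "-c",
         "import toy_bundles; print(sorted(toy_bundles.REGISTRY))"],
        env=env, stdout=subprocess.PIPE, universal_newlines=True, check=True)
    return result.stdout.strip()


def demo():
    """Register in this process, then ask a new process what it sees."""
    with tempfile.TemporaryDirectory() as module_dir:
        with open(os.path.join(module_dir, "toy_bundles.py"), "w") as handle:
            handle.write(TOY_MODULE)
        sys.path.insert(0, module_dir)
        importlib.invalidate_caches()
        try:
            toy = importlib.import_module("toy_bundles")
            toy.register("custom-csvdir-bundle")
            parent_view = sorted(toy.REGISTRY)
        finally:
            sys.path.remove(module_dir)
            sys.modules.pop("toy_bundles", None)
        child_view = registry_seen_by_new_process(module_dir)
    return parent_view, child_view
```

Calling `demo()` returns the registering process's view, which contains the bundle name, and the new process's view, which is empty. By the same logic, a registration kept in `~/.zipline/extension.py` is picked up by the CLI because that file is loaded each time the `zipline` command starts.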


import pandas as pd
from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities
start_session = pd.Timestamp('2012-1-3', tz='utc')
end_session = pd.Timestamp('2014-12-31', tz='utc')
    calendar_name='NYSE', # US equities