quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

Example script on Windows 7 gives Unicode Error #2036

Closed jengelman closed 6 years ago

jengelman commented 6 years ago

Environment

I installed from pip and successfully ran zipline ingest. I cloned the repo from GitHub, went to the directory, and attempted to run the example script as follows:

zipline run -f zipline/examples/buyapple.py --start 2000-1-1 --end 2014-1-1 -o buyapple_out.pickle

I got the following output:

[2017-12-01 18:02:31.482712] INFO: Loader: Cache at C:\cygwin64\home\jengelman/.zipline\data\SPY_benchmark.csv does not have data from 1990-01-02 00:00:00+00:00 to 2017-11-29 00:00:00+00:00.

[2017-12-01 18:02:31.482712] INFO: Loader: Downloading benchmark data for 'SPY' from 1989-12-29 00:00:00+00:00 to 2017-11-29 00:00:00+00:00
Traceback (most recent call last):
  File "C:\Anaconda3\Scripts\zipline-script.py", line 11, in <module>
    load_entry_point('zipline', 'console_scripts', 'zipline')()
  File "C:\Anaconda3\lib\site-packages\click\core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\click\core.py", line 697, in main
    rv = self.invoke(ctx)
  File "C:\Anaconda3\lib\site-packages\click\core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Anaconda3\lib\site-packages\click\core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Anaconda3\lib\site-packages\click\core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\__main__.py", line 101, in _
    return f(*args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\click\decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\__main__.py", line 255, in run
    environ=os.environ,
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\utils\run_algo.py", line 133, in _run
    env = TradingEnvironment(asset_db_path=connstr, environ=environ)
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\finance\trading.py", line 99, in __init__
    self.bm_symbol,
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\data\loader.py", line 166, in load_market_data
    environ,
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\data\loader.py", line 230, in ensure_benchmark_data
    last_date,
  File "c:\anaconda3\lib\site-packages\zipline-1.1.1+152.g18e4186f-py3.6-win-amd64.egg\zipline\data\benchmarks.py", line 50, in get_benchmark_returns
    last_date
  File "C:\Anaconda3\lib\site-packages\pandas_datareader\data.py", line 137, in DataReader
    session=session).read()
  File "C:\Anaconda3\lib\site-packages\pandas_datareader\base.py", line 181, in read
    params=self._get_params(self.symbols))
  File "C:\Anaconda3\lib\site-packages\pandas_datareader\base.py", line 79, in _read_one_data
    out = self._read_url_as_StringIO(url, params=params)
  File "C:\Anaconda3\lib\site-packages\pandas_datareader\base.py", line 98, in _read_url_as_StringIO
    out.write(bytes_to_str(text))
  File "C:\Anaconda3\lib\site-packages\pandas\compat\__init__.py", line 72, in bytes_to_str
    return b.decode(encoding or 'utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 35301: invalid start byte

I checked the cache, and it has data for the time period mentioned, but running the command again gives the same result. Any help would be appreciated.
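
For context on the last line of the traceback: 0xae has the bit pattern of a UTF-8 continuation byte, so it cannot begin a character, which is why the decoder reports "invalid start byte". The sketch below reproduces the error in isolation. The payload is made up for illustration and is not the actual response from the benchmark data source; try_decode is a hypothetical helper. In Latin-1 the same byte decodes to the registered-trademark sign, which hints (but does not prove) that the downloaded content was not UTF-8 text.

```python
# Hypothetical payload: ASCII text containing a single 0xae byte,
# the byte named in the traceback. Not the real server response.
payload = b"Copyright \xae Example"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


text_utf8, err_utf8 = try_decode(payload, "utf-8")
text_latin1, err_latin1 = try_decode(payload, "latin-1")

# UTF-8 decoding fails at the offset of the 0xae byte.
print(err_utf8)

# Latin-1 maps every byte to a code point, so decoding succeeds.
print(repr(text_latin1))
```

The position reported in the error (35301 or 35199 in this thread) is simply the byte offset of the first undecodable byte in the downloaded content.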

freddiev4 commented 6 years ago

Hi @jengelman thanks for opening this. Can you try running this for a smaller time period? Such as 2017-1-1 to 2017-2-1

Aside from that, I believe this is coming from changes to the Google API which we use pandas-datareader to pull data from. I recently opened #2031 to remove the pandas-datareader and Google API dependency, so once that's merged this issue should go away.

jengelman commented 6 years ago

@freddiev4 Got almost the exact same error message, including the attempted redownload of the entire benchmark. Only difference was 'position 35199' instead of 'position 35301'.

freddiev4 commented 6 years ago

Are you running the latest Zipline master branch? Or at some other commit-ish or version 1.1.1?

freddiev4 commented 6 years ago

FWIW, I just ran:

zipline run -f zipline/examples/buyapple.py \
    --start 2017-1-1 \
    --end 2017-2-1 \
    -o buyapple_out.pickle \
    -b quantopian-quandl

on the latest Zipline master and on Zipline 1.1.1, in both a Py35 and a Py36 environment (running macOS), and didn't see this issue.

jengelman commented 6 years ago

Original issue is from Zipline 1.1.1, installed from pip.

I tried uninstalling with pip, building master using python setup.py install, and rerunning that command, and got the same error. I then tried rerunning zipline ingest and got the following (5 times before it gave up):

[2017-12-01 19:06:46.114107] ERROR: zipline.data.bundles.quandl: Exception raised reading Quandl data. Retrying.
Traceback (most recent call last):
  File "C:\Anaconda3\lib\site-packages\zipline-1.1.1+172.g42b32007-py3.6-win-amd64.egg\zipline\data\bundles\quandl.py", line 89, in fetch_data_table
    format_metadata_url(api_key)
  File "C:\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 562, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 301, in _read
    compression=kwds.get('compression', None))
  File "C:\Anaconda3\lib\site-packages\pandas\io\common.py", line 308, in get_filepath_or_buffer
    req = _urlopen(str(filepath_or_buffer))
  File "C:\Anaconda3\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Anaconda3\lib\urllib\request.py", line 532, in open
    response = meth(req, response)
  File "C:\Anaconda3\lib\urllib\request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Anaconda3\lib\urllib\request.py", line 570, in error
    return self._call_chain(*args)
  File "C:\Anaconda3\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Anaconda3\lib\urllib\request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
[2017-12-01 19:06:46.116107] INFO: zipline.data.bundles.quandl: Downloading WIKI metadata.
[2017-12-01 19:06:46.249107] ERROR: zipline.data.bundles.quandl: Exception raised reading Quandl data. Retrying.

freddiev4 commented 6 years ago

Installing Zipline: Installing zipline via pip is slightly more involved than for the average Python package; the Install Guide has the details.

If you were to use conda, which is the recommended package manager for installing zipline on Windows, you can install the latest zipline master using conda install -c quantopian/label/ci zipline.

Ingesting Data: With regard to zipline ingest, the default data bundle is now the quandl bundle (it was formerly quantopian-quandl; the docs should be updated in the next release). The quandl bundle requires a Quandl API key to ingest properly, as described in the Data Bundles Guide.

You can also run zipline ingest -b quantopian-quandl, which is a mirror of the quandl WIKI dataset that Quantopian provides, with the data in the formats that zipline expects. You don't need to pass the API key arg for that.
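
The choice between the two bundles described above can be sketched in Python. This is only an illustration: the helper name choose_ingest_command is hypothetical, and the QUANDL_API_KEY environment variable name is an assumption that is not stated in this thread, so check it against the Data Bundles Guide. The bundle names quandl and quantopian-quandl are the ones mentioned above.

```python
import os


def choose_ingest_command(environ=None):
    """Build a `zipline ingest` command line depending on whether a key is set.

    ASSUMPTION: the API key is supplied through an environment variable
    named QUANDL_API_KEY; verify this against the Data Bundles Guide.
    """
    environ = os.environ if environ is None else environ
    if environ.get("QUANDL_API_KEY"):
        bundle = "quandl"             # default bundle, needs an API key
    else:
        bundle = "quantopian-quandl"  # Quantopian's mirror, no key needed
    return ["zipline", "ingest", "-b", bundle]


# Only prints the command; it does not run zipline.
print(" ".join(choose_ingest_command()))
```

Without a key, the sketch falls back to the mirror, which matches the advice above and avoids the HTTP 400 retries seen in the earlier log.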

Python Versions: We only build conda packages for Py27 and Py35, and don't yet have support for Py36.

jengelman commented 6 years ago

Following the instructions, I tried installing using conda, and got the following conflict:

$ conda install -c Quantopian zipline

Fetching package metadata ...............
Solving package specifications: .

UnsatisfiableError: The following specifications were found to be in conflict:
  - python 3.6*
  - zipline -> bcolz >=0.12.1,<1 -> numpy 1.10* -> python 2.7*

Does zipline not support python 3.6, or is there a way to change the required version of numpy here?

freddiev4 commented 6 years ago

Good question (edited my previous answer above). We don't yet have conda packages built for Python 3.6; if you create a Python 3.5 conda env it should work.

jengelman commented 6 years ago

I created a py3.5 conda env, successfully installed zipline from conda, and got the original "can't decode byte" error again.

edit: tried running both the jupyter notebook and example script directly from python, and got "No module named zipline" for both, so guess installation wasn't successful, despite being able to call zipline from the command line. might be a path issue somewhere?
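
One way to check for the suspected path issue is to confirm which interpreter is running and whether it can locate the zipline package. The sketch below uses only the standard library; the helper name describe_import is hypothetical. If the reported interpreter is not the one inside the conda env where zipline was installed, the command-line entry point and the Python session are probably using different environments.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report the running interpreter and where (if anywhere) a module resolves."""
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "python_version": "%d.%d" % sys.version_info[:2],
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


for key, value in describe_import("zipline").items():
    print("%s: %s" % (key, value))
```

Running this from both the notebook and a plain python session shows whether they resolve to the same installation.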

freddiev4 commented 6 years ago

Interesting. So if you've installed via conda, in a Py35 env, and the latest master doesn't work for you and zipline 1.1.1 doesn't work, then I'm thinking #2031 should be the fix. However, I'm still able to run the command I pasted earlier without any issues; so it's a bit strange. Will have to do some more inspection.

If you'd like to test out that #2031 branch, there are Development Guidelines on how to get everything set up.

Otherwise, I'm thinking just wait for that PR to get merged. I don't have access to a Windows machine currently but I also don't think it has anything to do with that b/c there's #2029 (which looks like they're using a non-Windows OS).

freddiev4 commented 6 years ago

It's possible there's a path issue, after seeing your edit. Not entirely certain of that.

jengelman commented 6 years ago

Built the #2031 branch and running zipline from the command line worked. Tried importing zipline within python, and got this:

No module named 'zipline.utils.calendars._calendar_helpers'

freddiev4 commented 6 years ago

We recently released a new version of zipline; see the release notes for details. Feel free to update to 1.2.0 with either:

pip install -U zipline

or

conda update zipline -c quantopian

If there are any problems, please reopen this or open a new issue 🙂