quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0

ValueError: Failed to find any assets with country_code 'US' (temp. solved) #2517

Open mowoe opened 5 years ago

mowoe commented 5 years ago

Dear Zipline Maintainers,

Before I tell you about my issue, let me describe my environment:

Environment

* Operating System: `Linux 5.1.15-arch1-1-ARCH #1 SMP PREEMPT Tue Jun 25 04:49:39 UTC 2019 x86_64 GNU/Linux`
* Python Version: `2.7`
* Python Bitness: `64`
* How did you install Zipline: `pip`
* Python packages: `appdirs==1.4.3 asn1crypto==0.24.0 backports.functools-lru-cache==1.5 bcolz==0.12.1 Bottleneck==1.2.1 bzr==2.7.0 CacheControl==0.12.5 ceph-detect-init==1.0.1 ceph-disk==1.0.0 ceph-volume==1.0.0 cephfs==2.0.0 certifi==2019.6.16 cffi==1.12.3 chardet==3.0.4 Cheetah==2.4.4 Click==7.0 colorama==0.4.1 contextlib2==0.5.5 cryptography==2.7 cycler==0.10.0 cyordereddict==1.0.0 Cython==0.29.13 decorator==4.4.0 distlib==0.2.9 distro==1.4.0 dlib==19.16.0 empyrical==0.5.0 enum34==1.1.6 face-recognition==1.2.3 face-recognition-models==0.3.0 Flask==1.0.2 Flask-OAuth==0.12 funcsigs==1.0.2 future==0.17.1 gWakeOnLAN==0.6.3 h5py==2.9.0 html5lib==1.0.1 httplib2==0.12.1 idna==2.8 inflection==0.3.1 intervaltree==3.0.2 ipaddress==1.0.22 ipcalc==1.99.0 iso3166==1.0 iso8601==0.1.12 itsdangerous==1.1.0 Jinja2==2.10 kiwisolver==1.1.0 lockfile==0.12.2 Logbook==1.4.3 louis==3.10.0 lru-dict==1.1.6 lxml==4.3.4 Mako==1.0.14 Markdown==3.1.1 MarkupSafe==1.1.0 matplotlib==2.2.4 mock==3.0.5 more-itertools==5.0.0 msgpack==0.6.1 multipledispatch==0.6.0 mysql-connector-python==8.0.15 MySQL-python==1.2.5 ndg-httpsclient==0.5.1 netsnmp-python==1.0a1 networkx==1.11 numarray==1.5.2 numexpr==2.6.9 numpy==1.16.4 oauth2==1.9.0.post1 packaging==19.0 pandas==0.22.0 pandas-datareader==0.7.4 paranoid==1.1.1 patsy==0.5.1 pep517==0.5.0 pexpect==4.7.0 Pillow==5.3.0 progress==1.5 ptyprocess==0.6.0 pyasn1==0.4.6 PyAutoGUI==0.9.42 pycairo==1.18.1 pycparser==2.19 pycryptodome==3.8.2 PyGetWindow==0.0.4 PyGObject==3.32.2 PyMsgBox==1.0.6 PyOpenGL==3.1.0 pyOpenSSL==19.0.0 pyparsing==2.4.0 PyQt4-sip==4.19.16 PyQt5==5.12.3 PyQt5-sip==4.19.17 PyRect==0.1.4 PyScreeze==0.1.20 pyserial==3.4 python-dateutil==2.8.0 python-editor==1.0.4 python-interface==1.5.1 pytoml==0.1.20 PyTweening==1.0.3 pytz==2019.2 pyxdg==0.26 PyYAML==3.13 Quandl==3.4.8 rados==2.0.0 rbd==2.0.0 requests==2.22.0 requests-file==1.4.3 retrying==1.3.3 rgw==2.0.0 scikit-learn==0.20.4 scipy==1.2.1 serial==0.0.70 sip==4.19.17 six==1.12.0 sortedcontainers==2.1.0 SQLAlchemy==1.3.6 statsmodels==0.10.1 subprocess32==3.5.4 tables==3.5.2 team==1.0 toolz==0.10.0 trading-calendars==1.8.1 typing==3.7.4 urllib3==1.25.3 virtualenv==16.1.0 webencodings==0.5.1 Werkzeug==0.14.1 wrapt==1.11.2 wxPython==3.0.2.0 wxPython-common==3.0.2.0 zenmap==7.70 zipline==1.3.0+366.g950f7b28.dirty`

Now that you know a little about me, let me tell you about the issue I am having:

Description of Issue

ValueError: Failed to find any assets with country_code 'US' that traded between 2018-01-02 00:00:00+00:00 and 2018-01-09 00:00:00+00:00.
This probably means that your asset db is old or that it has incorrect country/exchange metadata.

Here is how you can reproduce this issue on your machine:

Reproduction Steps

  1. Ingest Quandl Data.
  2. Run the pipeline example.

What steps have you taken to resolve this already?

The error occurs because the country_code in the asset database is somehow messed up.

  1. This can be resolved by editing the SQLite database (normally under ~/.zipline/data/quandl/TIMESTAMP/assets-n.sqlite):
     1. Go to the 'exchanges' table and change country_code to 'US'. Mine was set to '???'.
     2. Just for the record, I used 'DB Browser for SQLite', which has an easy GUI. The same edit can also be scripted, as in the sketch below.
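
For anyone who prefers not to click through a GUI, here is a minimal Python sketch of the same edit using the standard sqlite3 module (the glob pattern, the assets-n.sqlite filename, and the '???' value come from this thread and may differ on your machine):

import glob
import os
import sqlite3

# Patch country_code in the most recently ingested Quandl asset db.
# The path pattern below is an assumption based on this thread; adjust it
# (and the bad '???' value) to whatever you actually see on disk.
paths = sorted(glob.glob(os.path.expanduser(
    '~/.zipline/data/quandl/*/assets-*.sqlite')))
conn = sqlite3.connect(paths[-1])
conn.execute("UPDATE exchanges SET country_code = 'US' WHERE country_code = '???'")
conn.commit()
conn.close()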

Anything else?

I don't really know if it was necessary to open this issue, as I already resolved it, but it had driven me crazy, so if anyone else runs into this problem, here is a temporary solution.

Sincerely, mowoe

JustinGuese commented 4 years ago

Still the same error for me...

rcardinaux commented 4 years ago

I have also encountered the same issue since I started using Zipline. I manually set the country code to 'US' as suggested above every time I ingest new data, but I'm pretty sure that's not the normal procedure.

Does anyone have a better solution? Could this be worked around by tweaking the ingest function?

rcardinaux commented 4 years ago

If it helps anyone: I solved the issue by feeding the exchanges table directly in the ingest function. It looks like this:

exchange = {'exchange': 'NYSE', 'canonical_name': 'NYSE', 'country_code': 'US'}
exchange_df = pd.DataFrame(exchange, index=[0])
asset_db_writer.write(equities=asset_metadata, exchanges=exchange_df)
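
Note that the 'exchange' value written here has to match the exchange column of the equities metadata (the full script below sets data['exchange'] = 'NYSE' for exactly that reason): the lookup that raises the ValueError joins the equities and exchanges tables on that column, as shown in the _compute_asset_lifetimes excerpt at the end of this thread.
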
chunghoony commented 4 years ago

@rcardinaux can you expand on your solution a bit? Where can I find the ingest function?

rcardinaux commented 4 years ago

@chunghoony : you can create your own ingest function and save it in Zipline's bundles folder. In my environment, the path to that folder is: C:\Users\rapha\anaconda3\envs\py35\Lib\site-packages\zipline\data\bundles

The script where you define the ingest function is ultimately run with the following command: zipline ingest --bundle custom_data
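
Zipline also has to know the bundle name before that command works. Here is a minimal registration sketch, assuming the script below is importable as custom_data and that the NYSE calendar is appropriate (the register call can live in ~/.zipline/extension.py, or in the bundles package's __init__.py if you keep the file inside the zipline bundles folder):

# ~/.zipline/extension.py -- minimal sketch; the bundle name 'custom_data'
# and the NYSE calendar are assumptions matching the rest of this comment.
from zipline.data.bundles import register

from custom_data import ingest

register('custom_data', ingest, calendar_name='NYSE')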

The links below have been useful in writing my script custom_data:
https://0xboz.github.io/blog/how-to-create-custom-zipline-bundles-from-binance-data-part-1/
https://0xboz.github.io/blog/how-to-create-custom-zipline-bundles-from-binance-data-part-2/

You won't be able to use my code as such, since I'm loading data from an SQL database, but here is the content of my custom_data.py file:

import sys
sys.path.insert(0, 'C:/Users/rapha/OneDrive/Dokumente/GitHub/data/python_code')

#PROJECT'S PACKAGES
import pck_param as param
import pck_data_load as data_load
import pck_interface as interface

import pandas as pd
import numpy as np
#from pathlib import Path
from six import iteritems
from logbook import Logger
#from zipfile import ZipFile

show_progress = True
log = Logger(__name__)

#FORMAT THE DATA --> load data from the SQL database and set the correct column names
def load_data_table(show_progress=show_progress):

    db_info, data_source_info = param.parse_credentials()

    sql_conn, sql_cursor = data_load.mysql_connect(
        host=db_info['host'],
        port=db_info['port'],
        user=db_info['user'],
        password=db_info['password'],
        database=db_info['database'])

    data = interface.SQL_to_python_data_loader(sql_conn=sql_conn,
                                               sql_cursor=sql_cursor,
                                               security_symbol_1='IBM',
                                               security_symbol_3='AACG',
                                               perioodicityID='D')

    if show_progress:
        log.info(data.info())
        log.info(data.head())

    return data

#CREATE THE METADATA --> The metadata table provides the “mapping table” for our securities list
def gen_asset_metadata(data, show_progress):
    if show_progress:
        log.info('Generating asset metadata.')
    # groupby/agg yields a column MultiIndex: ('date', 'amin') / ('date', 'amax'),
    # which is why start_date/end_date are read via data.date.amin / data.date.amax
    data = data.groupby(
        by='symbol'
    ).agg(
        {'date': [np.min, np.max]}
    )
    data.reset_index(inplace=True)
    data['start_date'] = data.date.amin
    data['end_date'] = data.date.amax
    #RC: add exchange, matching the exchanges table written in ingest()
    data['exchange'] = 'NYSE'
    del data['date']
    # flatten the column MultiIndex back to single-level names
    data.columns = data.columns.get_level_values(0)

    data['auto_close_date'] = data['end_date'].values
    if show_progress:
        log.info(data.info())
        log.info(data.head())
    return data

#STORE THE ADJUSTMENTS
def parse_splits(data, show_progress):
    if show_progress:
        log.info('Parsing split data.')
    data['split_ratio'] = 1.0 / data.split_ratio
    data.rename(
        columns={
            'split_ratio': 'ratio',
            'date': 'effective_date',
        },
        inplace=True,
        copy=False,
    )
    if show_progress:
        log.info(data.info())
        log.info(data.head())
    return data

def parse_dividends(data, show_progress):
    if show_progress:
        log.info('Parsing dividend data.')
    data['record_date'] = data['declared_date'] = data['pay_date'] = pd.NaT
    data.rename(columns={'date': 'ex_date',
                         'dividends': 'amount'}, inplace=True, copy=False)
    if show_progress:
        log.info(data.info())
        log.info(data.head())
    return data

#WRITE THE DAILY BARS 
def parse_pricing_and_vol(data,
                          sessions,
                          symbol_map):
    for asset_id, symbol in iteritems(symbol_map):
        asset_data = data.xs(
            symbol,
            level=1
        ).reindex(
            sessions.tz_localize(None)
        ).fillna(0.0)
        yield asset_id, asset_data
#--------------------

def ingest(environ,
           asset_db_writer,
           minute_bar_writer,
           daily_bar_writer,
           adjustment_writer,
           calendar,
           start_session,
           end_session,
           cache,
           show_progress,
           output_dir):
    raw_data = load_data_table(show_progress=show_progress)
    #raw_data = load_data_table(path, show_progress=show_progress)
    asset_metadata = gen_asset_metadata(
        raw_data[['symbol', 'date']],
        show_progress
    )

    #RC: write the exchanges table with country_code 'US' to avoid the
    #    "Failed to find any assets with country_code 'US'" error above
    exchange = {'exchange': 'NYSE', 'canonical_name': 'NYSE', 'country_code': 'US'}
    exchange_df = pd.DataFrame(exchange, index=[0])
    asset_db_writer.write(equities=asset_metadata, exchanges=exchange_df)

    #WRITE THE DAILY BARS 
    symbol_map = asset_metadata.symbol
    sessions = calendar.sessions_in_range(start_session, end_session)
    raw_data.set_index(['date', 'symbol'], inplace=True)
    daily_bar_writer.write(
        parse_pricing_and_vol(
            raw_data,
            sessions,
            symbol_map
        ),
        show_progress=show_progress
    )
    #STORE THE ADJUSTMENTS
    raw_data.reset_index(inplace=True)
    raw_data['symbol'] = raw_data['symbol'].astype('category')
    raw_data['sid'] = raw_data.symbol.cat.codes
    adjustment_writer.write(
        splits=parse_splits(
            raw_data[[
                'sid',
                'date',
                'split_ratio',
            ]].loc[raw_data.split_ratio != 1],
            show_progress=show_progress
        ),
        dividends=parse_dividends(
            raw_data[[
                'sid',
                'date',
                'dividends',
            ]].loc[raw_data.dividends != 0],
            show_progress=show_progress
        )
    )
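
After zipline ingest --bundle custom_data finishes, here is a short sketch to confirm the bundle loads cleanly and the assets were written (it assumes the custom_data registration above; if the exchanges metadata were still wrong, country_code-dependent pipeline code would raise the ValueError from this issue):

import os

from zipline.data.bundles import load

# Load the freshly ingested bundle and list its assets.
bundle = load('custom_data', os.environ)
finder = bundle.asset_finder
print(finder.retrieve_all(finder.sids))
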
evrenbingol commented 4 years ago

This worked for me. Make sure you save the database instead of just closing the app (Apply does not permanently save it).

hosammhmahmoud commented 1 year ago

I know I am commenting so late, but this fixed the issue for me. In the zipline/assets/assets.py file there is a function called _compute_asset_lifetimes, which is responsible for making sure the asset has exchanges and names; you can comment out this condition and everything should work fine.

        with self.engine.connect() as conn:
            results = conn.execute(
                sa.select(
                    equities_cols.sid,
                    equities_cols.start_date,
                    equities_cols.end_date,
                    # ).where( # Comment Conditions
                    #     (exchanges_cols.exchange == equities_cols.exchange) #& (condt)
                )
            ).fetchall()
        if results:
            sids, starts, ends = zip(*results)
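
One caveat with this workaround: commenting out the join bypasses the exchange metadata check rather than fixing the metadata itself, so code that genuinely filters by country_code (for example a US-equities pipeline domain) may still misbehave; correcting the country_code in the exchanges table, as described earlier in this thread, fixes the root cause.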