alphaville76 / sharadar_db_bundle


strange behavior in long only strategy #18

Closed vaa1234 closed 3 years ago

vaa1234 commented 3 years ago

Hi!

I found strange behavior when testing a simple long-only strategy. The backtest results contain short sales that should not be there.

Here is an example strategy. For simplicity, the universe is limited to a fixed set of stocks.

import pandas as pd
import zipline.api as algo
from zipline.pipeline import Pipeline
from zipline.pipeline.factors import AverageDollarVolume, Returns
from zipline.finance.execution import MarketOrder
from zipline.pipeline.filters import StaticAssets
from sharadar.pipeline.engine import symbols, sid, sids, symbol
from sharadar.util.run_algo import run_algorithm

def initialize(context):
    algo.attach_pipeline(make_pipeline(), 'pipeline')

    algo.schedule_function(
        rebalance,
        algo.date_rules.every_day(),
        algo.time_rules.market_close(minutes=30),
    )

def make_pipeline():

    russell_universe = StaticAssets(symbols(['AAPL', 'AA', 'KKD', 'MON', 'SPY', 'XOM', 'JNJ', 'HD', 'MSFT']))

    filt = AverageDollarVolume(window_length=30, mask=russell_universe) > 10e6

    pipeline = Pipeline(
        columns={
            "1y_returns": Returns(window_length=252),
        },
        screen=filt
    )
    return pipeline

def before_trading_start(context, data):

    factors = algo.pipeline_output('pipeline')

    returns = factors["1y_returns"].sort_values(ascending=False)
    context.winners = returns.index[:3]

def rebalance(context, data):
    algo.record(aapl_price=data.current(symbol('AAPL'), "price"))

    # calculate intraday returns for our winners
    current_prices = data.current(context.winners, "price")
    prior_closes = data.history(context.winners, "close", 2, "1d").iloc[0]
    intraday_returns = (current_prices - prior_closes) / prior_closes

    positions = context.portfolio.positions

    # Exit positions we no longer want to hold
    for asset, position in positions.items():
        if asset not in context.winners:
            algo.order_target_value(asset, 0, style=MarketOrder())

    # Enter long positions
    for asset in context.winners:

        # if already long, nothing to do
        if asset in positions:
            continue

        # if the stock is up for the day, don't enter
        if intraday_returns[asset] > 0:
            continue

        # otherwise, buy a fixed $100K position per asset
        algo.order_target_value(asset, 100e3, style=MarketOrder())

result = run_algorithm(
    start = pd.Timestamp("2014-01-01", tz='utc'),
    end = pd.Timestamp("2019-12-15", tz='utc'),
    initialize=initialize, # Define startup function
    before_trading_start=before_trading_start,
    capital_base=1000000, # Set initial capital
    data_frequency = 'daily', # Set data frequency
)
import pyfolio as pf
rets, positions, transactions = pf.utils.extract_rets_pos_txn_from_zipline(result)
pf.create_round_trip_tear_sheet(rets, positions, transactions)

The round-trips report shows 34 short sales that should not be there.

[Screenshot 2020-12-29 at 17 25 04]

All of these short sales are related to AAPL.

[Screenshot 2020-12-29 at 17 25 15]

At first I thought it was a mistake in the pyfolio report, but then I found out that the problem is in the AAPL splits.

If you load price data that is already adjusted for dividends and splits into sharadar_db_bundle, everything works as it should.

If the sharadar bundle applies the adjustments itself, the problem appears.

Please help me understand this problem: does it come from an incorrect adjustment calculation in sharadar_db_bundle, from errors in the Quandl prices, or from something else?
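One possible mechanism, offered only as an assumption and not confirmed in this thread: zipline applies a split by changing the share count of an open position without recording a transaction, so a tool that rebuilds positions from transactions alone sees more shares sold than bought. The sketch below uses made-up share counts and the 7-for-1 AAPL split of June 2014; `net_shares_from_transactions` is a hypothetical helper, not part of zipline or pyfolio.

```python
def net_shares_from_transactions(amounts):
    """Running share count as seen by a tool that only looks at fills."""
    running = 0
    history = []
    for amount in amounts:
        running += amount
        history.append(running)
    return history


# Illustrative numbers only (not taken from the backtest above).
bought_shares = 100            # shares bought before the split
split_factor = 7               # 7-for-1 split: each share becomes 7 shares
held_after_split = bought_shares * split_factor

# The only recorded fills are the buy and the later sale of the whole
# post-split position; the split itself produces no transaction.
fills = [bought_shares, -held_after_split]

running = net_shares_from_transactions(fills)
print(running)  # the last value is negative, which looks like a short
```

With pre-adjusted prices there is no split event during the backtest, so the bought and sold share counts match and no apparent short shows up.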

Sincerely, Alexander

alphaville76 commented 3 years ago

Hi Alexander,

Thanks for the issue! I could reproduce the problem exactly, but haven't yet had time to debug it.

The sharadar bundle saves the unadjusted price data into the prices.sqlite db and saves dividends and splits separately into adjustments.sqlite (the files are under ~/.zipline/data/sharadar/latest). This is the expected zipline way, and no extra adjustment is made.

The data ingestion happens in https://github.com/alphaville76/sharadar_db_bundle/blob/master/sharadar/loaders/ingest_sharadar.py and here are some entry points in the code to investigate the problem further:
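To check what split data was actually stored, the adjustments database can be queried directly. The sketch below assumes a `splits` table with `sid`, `effective_date` (seconds since the epoch) and `ratio` columns, which is the layout zipline's adjustment writer normally produces; the schema in your own adjustments.sqlite may differ, so verify it first. The demo runs against an in-memory database with a hypothetical sid and one made-up row so that it is self-contained.

```python
import sqlite3
from datetime import datetime, timezone


def load_splits(conn, sid):
    """Return (date, ratio) pairs for one sid from an assumed `splits` table."""
    cursor = conn.execute(
        "SELECT effective_date, ratio FROM splits "
        "WHERE sid = ? ORDER BY effective_date",
        (sid,),
    )
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat(), ratio)
        for ts, ratio in cursor
    ]


# Stand-in for sqlite3.connect(<path to adjustments.sqlite>).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE splits (effective_date INTEGER, ratio REAL, sid INTEGER)"
)
# Hypothetical sid 1 with a 7-for-1 split effective 2014-06-09 (ratio 1/7).
conn.execute("INSERT INTO splits VALUES (?, ?, ?)", (1402272000, 1.0 / 7.0, 1))

rows = load_splits(conn, sid=1)
print(rows)
```

Against the real file, replace the in-memory connection with a connection to adjustments.sqlite and pass the sid of the asset in question.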

https://github.com/alphaville76/sharadar_db_bundle/blob/2dfef0df120b9e68a996c974c4532cdb08d87367/sharadar/loaders/ingest_sharadar.py#L38

https://github.com/alphaville76/sharadar_db_bundle/blob/2dfef0df120b9e68a996c974c4532cdb08d87367/sharadar/loaders/ingest_sharadar.py#L74

https://github.com/alphaville76/sharadar_db_bundle/blob/2dfef0df120b9e68a996c974c4532cdb08d87367/sharadar/loaders/ingest_sharadar.py#L92

https://github.com/alphaville76/sharadar_db_bundle/blob/2dfef0df120b9e68a996c974c4532cdb08d87367/sharadar/loaders/ingest_sharadar.py#L107

alphaville76 commented 3 years ago

I'm sure it's a bug in pyfolio. I printed the portfolio positions every day, and there was never a day with a negative amount. I also set algo.set_long_only(), and it was never triggered.
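The daily check described above can be sketched as follows. This is not the code actually used in the thread: `find_short_positions` is a hypothetical helper, and it assumes each day's positions arrive as a list of dicts with `sid` and `amount` keys, which is how zipline's performance output usually records them. The sample data is made up.

```python
def find_short_positions(daily_positions):
    """Return (day, sid, amount) for every position with a negative amount.

    `daily_positions` is an iterable of (day, positions) pairs, where
    `positions` is a list of dicts with at least 'sid' and 'amount'.
    """
    shorts = []
    for day, positions in daily_positions:
        for position in positions:
            if position["amount"] < 0:
                shorts.append((day, position["sid"], position["amount"]))
    return shorts


# Made-up sample: a long-only book on both days.
sample = [
    ("2014-06-06", [{"sid": "AAPL", "amount": 100}]),
    ("2014-06-09", [{"sid": "AAPL", "amount": 700}]),
]

shorts = find_short_positions(sample)
print(shorts)  # an empty list means no day held a negative amount
```

With a real backtest, something like `find_short_positions(result["positions"].items())` should work if the result has that column in the assumed shape.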