sanjeevai / multi-factor-model

Build a statistical risk model using PCA. Optimize the portfolio using the risk model and factors using multiple optimization formulations.

Artificial Intelligence for Trading Nanodegree

Alpha Research and Factor Modelling

Project: Multi-Factor Model

Table of Contents

  1. Project Overview
  2. Data
  3. Statistical Risk Model
  4. Alpha Factors
    1. Momentum 1 Year Factor
    2. Mean Reversion 5 Day Sector Neutral Factor
    3. Mean Reversion 5 Day Sector Neutral Smoothed Factor
    4. Overnight Sentiment Factor
    5. Overnight Sentiment Factor Smoothed
  5. The Combined Alpha Factor
  6. Evaluate Alpha Factors
  7. Optimal Portfolio Constrained by Risk Model
    1. Objective and Constraints
    2. Optimize with a Regularization Parameter
    3. Optimize with Strict Factor Constraints and Target Weighting
  8. Libraries
  9. References


Project Overview

In this project, I will build a statistical risk model using PCA. I'll use this model, along with 5 alpha factors, to build a portfolio. I'll create these factors, then evaluate them using factor-weighted returns, quantile analysis, Sharpe ratio, and turnover analysis. At the end of the project, I'll optimize the portfolio with the risk model and the factors, using multiple optimization formulations.

Data

For the dataset, we'll be using end-of-day data from Quotemedia and sector data from Sharadar.

Udacity doesn't have a license to redistribute the data to us. They are working on alternatives to this problem.

Statistical Risk Model

Portfolio risk is calculated using this formula:

[Image: portfolio risk formula]
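A minimal sketch of a PCA-based risk model, assuming the common factor-model form risk = sqrt(x^T (B F B^T + S) x), where x is the portfolio weights, B the factor betas, F the factor covariance matrix, and S the idiosyncratic variance matrix. The symbols F and S, the helper names `fit_pca_risk_model` and `portfolio_risk`, the 252-day annualization, and the use of NumPy's SVD in place of a PCA library are all assumptions made for illustration, not the project's actual code.

```python
import numpy as np

def fit_pca_risk_model(returns, n_factors, ann_factor=252):
    """returns: (T dates x N assets) array of daily returns (hypothetical input)."""
    demeaned = returns - returns.mean(axis=0)
    # Rows of vt are the principal directions in asset space.
    _, _, vt = np.linalg.svd(demeaned, full_matrices=False)
    B = vt[:n_factors].T                       # (N x K) factor betas
    factor_returns = demeaned @ B              # (T x K) factor returns
    # PCA factor returns are uncorrelated, so only the diagonal is kept.
    F = np.diag(factor_returns.var(axis=0, ddof=1) * ann_factor)
    # Whatever the K factors do not explain is treated as idiosyncratic.
    residuals = demeaned - factor_returns @ B.T
    S = np.diag(residuals.var(axis=0, ddof=1) * ann_factor)
    return B, F, S

def portfolio_risk(x, B, F, S):
    """Predicted annualized risk for weights x under the assumed formula."""
    return float(np.sqrt(x @ (B @ F @ B.T + S) @ x))

# Synthetic returns stand in for the real price data.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(500, 8))
B, F, S = fit_pca_risk_model(returns, n_factors=3)
x = np.full(8, 1 / 8)
risk = portfolio_risk(x, B, F, S)
```

Keeping all eight components instead of three would make the idiosyncratic term vanish and reproduce the annualized sample covariance exactly; using fewer components trades that exactness for a smaller, more stable factor structure.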

Alpha Factors

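After calculating the portfolio risk, the following five alpha factors were created:

  1. Momentum 1 Year Factor
  2. Mean Reversion 5 Day Sector Neutral Factor
  3. Mean Reversion 5 Day Sector Neutral Smoothed Factor
  4. Overnight Sentiment Factor
  5. Overnight Sentiment Factor Smoothed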

Each factor has a hypothesis that goes with it. For the momentum 1 year factor, it is "Higher past 12-month (252 days) returns are proportional to future return". Using that hypothesis, we've generated this code:

from zipline.pipeline.factors import Returns

def momentum_1yr(window_length, universe, sector):
    return Returns(window_length=window_length, mask=universe) \
        .demean(groupby=sector) \
        .rank() \
        .zscore()

I have implemented mean_reversion_5day_sector_neutral using the hypothesis "Short-term outperformers (underperformers) compared to their sector will revert." The function takes the returns for the universe, demeans them within each sector, ranks them, and converts the ranks to z-scores. The result is negated so that recent outperformers receive low scores.

def mean_reversion_5day_sector_neutral(window_length, universe, sector):
    """
    Generate the mean reversion 5 day sector neutral factor

    Parameters
    ----------
    window_length : int
        Returns window length
    universe : Zipline Filter
        Universe of stocks filter
    sector : Zipline Classifier
        Sector classifier

    Returns
    -------
    factor : Zipline Factor
        Mean reversion 5 day sector neutral factor
    """

    return -Returns(window_length=window_length, mask = universe)\
                    .demean(groupby=sector)\
                    .rank()\
                    .zscore()
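To see what the demean, rank, zscore chain does to a single day's cross-section, here is a minimal NumPy sketch outside of Zipline. The helper name `sector_neutral_score`, the sample returns, the ordinal tie-breaking, and the use of the population standard deviation are assumptions for illustration and may differ from Zipline's own behavior.

```python
import numpy as np

def sector_neutral_score(returns, sectors):
    """Demean returns within each sector, rank them, then z-score the ranks."""
    returns = np.asarray(returns, dtype=float)
    sectors = np.asarray(sectors)
    demeaned = returns.copy()
    for s in np.unique(sectors):
        in_sector = sectors == s
        demeaned[in_sector] -= returns[in_sector].mean()
    # Ordinal ranks 1..N; ties are broken by position (an assumption).
    ranks = np.empty(len(demeaned))
    ranks[np.argsort(demeaned, kind="stable")] = np.arange(1, len(demeaned) + 1)
    return (ranks - ranks.mean()) / ranks.std()

# Hypothetical one-day cross-section: two tech stocks, two energy stocks.
returns = [0.05, 0.01, -0.02, 0.03]
sectors = ["tech", "tech", "energy", "energy"]

momentum = sector_neutral_score(returns, sectors)
# The mean reversion factor negates the score, as in the function above.
reversion = -sector_neutral_score(returns, sectors)
```

After demeaning, the third stock is the weakest relative to its sector, so it gets the lowest momentum score and the highest mean reversion score.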

Taking the output of the previous factor, we create a smoothed version. mean_reversion_5day_sector_neutral_smoothed calls mean_reversion_5day_sector_neutral to get the unsmoothed factor, then smooths it with SimpleMovingAverage. Because averaging changes the distribution of the scores, we apply rank and zscore again.

from zipline.pipeline.factors import SimpleMovingAverage

def mean_reversion_5day_sector_neutral_smoothed(window_length, universe, sector):
    """
    Generate the mean reversion 5 day sector neutral smoothed factor

    Parameters
    ----------
    window_length : int
        Returns window length
    universe : Zipline Filter
        Universe of stocks filter
    sector : Zipline Classifier
        Sector classifier

    Returns
    -------
    factor : Zipline Factor
        Mean reversion 5 day sector neutral smoothed factor
    """

    mean_reversion = mean_reversion_5day_sector_neutral(window_length, universe, sector)

    return SimpleMovingAverage(inputs=[mean_reversion], window_length = window_length).rank().zscore()
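The smoothing step can be sketched in plain NumPy as a trailing average over dates followed by a fresh rank and z-score per date. The helper names `rank_zscore` and `smooth_factor` and the random input are hypothetical; this only mirrors the idea, not Zipline's SimpleMovingAverage implementation.

```python
import numpy as np

def rank_zscore(row):
    """Ordinal-rank one date's scores, then standardize the ranks."""
    ranks = np.empty(len(row))
    ranks[np.argsort(row, kind="stable")] = np.arange(1, len(row) + 1)
    return (ranks - ranks.mean()) / ranks.std()

def smooth_factor(scores, window):
    """scores: (T dates x N assets). Returns (T - window + 1) x N smoothed scores."""
    n_out = scores.shape[0] - window + 1
    # Trailing simple moving average over the last `window` dates.
    averaged = np.stack([scores[i:i + window].mean(axis=0) for i in range(n_out)])
    # Averaging breaks the unit-variance property, so rank and z-score again.
    return np.apply_along_axis(rank_zscore, 1, averaged)

rng = np.random.default_rng(1)
raw = np.apply_along_axis(rank_zscore, 1, rng.normal(size=(30, 6)))
smoothed = smooth_factor(raw, window=5)
```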

For this factor, we're using the hypothesis from the paper Overnight Returns and Firm-Specific Investor Sentiment.

import numpy as np
from zipline.pipeline.data import USEquityPricing

class CTO(Returns):
    """
    Computes the overnight return, per hypothesis from
    https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2554010
    """
    inputs = [USEquityPricing.open, USEquityPricing.close]

    def compute(self, today, assets, out, opens, closes):
        """
        The opens and closes matrix is 2 rows x N assets, with the most recent at the bottom.
        As such, opens[-1] is the most recent open, and closes[0] is the earlier close
        """
        out[:] = (opens[-1] - closes[0]) / closes[0]

class TrailingOvernightReturns(Returns):
    """
    Sum of trailing 1m O/N returns
    """
    window_safe = True

    def compute(self, today, asset_ids, out, cto):
        out[:] = np.nansum(cto, axis=0)

def overnight_sentiment(cto_window_length, trail_overnight_returns_window_length, universe):
    cto_out = CTO(mask=universe, window_length=cto_window_length)
    return TrailingOvernightReturns(
         inputs=[cto_out],window_length=trail_overnight_returns_window_length
         )\
         .rank().zscore()
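The arithmetic behind the two classes above can be checked on a small array. This is a sketch with made-up prices: each close-to-open return pairs a day's open with the previous day's close, and the trailing sum skips missing values the way `np.nansum` does.

```python
import numpy as np

# Hypothetical prices: rows are consecutive days, columns are two assets.
opens = np.array([[10.0, 20.0],
                  [10.2, 19.8],
                  [10.1, np.nan],
                  [10.4, 20.5]])
closes = np.array([[10.1, 20.2],
                   [10.0, 20.0],
                   [10.3, 20.1],
                   [10.5, 20.4]])

# Overnight return for day t: (open[t] - close[t-1]) / close[t-1].
cto = (opens[1:] - closes[:-1]) / closes[:-1]

# Trailing sum over the last 3 overnight returns, ignoring NaNs.
trailing = np.nansum(cto[-3:], axis=0)
```

The missing open for the second asset produces a NaN overnight return, which simply drops out of that asset's trailing sum.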

Just like the mean reversion factor, we'll also smooth this factor.

def overnight_sentiment_smoothed(cto_window_length, trail_overnight_returns_window_length, universe):

    unsmoothed_factor = overnight_sentiment(cto_window_length, trail_overnight_returns_window_length, universe)

    return SimpleMovingAverage(
            inputs=[unsmoothed_factor], window_length=trail_overnight_returns_window_length
            ) \
            .rank() \
            .zscore()

Combined Alpha Factor

With all the factor implementations done, let's add them to a zipline pipeline.

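from zipline.pipeline import Pipeline
from zipline.pipeline.factors import AverageDollarVolume

# project_helper, engine, factor_start_date and universe_end_date are defined elsewhere in the project.
universe = AverageDollarVolume(window_length=120).top(500)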
sector = project_helper.Sector()

pipeline = Pipeline(screen=universe)
pipeline.add(
    momentum_1yr(252, universe, sector),
    'Momentum_1YR')
pipeline.add(
    mean_reversion_5day_sector_neutral(5, universe, sector),
    'Mean_Reversion_5Day_Sector_Neutral')
pipeline.add(
    mean_reversion_5day_sector_neutral_smoothed(5, universe, sector),
    'Mean_Reversion_5Day_Sector_Neutral_Smoothed')
pipeline.add(
    overnight_sentiment(2, 5, universe),
    'Overnight_Sentiment')
pipeline.add(
    overnight_sentiment_smoothed(2, 5, universe),
    'Overnight_Sentiment_Smoothed')
all_factors = engine.run_pipeline(pipeline, factor_start_date, universe_end_date)
# all_factors.head()

Evaluate Alpha Factors

Note: We're evaluating the alpha factors using a delay of 1.

Quantile Analysis

Let's view the factor returns over time. They appear to move up and to the right.

[Figure: factor-weighted returns]

It is not enough to look just at the factor-weighted return. A good alpha is also monotonic in quantiles. Let's look at the factor returns by quantile, in basis points.

[Figure: factor returns by quantile]

Observations:

Turnover Analysis

Without doing a full and formal backtest, we can analyze how stable the alphas are over time. Stability in this sense means that from period to period, the alpha ranks do not change much. Since trading is costly, we always prefer, all other things being equal, that the ranks do not change significantly per period. We can measure this with the factor rank autocorrelation (FRA).

[Figure: factor rank autocorrelation]
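As a rough illustration of the FRA idea, the sketch below ranks each date's scores and correlates the ranks with those of the previous date. The function names and the synthetic factors are hypothetical, and ties are broken by position, so this is not necessarily the exact calculation used to produce the plot.

```python
import numpy as np

def ordinal_ranks(row):
    """Ranks 1..N for one date's scores (ties broken by position)."""
    ranks = np.empty(len(row))
    ranks[np.argsort(row, kind="stable")] = np.arange(1, len(row) + 1)
    return ranks

def factor_rank_autocorrelation(scores, lag=1):
    """scores: (T dates x N assets). Returns T - lag rank correlations."""
    ranks = np.apply_along_axis(ordinal_ranks, 1, scores)
    return np.array([
        np.corrcoef(ranks[t - lag], ranks[t])[0, 1]
        for t in range(lag, len(ranks))
    ])

rng = np.random.default_rng(2)
base = rng.normal(size=20)
# A slow-moving factor: the same scores every day plus a little noise.
stable = base + 0.05 * rng.normal(size=(50, 20))
# A fast-moving factor: fresh random scores every day.
noisy = rng.normal(size=(50, 20))

fra_stable = factor_rank_autocorrelation(stable)
fra_noisy = factor_rank_autocorrelation(noisy)
```

An FRA close to 1 means the ranking barely changes from day to day, which implies low turnover; an FRA near 0 means the portfolio would be reshuffled almost completely each period.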

Sharpe Ratio of the Alphas

The last analysis we'll do on the factors is the Sharpe ratio. The function sharpe_ratio calculates the Sharpe ratio of the factor returns.

def sharpe_ratio(factor_returns, annualization_factor):
    """
    Get the sharpe ratio for each factor for the entire period

    Parameters
    ----------
    factor_returns : DataFrame
        Factor returns for each factor and date
    annualization_factor: float
        Annualization Factor

    Returns
    -------
    sharpe_ratio : Pandas Series of floats
        Sharpe ratio
    """

    # Column-wise mean and sample standard deviation, one value per factor
    return annualization_factor * factor_returns.mean() / factor_returns.std(ddof=1)
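A small usage sketch of the same calculation, written against a NumPy array so that it is self-contained. The name `sharpe_ratio_np`, the sample returns, and the choice of sqrt(252) as the annualization factor for daily returns are assumptions for illustration.

```python
import numpy as np

def sharpe_ratio_np(factor_returns, annualization_factor):
    """NumPy stand-in for the DataFrame version; each column is one factor."""
    factor_returns = np.asarray(factor_returns, dtype=float)
    return (annualization_factor * factor_returns.mean(axis=0)
            / factor_returns.std(axis=0, ddof=1))

# Hypothetical daily factor returns: rows are dates, columns are two factors.
daily = np.array([[0.001, -0.002],
                  [0.002,  0.001],
                  [0.000,  0.003],
                  [0.001, -0.001]])

sharpe = sharpe_ratio_np(daily, np.sqrt(252))
```

The first factor earns a steadier return than the second, so it ends up with the higher Sharpe ratio even though both have positive mean returns.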

Let's see what the Sharpe ratios for the factors are. Generally, a Sharpe ratio near 1.0 or higher is acceptable for a single alpha in this universe.

[Figure: Sharpe ratios of the alpha factors]

Observation:

A Sharpe ratio of 1.13 for the momentum factor is good. The autocorrelation plots also show that the FRA for the momentum factor is already stable, so smoothing the momentum factor would not change it significantly.

The Combined Alpha Vector

To use these alphas in a portfolio, we need to combine them somehow so we get a single score per stock. This is an area where machine learning can be very helpful. In this module, however, we will take the simplest approach to combination: averaging the scores from each alpha.
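Averaging the scores amounts to a row-wise mean over the factor columns. A minimal sketch with made-up z-scores for three stocks and three factors:

```python
import numpy as np

# Hypothetical z-scores for one date: rows are stocks, columns are alpha factors.
factor_scores = np.array([[ 1.2,  0.4, -0.1],
                          [-0.3,  0.9,  0.6],
                          [-0.9, -1.3, -0.5]])

# One combined score per stock: the equal-weighted average of its factor scores.
alpha_vector = factor_scores.mean(axis=1)
```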

Optimal Portfolio Constrained by Risk Model

Objective and Constraints

These are the objective and constraints that we will optimize against:

[Image: objective and constraints]

where x is the portfolio weights, B is the factor betas, and r is the portfolio risk.

The first constraint is that the predicted risk be less than some maximum limit. The second and third constraints are on the maximum and minimum portfolio factor exposures. The fourth constraint is the "market neutral" constraint: the sum of the weights must be zero. The fifth constraint is the leverage constraint: the sum of the absolute value of the weights must be less than or equal to 1.0. The last are some minimum and maximum limits on individual holdings.
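The constraints described above can be written as a feasibility check on a candidate weight vector. This is only a sketch: the function name, the limit names (`risk_cap`, `factor_min`, `factor_max`, `weight_min`, `weight_max`), the sample numbers, and the factor-model form used for the risk are assumptions, and a real optimizer would enforce these conditions rather than test them after the fact.

```python
import numpy as np

def check_constraints(x, B, F, S, risk_cap, factor_min, factor_max,
                      weight_min, weight_max, tol=1e-9):
    """Return which of the described constraints hold for weights x."""
    risk = np.sqrt(x @ (B @ F @ B.T + S) @ x)   # assumed risk formula
    exposures = B.T @ x                          # portfolio factor exposures
    return {
        "risk": bool(risk <= risk_cap + tol),
        "factor_max": bool(np.all(exposures <= factor_max + tol)),
        "factor_min": bool(np.all(exposures >= factor_min - tol)),
        "market_neutral": bool(abs(x.sum()) <= tol),
        "leverage": bool(np.abs(x).sum() <= 1.0 + tol),
        "weight_bounds": bool(np.all((x >= weight_min - tol)
                                     & (x <= weight_max + tol))),
    }

# Hypothetical 4-asset, 2-factor risk model.
B = np.array([[1.0, 0.2], [0.9, -0.1], [1.1, 0.3], [0.8, -0.2]])
F = np.diag([0.04, 0.01])
S = np.diag([0.02, 0.02, 0.02, 0.02])
limits = dict(risk_cap=0.1, factor_min=-0.5, factor_max=0.5,
              weight_min=-0.55, weight_max=0.55)

balanced = check_constraints(np.array([0.25, 0.25, -0.25, -0.25]), B, F, S, **limits)
overlevered = check_constraints(np.array([0.8, 0.2, -0.5, -0.5]), B, F, S, **limits)
```

The second portfolio is still market neutral, but its absolute weights sum to 2.0 and one holding exceeds the per-stock limit, so it fails the leverage and weight-bound checks.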

Weights generated after applying those constraints:

[Figure: portfolio holdings by stock]

Yikes. It put most of the weight in a few stocks.

[Figure: portfolio net factor exposures]

Optimize with a Regularization Parameter

This is the weight distribution after applying regularization to the objective function.

[Figure: portfolio holdings by stock, with regularization]

Nice. Well diversified.

[Figure: portfolio net factor exposures, with regularization]
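To illustrate why a regularization term spreads the weights out, the sketch below assumes an objective of the form -alpha^T x + lambda * ||x||_2. The source does not state which penalty was used, so the L2 norm, the function name, and the numbers are assumptions.

```python
import numpy as np

def regularized_objective(x, alpha, lam):
    """Value to minimize: negative alpha exposure plus an L2 penalty on the weights."""
    return float(-alpha @ x + lam * np.linalg.norm(x, 2))

# Hypothetical alpha scores and two market-neutral portfolios with leverage 1.
alpha = np.array([1.0, 0.9, -0.9, -1.0])
concentrated = np.array([0.5, 0.0, 0.0, -0.5])
diversified = np.array([0.25, 0.25, -0.25, -0.25])

no_penalty = (regularized_objective(concentrated, alpha, 0.0),
              regularized_objective(diversified, alpha, 0.0))
with_penalty = (regularized_objective(concentrated, alpha, 0.5),
                regularized_objective(diversified, alpha, 0.5))
```

Without the penalty the concentrated portfolio has the lower objective, because it puts everything into the highest and lowest alpha names. With lambda = 0.5 the diversified portfolio wins, since spreading the same leverage over more stocks gives a smaller L2 norm.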

Optimize with Strict Factor Constraints and Target Weighting

Another common formulation is to take a predefined target weighting (e.g., a quantile portfolio), and solve to get as close to that portfolio as possible while respecting portfolio-level constraints.

[Figure: portfolio holdings by stock, with strict factor constraints]

[Figure: portfolio net factor exposures, with strict factor constraints]
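A target weighting of the quantile-portfolio kind can be sketched as follows: go long the top quantile and short the bottom quantile with equal weights, then measure how far a candidate portfolio is from that target. The function names, the 25% quantile, and the L2 distance are assumptions for illustration; the source does not specify the target or the distance measure it used.

```python
import numpy as np

def quantile_target_weights(alpha, quantile=0.25):
    """Equal-weight long the top quantile and short the bottom quantile."""
    alpha = np.asarray(alpha, dtype=float)
    n_side = max(1, int(len(alpha) * quantile))
    order = np.argsort(alpha, kind="stable")
    x_star = np.zeros(len(alpha))
    x_star[order[-n_side:]] = 0.5 / n_side    # longs sum to +0.5
    x_star[order[:n_side]] = -0.5 / n_side    # shorts sum to -0.5
    return x_star

def distance_to_target(x, x_star):
    """Objective to minimize: L2 distance between weights and the target."""
    return float(np.linalg.norm(x - x_star, 2))

# Hypothetical combined alpha scores for eight stocks.
alpha = np.array([1.4, 0.9, 0.3, 0.1, -0.2, -0.4, -1.0, -1.5])
x_star = quantile_target_weights(alpha)
equal_long_short = np.array([0.125] * 4 + [-0.125] * 4)
gap = distance_to_target(equal_long_short, x_star)
```

The target is market neutral with leverage 1.0 by construction, so an optimizer that minimizes this distance only has to move away from it where the factor exposure or risk constraints require.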

Libraries

This project used Python 3.6.3. The necessary libraries are listed in requirements.txt.

References

  1. Overnight Returns and Firm-Specific Investor Sentiment
  2. The Formation Process of Winners and Losers in Momentum Investing
  3. Expected Skewness and Momentum
  4. Arbitrage Asymmetry and the Idiosyncratic Volatility Puzzle