
Quant Quest 2

Official Page for Quant Quest hosted by Auquan.

IMPORTANT

We've made changes to the prediction function. Please read the new documentation.

Trading Problem Overview

This problem requires a mix of statistics and data analysis skills to create a predictive model using financial data. We will provide you with a toolbox and historical data to develop and test your strategy for the competition.

  1. Getting Started
  2. How does the toolbox work?
  3. Available Feature Guide

Quick Startup Guide

Install Python and dependent packages

You need Python 2.7 (Python 3 will be supported later) to run this toolbox. For an easy installation process, we recommend Anaconda since it will reliably install all the necessary dependencies. Download Anaconda and follow the instructions on the installation page. Once you have Python, you can then install the toolbox.

Get the Quant Quest Toolbox

There are multiple ways to install the toolbox for the competition.

The easiest and recommended way is via pip. Run the following command:

    pip install -U auquan_toolbox

If we publish any updates to the toolbox, running the same command again will fetch the new version.

Note: Mac users who face any issues with installation can try 'pip install --user auquan_toolbox' instead.

Download problem1.py

Run the following command to make sure everything is set up properly:

    python problem1.py

Make your changes

Use problem1.py as a template. It contains skeleton functions (with explanations) that you need to fill in to create your own trading strategy. For problem 1, fill in the getFairValue() function. For problem 2, fill in the getClassifierProbability() function.

How does the toolbox work?

Getting Data

The data for the competition is provided here. The toolbox auto-downloads and loads the data for you. You can specify the training dataset you want to load in the getTrainingDataSet() function.

    def getTrainingDataSet(self):
        # Set this to trainingData1 or trainingData2 or trainingData3
        return "sampleData"

You can specify the instruments to load in the getSymbolsToTrade() function. If you return an empty list, the toolbox downloads all the stocks.

    def getSymbolsToTrade(self):
        return []
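
As a minimal sketch of these two hooks side by side, the class below is a hypothetical stand-in for the template class in problem1.py. The class name and the ticker symbols 'AAA' and 'BBB' are made up for illustration; use symbols that actually exist in the dataset you load.

```python
class ExampleTradingFunctions(object):
    """Hypothetical stand-in for the template class in problem1.py."""

    def getTrainingDataSet(self):
        # One of the dataset names mentioned above.
        return "trainingData1"

    def getSymbolsToTrade(self):
        # Made-up symbols; return [] instead to load every stock.
        return ['AAA', 'BBB']


tf = ExampleTradingFunctions()
print(tf.getTrainingDataSet())
print(tf.getSymbolsToTrade())
```

Restricting the symbol list while you experiment should keep the download and each test run smaller than loading all stocks.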

You then need to create features and combine them in the prediction function to generate your predictions.

Features and predictions are explained below. The toolbox also provides extensive functionality and customization. While not required for the competition, you can read more about the toolbox here.

Creating Features

Fill in the features you want to use in the getFeatureConfigDicts() function. Features are called by specifying config dictionaries. Create one dictionary per feature and return them in a list.

Each feature config dictionary has the following keys:

featureId: a string representing the type of feature you want to use
featureKey: {optional} a string representing the key you will use to access the value of this feature. If not present, featureId is used as the key
params: {optional} a dictionary containing any other params needed by the feature

Example: If you only want to use the moving_sum feature, your getFeatureConfigDicts() function should be:

    def getFeatureConfigDicts(self):
        msDict = {'featureKey': 'ms_5',
                  'featureId': 'moving_sum',
                  'params': {'period': 5,
                             'featureName': 'basis'}}
        return [msDict]

You can now use this feature by calling its featureKey, 'ms_5'.
The full list of features with featureId and params is available here.
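
The sketch below illustrates the key rules above with two config dictionaries, one of which omits featureKey. The resolveFeatureKey helper is hypothetical and exists only to show the fallback to featureId; it is not part of the toolbox. The 'basis' featureName is borrowed from the example above, and the period of 10 is an arbitrary choice.

```python
def getFeatureConfigDicts():
    # Explicit featureKey: access this feature as 'ms_5'.
    msDict = {'featureKey': 'ms_5',
              'featureId': 'moving_sum',
              'params': {'period': 5,
                         'featureName': 'basis'}}
    # No featureKey: the featureId doubles as the access key.
    maDict = {'featureId': 'moving_average',
              'params': {'period': 10,
                         'featureName': 'basis'}}
    return [msDict, maDict]


def resolveFeatureKey(configDict):
    """Hypothetical helper: mirror the documented featureKey fallback."""
    return configDict.get('featureKey', configDict['featureId'])


keys = [resolveFeatureKey(d) for d in getFeatureConfigDicts()]
print(keys)
```

Giving each dictionary an explicit featureKey is the safer habit when the same featureId is used more than once with different params, since the keys would otherwise collide.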

Custom Features

To use your own custom features, follow the example of class MyCustomFeature() in problem1.py. Specifically, you'll have to:

  1. Create a new class for the feature and implement your logic in the function computeForInstrument(). You can copy the class from MyCustomFeature(). Example:

    class MyCustomFeatureClassName(Feature):
        @classmethod
        def computeForInstrument(cls, featureParams, featureKey, currentFeatures, instrument, instrumentManager):
            # Replace with your own logic; this placeholder returns a constant
            return 5

  2. Modify the function getCustomFeatures() to return a dictionary mapping an Id to this class (follow the format {'my_custom_feature_identifier': MyCustomFeatureClassName}). Make sure 'my_custom_feature_identifier' doesn't conflict with any of the predefined feature Ids.

    def getCustomFeatures(self):
        return {'my_custom_feature_identifier': MyCustomFeatureClassName}
  3. Create a dict for this feature in getFeatureConfigDicts(). The dict format is:

    customFeatureDict = {'featureKey': 'my_custom_feature_key',
                         'featureId': 'my_custom_feature_identifier',
                         'params': {'param1': 'value1'}}

    You can now use this feature by calling its featureKey, 'my_custom_feature_key'.
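
The three steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the toolbox's implementation: the Feature base class here is an empty stand-in, the feature name and its 'multiplicand' param are made up, and the final lookup only mimics how a featureId might be mapped to its class.

```python
class Feature(object):
    """Hypothetical stand-in for the toolbox's Feature base class."""


# Step 1: the custom feature class.
class ParamTimesTwoFeature(Feature):
    @classmethod
    def computeForInstrument(cls, featureParams, featureKey, currentFeatures,
                             instrument, instrumentManager):
        # Toy logic: reads only its own params. A real feature would
        # normally use the instrument data passed in as well.
        return 2 * featureParams['multiplicand']


# Step 2: register the class under a new feature Id.
def getCustomFeatures():
    return {'param_times_two': ParamTimesTwoFeature}


# Step 3: the config dict that refers to that Id.
customFeatureDict = {'featureKey': 'my_doubled_value',
                     'featureId': 'param_times_two',
                     'params': {'multiplicand': 21}}

# Assumed dispatch, for illustration only: featureId -> class -> value.
featureCls = getCustomFeatures()[customFeatureDict['featureId']]
value = featureCls.computeForInstrument(customFeatureDict['params'],
                                        customFeatureDict['featureKey'],
                                        None, None, None)
print(value)
```

The featureId in the config dict must match a key returned by getCustomFeatures() exactly; in this sketch a mismatch would raise a KeyError at the lookup.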

Instrument features are calculated per instrument (for example position, fees, moving average of instrument price). The toolbox auto-loops through all instruments to calculate features for you.

IMPORTANT: We've made changes to the prediction function described below. Please make sure to update your file accordingly.

Prediction Function

Combine all the features to create the desired prediction function. For problem 1, fill in the function getFairValue() to return the predicted fair value (the expected average of future values). Here you can access your previously created features by referencing their featureKey. For example, the moving sum feature defined earlier can be accessed as:

    def getFairValue(self, updateNum, time, instrumentManager):
        # holder for all the instrument features
        lookbackInstrumentFeatures = instrumentManager.getLookbackInstrumentFeatures()

        # dataframe for a historical instrument feature (ms_5 in this case). The index holds the timestamps,
        # with at most lookback data points. The columns of this dataframe are the stock symbols/instrumentIds.
        ms5Data = lookbackInstrumentFeatures.getFeatureDf('ms_5')

        # Returns a series with index as all the instrumentIds. This returns the value of the feature at the last
        # time update.
        ms5 = ms5Data.iloc[-1]

        return ms5

Important: Previously, we called lookbackInstrumentFeatures = instrument.getDataDf(), which returned the holder for all instrument features, and then lookbackInstrumentFeatures['ms_5'], which returned a DataFrame for that feature for one stock. Now we first get the holder for all the instrument features with lookbackInstrumentFeatures = instrumentManager.getLookbackInstrumentFeatures(), and then the DataFrame for the feature with lookbackInstrumentFeatures.getFeatureDf('ms_5'), which returns that feature for ALL stocks at the same time. The rest of the code is the same.
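
To make the shape of the returned data concrete, here is a small sketch that runs the same getFairValue() logic against hand-built data. The two Fake classes are hypothetical stand-ins that implement only the two methods named above, and the instrument IDs, timestamps, and values are invented.

```python
import pandas as pd


class FakeLookbackFeatures(object):
    """Hypothetical stand-in for the holder of all instrument features."""

    def __init__(self, featureDfs):
        self._featureDfs = featureDfs  # maps featureKey -> DataFrame

    def getFeatureDf(self, featureKey):
        return self._featureDfs[featureKey]


class FakeInstrumentManager(object):
    """Hypothetical stand-in exposing only the method used below."""

    def __init__(self, lookbackFeatures):
        self._lookbackFeatures = lookbackFeatures

    def getLookbackInstrumentFeatures(self):
        return self._lookbackFeatures


def getFairValue(updateNum, time, instrumentManager):
    lookbackInstrumentFeatures = instrumentManager.getLookbackInstrumentFeatures()
    ms5Data = lookbackInstrumentFeatures.getFeatureDf('ms_5')
    # Last row: one value per instrument, indexed by instrumentId.
    return ms5Data.iloc[-1]


# Rows are timestamps, columns are made-up instrument IDs.
ms5Data = pd.DataFrame(
    {'AAA': [50.0, 51.5, 53.0], 'BBB': [100.0, 99.0, 98.5]},
    index=pd.to_datetime(['2017-01-02 09:30',
                          '2017-01-02 09:31',
                          '2017-01-02 09:32']))

manager = FakeInstrumentManager(FakeLookbackFeatures({'ms_5': ms5Data}))
fairValue = getFairValue(3, ms5Data.index[-1], manager)
print(fairValue)
```

The result is a pandas Series with one entry per instrument, which is why a single call now covers all stocks instead of one stock at a time.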

Output of the prediction function is used by the toolbox to make further trading decisions and evaluate your score.

Available Feature Guide

Features can be called by specifying config dictionaries. Create one dictionary per feature and return them in a dictionary as market features or instrument features.

Each feature config dictionary has the following keys:

featureId: a string representing the type of feature you want to use
featureKey: {optional} a string representing the key you will use to access the value of this feature. If not present, featureId is used as the key
params: {optional} a dictionary containing any other params needed by the feature

Code Snippets for all the features are available here

| Feature ID | Parameters | Description |
| --- | --- | --- |
| moving_average | 'featureName', 'period' | calculate rolling average of featureName over period |
| moving_correlation | 'period', 'series1', 'series2' | calculate rolling correlation of series1 and series2 over period |
| moving_max | 'featureName', 'period' | calculate rolling max of featureName over period |
| moving_min | 'featureName', 'period' | calculate rolling min of featureName over period |
| moving_sdev | 'featureName', 'period' | calculate moving standard deviation of featureName over period |
| moving_sum | 'featureName', 'period' | calculate moving sum of featureName over period |
| exponential_moving_average | 'featureName', 'period' | calculate exp. weighted moving average of featureName with period as half life |
| argmax | 'featureName', 'period' | Returns the index where featureName is maximum over period |
| argmin | 'featureName', 'period' | Returns the index where featureName is minimum over period |
| delay | 'featureName', 'period' | Returns the value of featureName with a delay of period |
| difference | 'featureName', 'period' | Returns the difference of featureName with its value period before |
| rank | 'featureName', 'period' | Ranks last period values of featureName on a scale of 0 to 1 |
| scale | 'featureName', 'period', 'scale' | Rescale last period values of featureName on a scale of 0 to scale |
| ratio | 'featureName', 'instrumentId1', 'instrumentId2' | ratio of feature values of instrumentId1 / instrumentId2 |
| momentum | 'featureName', 'period' | calculate momentum in featureName over period as (featureValue(now) - featureValue(now - period))/featureValue * 100 |
| bollinger_bands | 'featureName', 'period' | DEPRECATED, use bollinger_bands_lower, bollinger_bands_upper as below |
| bollinger_bands_lower | 'featureName', 'period' | lower bollinger bands as average(period) - sdev(period) |
| bollinger_bands_upper | 'featureName', 'period' | upper bollinger bands as average(period) + sdev(period) |
| cross_sectional_momentum | 'featureName', 'period', 'instrumentIds' | Returns Cross-Section Momentum of 'instrumentIds' in featureName over period |
| macd | 'featureName', 'period1', 'period2' | moving average convergence divergence as average(period1) - average(period2) |
| rsi | 'featureName', 'period' | Relative Strength Index - ratio of average profits / average losses over period |
| vwap | - | calculated from book data as (bid price x ask volume + ask price x bid volume) / (ask volume + bid volume) |
| fees | - | fees to trade, always calculated |
| position | - | instrument position, always calculated |
| pnl | - | Profit/Loss, always calculated |
| capital | - | Spare capital not in use, always calculated |
| portfolio_value | - | Total value of trading system, always calculated |
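
For intuition about what a few of these features compute, the sketch below reimplements moving_sum, moving_average, delay, difference, and macd on a plain Python list, following the descriptions in the table. These are illustrative reimplementations, not the toolbox's code. In particular, how the toolbox handles windows with less history than period is not stated here, so the sketch simply assumes enough data points are available.

```python
def moving_sum(values, period):
    """Sum of the most recent `period` values."""
    return sum(values[-period:])


def moving_average(values, period):
    """Average of the most recent `period` values."""
    window = values[-period:]
    return sum(window) / len(window)


def delay(values, period):
    """Value observed `period` updates before the latest one."""
    return values[-1 - period]


def difference(values, period):
    """Latest value minus the value `period` updates earlier."""
    return values[-1] - values[-1 - period]


def macd(values, period1, period2):
    """average(period1) - average(period2), as described in the table."""
    return moving_average(values, period1) - moving_average(values, period2)


# Made-up price history, oldest first.
prices = [10.0, 11.0, 13.0, 12.0, 14.0, 15.0]

print(moving_sum(prices, 5))      # 11 + 13 + 12 + 14 + 15 = 65.0
print(moving_average(prices, 5))  # 65 / 5 = 13.0
print(delay(prices, 2))           # value two updates ago = 12.0
print(difference(prices, 2))      # 15 - 12 = 3.0
print(macd(prices, 2, 5))         # 14.5 - 13.0 = 1.5
```

In the toolbox itself you would request these through config dictionaries rather than computing them by hand; the functions here only show the arithmetic behind the feature Ids.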