nictra23 / Project

Non-liquidity factor stock selection based on the shortest path of K-line #2

Open nictra23 opened 1 year ago

nictra23 commented 1 year ago

Intuitive Explanation: Investors prefer liquidity, so giving it up has to be compensated through price, which produces a liquidity premium. The more liquid a stock, the more actively it trades: a single transaction moves the price less, the bid-ask spread and transaction costs are smaller, and trades complete faster. Conversely, poor liquidity means higher liquidity risk (the risk that traders pay higher transaction costs because liquidity is lacking), and from a risk-return perspective higher risk should come with higher expected return. If we control for all risks other than liquidity, less liquid stocks should therefore earn higher future returns. We capture this with a non-liquidity (illiquidity) factor, ILLIQ: the higher the ILLIQ, the poorer the liquidity, and vice versa. Here we analyze stocks with high ILLIQ values.

Detailed Description (Calculation): Bid-ask spread, turnover rate, and free-float market cap are the classic liquidity indicators, but they are imperfect because they cannot capture the price concession a trader makes when buying or selling a security with low liquidity. To address this, we link trading volume to price changes. Because illiquidity is hard to measure or observe directly, it is usually expressed indirectly through proxy variables. Traditional methods describe the market impact of trading as the effect of a unit of volume on returns. However, this definition often fails in volatile intraday markets, since a daily return ignores price swings that reverse within the day. We therefore improve it with a non-liquidity factor based on the shortest path of the K-line.
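For comparison, here is a minimal sketch of the traditional measure, assuming it is the Amihud-style ratio of absolute daily return to turnover averaged over the sample. That attribution, the function name `amihud_illiq`, and the toy numbers are assumptions for illustration, not taken from the strategy code.

```python
def amihud_illiq(closes, turnovers):
    """Mean of |daily return| / turnover over the sample.

    closes:    d + 1 consecutive daily closing prices
    turnovers: d daily traded amounts, aligned with closes[1:]
    """
    if len(closes) != len(turnovers) + 1:
        raise ValueError("need exactly one more close than turnover values")
    ratios = []
    for prev, cur, amount in zip(closes, closes[1:], turnovers):
        if amount <= 0:
            continue  # skip days without trading to avoid division by zero
        ratios.append(abs(cur / prev - 1.0) / amount)
    return sum(ratios) / len(ratios) if ratios else float("nan")


# Hypothetical data: the close is unchanged on the first day, up 5% on the second.
flat_close = amihud_illiq([10.0, 10.0], [1_000_000])
moved = amihud_illiq([10.0, 10.5], [1_000_000])
```

A day that swings widely but closes where it opened contributes zero here, which is the weakness the shortest-path version tries to address.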

Image (Formula): [Image Link] - Here the value represents turnover, and d represents the number of measurement days.

The shortest path is defined as: [Image Link]

The calculation formula is: ShortCut = 2*(High-Low)-|Open-Close|
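The formula images did not survive, so the following is only a reconstruction. It assumes the factor averages the shortest path per unit of turnover over d days, which matches what the code further down computes (there per intraday bar, using volume). The subscript t is added here for illustration.

```latex
\mathrm{ShortCut}_t = 2\,(\mathrm{High}_t - \mathrm{Low}_t) - \lvert \mathrm{Open}_t - \mathrm{Close}_t \rvert,
\qquad
\mathrm{ILLIQ} = \frac{1}{d} \sum_{t=1}^{d} \frac{\mathrm{ShortCut}_t}{\mathrm{Value}_t}
```

As a worked example, a bar with Open = 10.0, High = 10.5, Low = 9.8, Close = 10.2 has ShortCut = 2 * 0.7 - 0.2 = 1.2.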

Image

However, daily data alone cannot fully describe the path a stock takes within the day. If a single K-line is decomposed into several higher-frequency K-lines, the sum of their shortest paths is closer to the complete path. Using high-frequency K-lines, such as 15-minute or 5-minute bars, therefore yields higher accuracy.
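A toy illustration of this point, using a hypothetical tick sequence (prices in cents) that is split into two contiguous bars. The helper names and the data are made up for the example.

```python
def shortcut(o, h, l, c):
    """Shortest path of one K-line: 2 * (High - Low) - |Open - Close|."""
    return 2 * (h - l) - abs(o - c)


def bar_from_ticks(ticks):
    """Collapse a tick sequence into (open, high, low, close)."""
    return ticks[0], max(ticks), min(ticks), ticks[-1]


# Hypothetical intraday prices, in cents to keep the arithmetic exact.
ticks = [1000, 1040, 990, 1030, 980, 1020, 1060, 1010]

# Length of the path the price actually travelled.
true_path = sum(abs(b - a) for a, b in zip(ticks, ticks[1:]))

# Shortest path of the single daily bar.
daily = shortcut(*bar_from_ticks(ticks))

# Shortest paths of two half-day bars; the second starts where the first ends.
halves = [ticks[:5], ticks[4:]]
intraday = sum(shortcut(*bar_from_ticks(t)) for t in halves)

print(daily, intraday, true_path)
```

In this example the two half-day bars recover more of the travelled path than the daily bar does, while still falling short of the full tick-level path.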

Code:

import jqdata
import pandas as pd
import numpy as np
import time
import datetime

# Initialization function, set benchmark, etc.
def initialize(context):
    # Set the HS300 index as the benchmark
    set_benchmark('000300.XSHG')
    # Enable dynamic price adjustment (actual price)
    set_option('use_real_price', True)
    log.info('Initialization function starts to run and only runs once globally')
    set_pas()  # 1. Set strategy parameters
    set_variables()  # 2. Set intermediate variables
    # Setup the cost of transactions
    set_order_cost(OrderCost(close_tax=0.001, open_commission=0.0003, close_commission=0.0003, min_commission=5), type='stock')

    # Schedule functions
    run_daily(before_market_open, time='before_open', reference_security='000300.XSHG')
    run_daily(market_open, time='open', reference_security='000300.XSHG')
    run_daily(after_market_close, time='after_close', reference_security='000300.XSHG')

def filterPause(stockList, startDate, endDate):
    """Keep only stocks with no suspended day and no missing data in the period."""
    unsuspendStock = []
    for stock in stockList:
        paused = get_price(stock, start_date=startDate, end_date=endDate, fields=['paused'])['paused']
        # paused == 1 marks a suspended day; NaN means no data for that day
        if not (paused == 1).any() and not paused.isnull().any():
            unsuspendStock.append(stock)
    return unsuspendStock

def set_pas():
    g.tc = 30  # Rebalance every 30 days
    g.num_stocks = 10  # Number of stocks to be selected during each rebalancing
    g.index = '000300.XSHG'  # Stock pool: constituents of the CSI 300 index
    g.stocks = get_index_stocks(g.index)
    g.fre = '10m'  # Intraday data frequency interval
    g.days = 5  # Number of days to take the average

def set_variables():
    g.t = 0  # Track the number of days the backtest has run
    g.if_trade = False  # Whether to trade on the current day

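def calILLIQ(stock, context_dt, fre, days):
    """Sum ShortCut / volume over the intraday bars of the last `days` calendar days, divided by `days`."""
    startdate = context_dt
    enddate = context_dt + datetime.timedelta(hours=6)
    ILLIQ = []
    for _ in range(days):
        z = get_price(stock, frequency=fre, fields=['open', 'close', 'high', 'low', 'volume'], start_date=startdate, end_date=enddate)
        # Drop bars with missing data or no trades to avoid NaN and division by zero
        z = z.dropna()
        z = z[z['volume'] > 0]
        # Shortest path of each bar: 2 * (High - Low) - |Open - Close|
        shortcut = 2 * (z['high'] - z['low']) - (z['open'] - z['close']).abs()
        ILLIQ.extend((shortcut / z['volume']).tolist())
        # Step back one calendar day; non-trading days simply return no bars
        startdate = startdate - datetime.timedelta(days=1)
        enddate = enddate - datetime.timedelta(days=1)
    flagValue = sum(ILLIQ) / days
    return flagValue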

# Function to run before market open
def before_market_open(context):
    if g.t % g.tc == 0:
        g.if_trade = True  # Trade every g.tc days

# Function to run at market open
def market_open(context):
    log.info('Function run time (market_open): ' + str(context.current_dt.time()))
    if g.if_trade:
        pastDate = context.current_dt - datetime.timedelta(days=g.days)
        g.stockPool = filterPause(g.stocks, pastDate, context.current_dt)
        flagValue = {}
        for stock in g.stockPool:
            flagValue[stock] = calILLIQ(stock, context.current_dt, g.fre, g.days)
        temp = sorted(list(flagValue.items()), key=lambda x: x[1], reverse=True)
        a = pd.DataFrame(temp).head(g.num_stocks)
        g.stockList = list(a.iloc[:, 0])
        # Stocks that should be bought according to the strategy
        MS_should_buy = g.stockList
        log.info(MS_should_buy)
        # Calculate the current total assets for capital allocation
        Money = context.portfolio.portfolio_value
        # Calculate market cap weighted weights
        qz = {}
        for stock in MS_should_buy:
            q = query(valuation.market_cap).filter(valuation.code == stock)
            qz[stock] = get_fundamentals(q)['market_cap'][0]
        # Compute total market cap
        totalMarketCap = float(sum(list(qz.values())))

        # Sell stocks that are no longer needed first, to free up cash
        for stock in list(context.portfolio.positions.keys()):
            if stock not in MS_should_buy:
                order_target_value(stock, 0)

        # Buy stocks according to market-cap weights (skipped if total market cap is zero)
        if totalMarketCap != 0:
            for stock in MS_should_buy:
                MonPerStock = qz[stock] / totalMarketCap * Money
                order_target_value(stock, MonPerStock)

    g.if_trade = False

# Function to run after market close
def after_market_close(context):
    g.t += 1  # Count this trading day so that rebalancing happens every g.tc days
    log.info('End of the day')
    log.info('##############################################################')

nictra23 commented 1 year ago

Algorithm Experiment Setup and Results

Worthiness for Reference: This factor deserves research attention, but its effectiveness depends heavily on the time period and the direction of the market, so it needs to be studied together with timing and economic trends. Stock selection returns are highly sensitive to the factor, and it can be combined with other factors in a multi-factor strategy.

Rationale: We ran the experiment from January 3, 2017, to June 1, 2017. We ranked stocks by their ILLIQ (illiquidity) values and measured the returns of the 10 stocks with the highest ILLIQ, the 10 stocks with the lowest ILLIQ, and the top 20%, top 50%, and top 80% of stocks by ILLIQ.
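The grouping can be sketched as follows. This assumes the percentage groups are cumulative slices from the top of the ILLIQ ranking, which the description does not state explicitly. The function name `illiq_groups` and the 20 placeholder stock codes are hypothetical.

```python
def illiq_groups(illiq, n=10, percents=(20, 50, 80)):
    """Split stocks into test groups by ILLIQ rank.

    illiq: dict mapping stock code -> ILLIQ value.
    Returns a dict mapping group name -> list of stock codes.
    """
    ranked = sorted(illiq, key=illiq.get, reverse=True)  # highest ILLIQ first
    groups = {
        "highest_%d" % n: ranked[:n],   # least liquid stocks
        "lowest_%d" % n: ranked[-n:],   # most liquid stocks
    }
    for pct in percents:
        # Assumption: "top pct%" means the first pct% of the ranking.
        groups["top_%dpct" % pct] = ranked[: len(ranked) * pct // 100]
    return groups


# Hypothetical universe of 20 stocks whose ILLIQ equals their index.
toy = {"S%02d" % i: float(i) for i in range(20)}
groups = illiq_groups(toy, n=3)
print(groups["highest_3"], len(groups["top_20pct"]))
```

Each group would then be backtested separately by replacing the `head(g.num_stocks)` selection in `market_open` with the corresponding list.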

Image

Image

Image

Image

Image

Image

We used the ILLIQ index factor for our investment strategy. Though a graph is not provided, the data indicate that higher ILLIQ values (i.e., lower liquidity) resulted in better portfolio returns. Specifically, the portfolio of the 10 stocks with the highest ILLIQ values had a strategy return of 16.1%, outperforming the benchmark by about 10 percentage points.

As ILLIQ values decreased, indicating increased liquidity, portfolio returns gradually diminished. The strategy performed worst for the top 50% of stocks by liquidity, with a return of -37.34%. As we moved towards stocks with even higher liquidity, however, returns began to improve. The portfolio of the 10 most liquid stocks returned 8.84%, outperforming the benchmark return (5.67%) by 3.17 percentage points.

This behavior suggests that the relationship between ILLIQ and returns may be better described by a quadratic (U-shaped) curve than by a straight line. In other words, the strategy performs best when selecting stocks with either the highest or the lowest liquidity. This further confirms the significant impact of this factor on stock selection returns.
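To make the U-shape concrete, the sketch below passes a parabola through the three returns quoted above. The x-positions (0 for the 10 least liquid stocks, 0.5 for the middle group, 1 for the 10 most liquid) are assumptions chosen for illustration. Three points always lie on some parabola, so this shows the shape of the claim rather than testing it.

```python
def parabola_through(p0, p1, p2):
    """Coefficients (a, b, c) of y = a*x**2 + b*x + c through three points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    denom = (x0 - x1) * (x0 - x2) * (x1 - x2)
    a = (x2 * (y1 - y0) + x1 * (y0 - y2) + x0 * (y2 - y1)) / denom
    b = (x2**2 * (y0 - y1) + x1**2 * (y2 - y0) + x0**2 * (y1 - y2)) / denom
    c = (x1 * x2 * (x1 - x2) * y0
         + x2 * x0 * (x2 - x0) * y1
         + x0 * x1 * (x0 - x1) * y2) / denom
    return a, b, c


# Returns in percent from the 2017 test; x-positions are assumed, not measured.
a, b, c = parabola_through((0.0, 16.1), (0.5, -37.34), (1.0, 8.84))
vertex = -b / (2 * a)  # liquidity position with the lowest fitted return
print(a > 0, round(vertex, 2))
```

A positive leading coefficient with the vertex near the middle of the ranking corresponds to the pattern described: both extremes beat the middle group.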

To validate these findings further, we conducted additional tests for the years 2020-2021.

Image

Image

Image

Image

Image

Image

In 2020-2021, the same pattern of returns appeared across the ILLIQ groups. Returns across the groups again followed a roughly quadratic curve, with the pool of the 10 least liquid stocks outperforming all other pools. Its strategy return was 78.84%, surpassing the benchmark return of 27.52% by 51.32 percentage points (78.84 - 27.52). The strategy's drawdown was 25%, just 1.7% higher than that of the pool of the 10 most liquid stocks. With a Sharpe ratio of 1.6, the strategy's returns in this short timeframe were high relative to its risk. Across the time frames observed, the factor has a significant impact on stock selection returns and merits a place in factor research. However, the factor is also highly sensitive to timing, and we study the impact of timing on it next.
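For readers who want to reproduce the risk figures from a daily return series, here is a minimal sketch of the Sharpe ratio and maximum drawdown. The backtest platform's exact conventions are not given above, so the risk-free rate, the 250-day annualization, and the function names are assumptions.

```python
import math
import statistics


def sharpe_ratio(daily_returns, risk_free_annual=0.0, periods=250):
    """Annualized Sharpe ratio from daily simple returns (assumed conventions)."""
    excess = [r - risk_free_annual / periods for r in daily_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded net value, as a fraction."""
    nav, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        nav *= 1.0 + r
        peak = max(peak, nav)
        worst = max(worst, 1.0 - nav / peak)
    return worst


# Hypothetical return series, for illustration only.
mdd = max_drawdown([0.10, -0.20, 0.05])
sharpe = sharpe_ratio([0.01, 0.02, 0.03])
```

Different choices of risk-free rate or annualization factor shift the Sharpe ratio, so values computed this way may not match the platform's reported numbers exactly.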

The principle of the non-liquidity factor based on the shortest path of the K-line is to replace the original non-liquidity factor with the shortest path of the K-line.

Image

The variable R in this model has changed over time. Before the pandemic, stocks with larger ILLIQ coefficients (poorer liquidity) did earn relatively higher returns under the same risk conditions. As shown in the graph (for the year 2017), our strategy yield came to 16%, 9.87% higher than the benchmark yield of 5.6%. The Sharpe ratio reached 2.7, well above 1, indicating a high return per unit of risk. Although the strategy's volatility was 0.07 higher than the benchmark's, the strategy still performed exceptionally well.

Image

In the following years, the COVID-19 pandemic swept across the globe. Major economies were hit hard, and a global recession shook financial markets. Systemic risk factors beyond anyone's control, such as the pandemic, changed how the existing liquidity indicators behaved. For example, stocks that we had previously classed as highly liquid (assuming all other factors stayed constant) became extremely illiquid under this new influence. As liquidity across the entire market fell off a cliff, our original selection strategy deviated significantly from expectations.

Under such conditions our risk theory is hard to sustain, and the strategy no longer behaves as we initially expected. The graph below covers 2021 to 2022, the final period of the pandemic.

Image

Our strategy returned -21.17%, close to the benchmark return of -23%. The strategy held the 10 stocks with the lowest liquidity, while the benchmark was the more liquid CSI 300 index, so under our hypothesis the two returns should differ significantly. During the pandemic, however, they differed little, and the direction and magnitude of their fluctuations were very similar. This suggests that liquidity was very poor for all stocks during this period, which made the strategy hard to invest in.

Next, we consider the post-pandemic economy, using the most recent data, for 2023. The strategy, which had outperformed the benchmark both before and during the pandemic, now has a return of -3.87%, with a total return of -1.78%. These data imply that stocks with a high ILLIQ coefficient (low liquidity) now earn lower returns than highly liquid stocks, which is completely contrary to our original hypothesis.

Image

Based on this analysis along the timeline, I believe the strategy no longer applies to the post-pandemic market. Market liquidity is still going through a painful recessionary period, and the strategy can only bring stable returns once the economy returns to its original healthy state.