nictra23 / Project


Industry Analysis for Multifactor trading strategy #5

Open nictra23 opened 1 year ago

nictra23 commented 1 year ago

After selecting the strategy, we backtested the PBPS quality factor and kept the top 10% of stocks in the selected pool, ranked by factor score (the target stocks). We focus on this group because the later stages of the strategy test its effectiveness on the top-ranked stocks, and these stocks also show the highest returns. The following graph shows how these stocks are distributed across industries.

Image

We then compare the industry breakdown of the full original stock pool with the industry breakdown of the stocks selected by the strategy:

Image

Relative to the full pool, the selected stocks are overweight in raw materials, real estate, energy, consumer discretionary, and utilities, and underweight in the other industries. Raw materials and industrials still make up the largest shares. The strategy realized returns in the subsequent backtest, and the underweights fall on underperforming industries such as finance, consumer staples, and healthcare. To some extent, this suggests that the strategy does select stocks with strong returns.
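
The comparison above can be sketched as a small table of industry shares. This is a minimal illustration rather than the code behind the charts: `industry_tilt` is a hypothetical helper, and the labels are made-up placeholders, not the backtest data.

```python
import pandas as pd

def industry_tilt(pool_industries, selected_industries):
    """Compare industry shares of the selected stocks against the full pool."""
    pool_share = pd.Series(pool_industries).value_counts(normalize=True)
    selected_share = pd.Series(selected_industries).value_counts(normalize=True)
    # Align both share series on the union of industries; missing ones count as 0
    table = pd.DataFrame({"pool": pool_share, "selected": selected_share}).fillna(0.0)
    # Positive tilt: the strategy holds more of this industry than the pool does
    table["tilt"] = table["selected"] - table["pool"]
    return table.sort_values("tilt", ascending=False)

# Made-up industry labels, one per stock, purely for illustration
pool = ["Industrials"] * 3 + ["Materials"] * 3 + ["Financials"] * 4
selected = ["Industrials"] * 2 + ["Materials"] * 2 + ["Financials"] * 1
tilt = industry_tilt(pool, selected)
print(tilt)
```

A positive `tilt` marks an industry the strategy overweights, and a negative one marks an industry it reduces.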

CODE:

# Import necessary libraries

import matplotlib.pyplot as plt
from jqdatasdk import *
import pandas as pd

# Authenticate your account
auth('', '')

# Read the Excel file
df = pd.read_excel(r'C:\Users\nick\OneDrive\文档\position.xlsx')

# Convert the '日期' (date) column to datetime
df['日期'] = pd.to_datetime(df['日期'])

# Extract the stock code in parentheses from the '标的' (security) column as a string
df['stock_code'] = df['标的'].str.extract(r'\((.*?)\)', expand=False).astype(str)

# Keep only the rows dated 2021-08-02
df_filtered = df[df['日期'] == pd.to_datetime('2021-08-02')]

# Get all the stocks
all_stocks = df_filtered['stock_code']

# Create an empty dictionary to store the stocks by industry
stocks_by_industry = {}

# Loop through all the stocks
for stock in all_stocks:
    # Get the industry of the stock
    industry_data = get_industry(stock, date=None)[stock]

    # Check if 'jq_l1' is in the dictionary
    if 'jq_l1' in industry_data:
        industry_name = industry_data['jq_l1']['industry_name']

        # If the industry is not in the dictionary, add it
        if industry_name not in stocks_by_industry:
            stocks_by_industry[industry_name] = []

        # Add the stock to the appropriate industry
        stocks_by_industry[industry_name].append(stock)

# Count the stocks in each industry
stocks_count = [len(stocks) for stocks in stocks_by_industry.values()]

# Create a pie chart
plt.figure(figsize=(10, 10))
plt.pie(stocks_count, labels=list(stocks_by_industry.keys()), autopct='%1.1f%%')

# Add a title
plt.title('Stocks by Industry')

# Display the pie chart
plt.show()
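
The grouping step in the loop above can also be written as a single counting pass. The sketch below assumes only the nested dictionary shape that the loop already reads (`data[stock]['jq_l1']['industry_name']`). `count_by_industry` is a hypothetical helper, and the sample entries are placeholders, not real lookup results.

```python
from collections import Counter

def count_by_industry(industry_data, level="jq_l1"):
    """Count stocks per industry name, skipping stocks with no entry at `level`."""
    counts = Counter()
    for stock, levels in industry_data.items():
        if level in levels:
            counts[levels[level]["industry_name"]] += 1
    return counts

# Placeholder data in the shape the loop above expects
sample = {
    "STOCK_A": {"jq_l1": {"industry_name": "Industrials"}},
    "STOCK_B": {"jq_l1": {"industry_name": "Industrials"}},
    "STOCK_C": {"jq_l1": {"industry_name": "Energy"}},
    "STOCK_D": {},  # no level-1 classification available
}
counts = count_by_industry(sample)

# Labels and sizes ordered from the largest industry down, ready for plt.pie
labels, sizes = zip(*counts.most_common())
```

Ordering the slices by size is a presentation choice. The original loop keeps the order in which industries are first seen.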

nictra23 commented 1 year ago

Analysis of the Original Strategy for Individual Industries:

Methodology: After excluding ST stocks and stocks at their limit-up price, we grouped the stock pool by industry. Within each industry, we ranked the stocks by the original PBPS + quality factor strategy, then backtested and analyzed the ranking in intervals.

Industrial Sector Analysis:

Next, for the backtest by factor-score ranking interval, we used intervals at 10%, 30%, ... up to 90%. For each interval, every stock whose rank fell within it was included in the selection pool for that test. The selection was repeated monthly.
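
As a rough sketch of the interval selection, the function below slices a ranked list by rank fraction. The exact interval boundaries are not spelled out in this post, so the half-open `[lower, upper)` convention, the helper name `select_rank_interval`, and the sample codes are all assumptions. The truncation with `int(...)` mirrors the `int(0.1 * len(...))` used in the strategy code.

```python
def select_rank_interval(ranked_stocks, lower, upper):
    """Return the stocks whose rank falls in [lower, upper), given as fractions of the list."""
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("expected 0 <= lower < upper <= 1")
    n = len(ranked_stocks)
    start = int(lower * n)  # truncates toward zero
    stop = int(upper * n)
    return ranked_stocks[start:stop]

# Hypothetical stock codes, already sorted from best to worst factor score
ranked = [f"S{i:02d}" for i in range(20)]

top_decile = select_rank_interval(ranked, 0.0, 0.1)   # ['S00', 'S01']
next_bucket = select_rank_interval(ranked, 0.1, 0.3)  # the four stocks after the top decile
```

Running each bucket as its own backtest pool and comparing the resulting returns is one way to produce interval curves like the ones discussed here.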

Image

Image

Image

Image

Image

Image

Image

Image

From the profit graph, we can observe that within the industrial sector, returns under the PBPS quality factor strategy fall steadily from the top-ranked interval to the lower-ranked ones. There is some fluctuation, but the overall trend holds, and the top 10% of stocks show a substantial return of up to 27%. Risk is close to that of the benchmark (a difference of 0.01), which indicates that the strategy's stock selection worked within the industrial stock pool. Furthermore, since the industrial sector makes up a large share (30%) of the total stock pool, this indirectly suggests that the strategy's selection factor successfully identified high-return stocks within the industrial sector.

CODE:

  import numpy as np
  import talib
  import pandas
  import scipy as sp
  import scipy.optimize
  import datetime as dt
  from scipy import linalg as sla
  from scipy import spatial
  from jqdata import *
  import datetime  # fun_setCommission calls datetime.datetime(...)
  import smtplib
  from email.mime.text import MIMEText
  from email.header import Header
  import statsmodels.api as sm

  def initialize(context):
      # Use CSI 300 as the benchmark
      set_benchmark('000300.XSHG')
      # Slippage, real prices
      set_slippage(FixedSlippage(0.000))
      set_option('use_real_price', True)

      # Turn off some logs
      log.set_level('order', 'error')
      run_monthly(rebalance,1, time='9:30')

  def after_code_changed(context):
      g.quantlib = quantlib()
      # Define risk exposure
      g.quantlib.fun_set_var(context, 'riskExposure', 0.03)
      # Normal distribution probability table, standard deviation multiples and confidence levels
      g.quantlib.fun_set_var(context, 'confidencelevel', 1.96)
      # Rebalancing parameters
      g.quantlib.fun_set_var(context, 'hold_cycle', 30)
      g.quantlib.fun_set_var(context, 'hold_periods', 0)
      g.quantlib.fun_set_var(context, 'stock_list', [])
      g.quantlib.fun_set_var(context, 'position_price', {})
      g.quantlib.fun_set_var(context, 'version', 1.3)

      if context.version < 1.3:
          context.hold_periods = 0
          context.riskExposure = 0.03
          context.version = 1.3

  def before_trading_start(context):
      # Define stock pool
      moneyfund = []
      fund = []
      # Exclude stocks that have been listed for less than 60 days
      context.moneyfund = g.quantlib.fun_delNewShare(context, moneyfund, 60)
      context.fund = g.quantlib.fun_delNewShare(context, fund, 60)

      # Record pre-market returns
      context.returns = {}
      context.returns['algo_before_returns'] = context.portfolio.returns

  def rebalance(context):
      # Reference libraries
      g.GP = Gross_Profitability_lib()
      g.quantlib = quantlib()
      context.msg = ""

      # Check if rebalancing is needed
      rebalance_flag, context.position_price, context.hold_periods, msg = \
          g.quantlib.fun_needRebalance('GP algo ', context.moneyfund, context.stock_list, context.position_price,
                                       context.hold_periods, context.hold_cycle, 0.25)
      context.msg += msg

      statsDate = context.current_dt.date()
      trade_style = False
      if rebalance_flag:
          stock_list, bad_stock_list = [], []
          GP_stock_list = g.GP.fun_get_stock_list(context, statsDate, bad_stock_list, stock_list)
          stock_list = stock_list + GP_stock_list
          # Allocate positions
          equity_ratio, bonds_ratio = g.quantlib.fun_assetAllocationSystem(stock_list, context.moneyfund, statsDate)

          risk_ratio = 0
          if len(equity_ratio.keys()) >= 1:
              risk_ratio = context.riskExposure / len(equity_ratio.keys())
          # Allocate positions based on pre-set risk exposure
          position_ratio = g.quantlib.fun_calPosition(equity_ratio, bonds_ratio, 1.0, risk_ratio, context.moneyfund,
                                                      context.portfolio.portfolio_value, context.confidencelevel,
                                                      statsDate)
          trade_style = True
          context.stock_list = list(position_ratio.keys())

          # Update desired purchase prices
          context.position_price = g.quantlib.fun_update_positions_price(position_ratio)
          # Sell stocks that are in the portfolio but not in the desired purchase list
          for stock in context.portfolio.positions.keys():
              if stock not in position_ratio:
                  position_ratio[stock] = 0
          context.position_ratio = position_ratio
          print(position_ratio)

      # Rebalance, execute trades
      g.quantlib.fun_do_trade(context, context.position_ratio, context.moneyfund, trade_style)

  class Gross_Profitability_lib():
      def fun_get_stock_list(self, context, statsDate=None, bad_stock_list=[], candidate=[]):
          df = get_fundamentals(
              query(valuation.code, valuation.market_cap, valuation.pe_ratio, valuation.ps_ratio, valuation.pb_ratio)
          )
          all_stocks = df['code'].tolist()
          industry_data = get_industry(all_stocks, date=None) # Assuming get_industry can take a list

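          # Keep only the stocks in the industry under test. The analysis above covers industrials ("工业");
          # swap the name to test another industry, such as finance ("金融")
          industry_stocks = [stock for stock, data in industry_data.items() if 'jq_l1' in data and data['jq_l1']['industry_name'] == '工业']

          # Filter the dataframe down to the stocks in that industry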
          df = df[df['code'].isin(industry_stocks)].reset_index(drop=True)

          df1 = df['code'].tolist()
          positions_list = list(context.portfolio.positions.keys())
          df1 = g.quantlib.unpaused(df1, positions_list)
          df1 = g.quantlib.remove_st(df1, statsDate)

          df1 = g.quantlib.remove_limit_up(df1, positions_list)
          df = df[df.code.isin(df1)]
          df = df.reset_index(drop=True)

          set_PB = {}
          set_PS = {}

          set_PR = {}
          good_stocks = {}

          fPB = df.sort_values(by=['pb_ratio'], ascending=True)
          fPB = fPB.reset_index(drop=True)
          fPB = fPB[fPB.pb_ratio > 0]
          fPB = fPB.reset_index(drop=True)

          sListPB = fPB['code'].tolist()
          for i, v in enumerate(sListPB, 1):
              set_PB[v] = int(i)

          fPS = df.sort_values(by=['ps_ratio'], ascending=True)
          fPS = fPS.reset_index(drop=True)
          sListPS = fPS['code'].tolist()
          for i, v in enumerate(sListPS, 1):
              set_PS[v] = int(i)

          df2 = get_fundamentals(
              query(income.code, income.total_operating_revenue, income.total_operating_cost, balance.total_assets),
              date=statsDate - dt.timedelta(1)
          )

          df2 = df2.fillna(value=0)
          df2 = df2[df2.total_operating_revenue > 0]
          df2 = df2.reset_index(drop=True)
          df2 = df2[df2.total_assets > 0]
          df2 = df2.reset_index(drop=True)

          df2['GP'] = 1.0 * (df2['total_operating_revenue'] - df2['total_operating_cost']) / df2['total_assets']

          df2 = df2.drop(['total_assets', 'total_operating_revenue', 'total_operating_cost'], axis=1)
          df2 = df2.sort_values(by='GP', ascending=False)
          PR = df2['code'].tolist()
          for i, v in enumerate(PR, 1):
              set_PR[v] = int(i)
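          # Keep stocks that are ranked on all three factors and sum their ranks (lower is better)
          for stock in df1:
              ps_value = set_PS.get(stock, -1)
              pb_value = set_PB.get(stock, -1)
              pr_value = set_PR.get(stock, -1)
              if ps_value > 0 and pb_value > 0 and pr_value > 0:
                  good_stocks[stock] = ps_value + pb_value + pr_value

          # Sort by combined rank, best first
          stock_lists = [stock for stock, _ in sorted(good_stocks.items(), key=lambda x: x[1])]

          # Keep the top 10%. An explicit count avoids the negative slice end that
          # int(0.1 * n) - 1 produces when the pool has fewer than 10 stocks
          top_n = int(0.1 * len(stock_lists))
          return stock_lists[:top_n]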

  class quantlib:

      def fun_set_var(self, context, var_name, var_value):
          if var_name not in dir(context):
              setattr(context, var_name, var_value)

      def fun_check_price(self, algo_name, stock_list, position_price, gap_trigger):
          flag = False
          msg = ""
          if stock_list:
              h = history(1, '1d', 'close', stock_list, df=False)
              for stock in stock_list:
                  cur_price = h[stock][0]
                  if stock not in position_price:
                      position_price[stock] = cur_price
                  old_price = position_price[stock]
                  if old_price != 0:
                      delta_price = abs(cur_price - old_price)
                      if delta_price / old_price > gap_trigger:
                          msg = f"{algo_name} needs rebalancing: {stock}, current price: {cur_price} / original price: {old_price}\n"
                          flag = True
                          return flag, position_price, msg
          return flag, position_price, msg

      def fun_needRebalance(self, algo_name, moneyfund, stock_list, position_price, hold_periods, hold_cycle, gap_trigger):
          msg = ""
          msg += algo_name + "next rebalance in " + str(hold_periods) + " days\n"
          rebalance_flag = False

          stocks_count = 0
          for stock in stock_list:
              if stock not in moneyfund:
                  stocks_count += 1
          if stocks_count == 0:
              msg += algo_name + "rebalancing because no stocks are held\n"
              rebalance_flag = True
          elif hold_periods == 0:
              msg += algo_name + "rebalancing because no holding days remain\n"
              rebalance_flag = True
          if not rebalance_flag:
              rebalance_flag, position_price, msg2 = self.fun_check_price(algo_name, stock_list, position_price, gap_trigger)
              msg += msg2

          if rebalance_flag:
              hold_periods = hold_cycle
          else:
              hold_periods -= 1

          return rebalance_flag, position_price, hold_periods, msg

      # Update the recorded prices of held stocks; run once after each rebalance
      def fun_update_positions_price(self, ratio):
          position_price = {}
          if ratio:
              h = history(1, '1m', 'close', list(ratio.keys()), df=False)
              for stock in ratio.keys():
                  if ratio[stock] > 0:
                      position_price[stock] = round(h[stock][0], 3)
          return position_price

      def fun_assetAllocationSystem(self, stock_list, moneyfund, statsDate=None):
          def __fun_getEquity_ratio(__stocklist, limit_up=1.0, limit_low=0.0, statsDate=None):
              __ratio = {}
              # Check if there is any stock in the list
              if __stocklist:
                  # Calculate equal ratio for each stock
                  equal_ratio = 1.0 / len(__stocklist)
                  for stock in __stocklist:
                      __ratio[stock] = equal_ratio

              return __ratio

          equity_ratio = __fun_getEquity_ratio(stock_list, 1.0, 0.0, statsDate)
          bonds_ratio  = __fun_getEquity_ratio(moneyfund, 1.0, 0.0, statsDate)

          return equity_ratio, bonds_ratio

      def fun_calPosition(self, equity_ratio, bonds_ratio, algo_ratio, risk_ratio, moneyfund, portfolio_value, confidencelevel, statsDate=None):
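          '''
          equity_ratio: asset allocation result
          bonds_ratio: bond allocation result
          algo_ratio: share of portfolio value allocated to the strategy
          risk_ratio: risk coefficient borne by each security
          '''
          trade_ratio = equity_ratio  # Example only: simplified, the risk-based sizing is skipped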
          return trade_ratio

      def fun_do_trade(self, context, trade_ratio, moneyfund, trade_style):

          def __fun_tradeStock(context, stock, ratio, trade_style):
              total_value = context.portfolio.portfolio_value
              self.fun_trade(context, stock, ratio*total_value)

          trade_list = trade_ratio.keys()
          myholdstock = context.portfolio.positions.keys()
          stock_list = list(set(trade_list).union(set(myholdstock)))
          total_value = context.portfolio.portfolio_value

          # Current value of each existing position
          holdDict = {}
          h = history(1, '1d', 'close', stock_list, df=False)
          for stock in myholdstock:
              tmpW = np.around(context.portfolio.positions[stock].total_amount * h[stock][0], decimals=2)
              holdDict[stock] = float(tmpW)

          # Sort existing positions by target value minus current value, so the largest reductions come first
          tmpDict = {}
          for stock in holdDict:
              if stock in trade_ratio:
                  tmpDict[stock] = round(trade_ratio[stock] * total_value - holdDict[stock], 2)
          tradeOrder = sorted(tmpDict.items(), key=lambda d: d[1])

          # Trade stocks already held, starting with the reductions, to free up cash
          _tmplist = []
          for stock, _ in tradeOrder:
              __fun_tradeStock(context, stock, trade_ratio[stock], trade_style)
              _tmplist.append(stock)

          # Trade the new stocks
          for stock in trade_list:
              if stock not in _tmplist:
                  __fun_tradeStock(context, stock, trade_ratio[stock], trade_style)

      def unpaused(self, stock_list, positions_list):
          current_data = get_current_data()
          tmpList = []
          for stock in stock_list:
              if not current_data[stock].paused or stock in positions_list:
                  tmpList.append(stock)
          return tmpList

      def remove_st(self, stock_list, statsDate):
          current_data = get_current_data()
          return [s for s in stock_list if not current_data[s].is_st]

      # Remove stocks at their limit-up price (unless already held)
      def remove_limit_up(self, stock_list, positions_list):
          h = history(1, '1m', 'close', stock_list, df=False, skip_paused=False, fq='pre')
          h2 = history(1, '1m', 'high_limit', stock_list, df=False, skip_paused=False, fq='pre')
          tmpList = []
          for stock in stock_list:
              if h[stock][0] < h2[stock][0] or stock in positions_list:
                  tmpList.append(stock)

          return tmpList

      # Remove securities that were listed only recently
      def fun_delNewShare(self, context, equity, deltaday):
          deltaDate = context.current_dt.date() - dt.timedelta(deltaday)

          tmpList = []
          for stock in equity:
              if get_security_info(stock).start_date < deltaDate:
                  tmpList.append(stock)

          return tmpList

      def fun_trade(self, context, stock, value):
          self.fun_setCommission(context, stock)
          order_target_value(stock, value)

      def fun_setCommission(self, context, stock):
          # Set slippage to 0
          set_slippage(FixedSlippage(0))
          # Set trading costs according to the date range
          dt=context.current_dt
          if dt>datetime.datetime(2013,1, 1):
              if stock in context.moneyfund:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0, open_commission=0, close_commission=0, close_today_commission=0, min_commission=0), type='fund')
              else:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.0003, close_commission=0.0013, close_today_commission=0, min_commission=5), type='stock')
          elif dt>datetime.datetime(2011,1, 1):        
              if stock in context.moneyfund:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0, open_commission=0, close_commission=0, close_today_commission=0, min_commission=0), type='fund')
              else:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.001, close_commission=0.002, close_today_commission=0, min_commission=5), type='stock')
          elif dt>datetime.datetime(2009,1, 1):        
              if stock in context.moneyfund:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0, open_commission=0, close_commission=0, close_today_commission=0, min_commission=0), type='fund')
              else:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.002, close_commission=0.003, close_today_commission=0, min_commission=5), type='stock')
          else:        
              if stock in context.moneyfund:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0, open_commission=0, close_commission=0, close_today_commission=0, min_commission=0), type='fund')
              else:
                  set_order_cost(OrderCost(open_tax=0, close_tax=0.001, open_commission=0.003, close_commission=0.004, close_today_commission=0, min_commission=5), type='stock')
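
The composite ranking in `fun_get_stock_list` above (PB rank plus PS rank plus gross-profitability rank, then the top 10%) can be sketched outside the platform with pandas. This is a simplified illustration under stated assumptions, not the strategy's exact logic: it ranks all three factors within one frame, the helper name `composite_rank_top` and the `gp` column are hypothetical, and the data is made up.

```python
import pandas as pd

def composite_rank_top(df, fraction=0.1):
    """Sum the PB, PS and GP ranks per stock and return the codes in the best fraction."""
    # Keep stocks with a positive PB and usable PS and GP values
    valid = df[df["pb_ratio"] > 0].dropna(subset=["ps_ratio", "gp"]).copy()
    valid["score"] = (
        valid["pb_ratio"].rank(method="first")                # cheaper on PB ranks first
        + valid["ps_ratio"].rank(method="first")              # cheaper on PS ranks first
        + valid["gp"].rank(method="first", ascending=False)   # higher profitability ranks first
    )
    ranked = valid.sort_values("score", kind="mergesort")     # stable sort keeps tie order
    top_n = int(fraction * len(ranked))                       # 0 for pools under 10 stocks
    return ranked["code"].tolist()[:top_n]

# Made-up factor values for ten hypothetical stocks
sample = pd.DataFrame({
    "code": [f"S{i}" for i in range(10)],
    "pb_ratio": [float(i + 1) for i in range(10)],
    "ps_ratio": [float(i + 1) for i in range(10)],
    "gp": [float(10 - i) for i in range(10)],
})
top = composite_rank_top(sample)
small_pool = composite_rank_top(sample.head(5))
```

With a five-stock pool the function returns an empty list, which is worth keeping in mind when an industry pool is small.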