Open ZhaoyangLiu-Leo opened 2 years ago
@TowardSun You really got a sharp mind!
When we first designed this strategy, we assumed users know which stocks are tradable (this is possible in most cases).
We then tried to implement another version that strictly prevents the leakage of future data (the tradable info on T + 1 will not leak on T), but the current implementation for only_tradable==False is not perfect.
Your point about the risk degree is right, too.
Would you like to become a Qlib contributor and make the strategy better? Discussions and PRs are welcome :)
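To make the only_tradable distinction concrete, here is a minimal sketch of the two ways a candidate list can be truncated. It is not Qlib code: `first_n_tradable`, `first_n_blind`, the tickers, and the `suspended` set are hypothetical stand-ins for the strategy's helpers and for `trade_exchange.is_stock_tradable`.

```python
from typing import Callable, Iterable, List


def first_n_tradable(ranked: Iterable[str], n: int, is_tradable: Callable[[str], bool]) -> List[str]:
    """Return the first n instruments in ranked order that pass is_tradable."""
    picked: List[str] = []
    if n <= 0:  # guard added in this sketch so that n == 0 returns nothing
        return picked
    for code in ranked:
        if is_tradable(code):
            picked.append(code)
            if len(picked) >= n:
                break
    return picked


def first_n_blind(ranked: Iterable[str], n: int) -> List[str]:
    """Return the first n instruments without looking at tradability."""
    return list(ranked)[: max(n, 0)]


# Hypothetical ranking (best score first) and suspension list.
ranked = ["A", "B", "C", "D", "E"]
suspended = {"B"}

aware = first_n_tradable(ranked, 3, lambda c: c not in suspended)
blind = first_n_blind(ranked, 3)
# Orders that would survive a tradability check at execution time.
executable_blind = [c for c in blind if c not in suspended]

print(aware)             # selection that already skips the suspended stock
print(executable_blind)  # blind selection after the suspended stock is rejected
```

The tradability-aware version fills all three slots, while the blind version loses one slot at execution time. That lost slot is one way the number of holdings can drift away from top k.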
Thanks for your invitation.
Currently, I override the generate_trade_decision function in TopkDropoutStrategy and set only_tradable=True by default. It is not a perfect fix, but for now it guarantees that the number of portfolio instruments equals the top k.
The detailed implementation of generate_trade_decision:
def generate_trade_decision(self, execute_result=None):
    # get the number of trading step finished, trade_step can be [0, 1, 2, ..., trade_len - 1]
    trade_step = self.trade_calendar.get_trade_step()
    trade_start_time, trade_end_time = self.trade_calendar.get_step_time(trade_step)
    pred_start_time, pred_end_time = self.trade_calendar.get_step_time(trade_step, shift=1)
    pred_score = self.signal.get_signal(start_time=pred_start_time, end_time=pred_end_time)
    if pred_score is None:
        return TradeDecisionWO([], self)
    if self.only_tradable:
        # If The strategy only consider tradable stock when make decision
        # It needs following actions to filter stocks
        def get_first_n(l, n, reverse=False):
            cur_n = 0
            res = []
            for si in reversed(l) if reverse else l:
                if self.trade_exchange.is_stock_tradable(
                    stock_id=si, start_time=trade_start_time, end_time=trade_end_time
                ):
                    res.append(si)
                    cur_n += 1
                    if cur_n >= n:
                        break
            return res[::-1] if reverse else res
        def get_last_n(l, n):
            return get_first_n(l, n, reverse=True)
        def filter_stock(l):
            return [
                si
                for si in l
                if self.trade_exchange.is_stock_tradable(
                    stock_id=si, start_time=trade_start_time, end_time=trade_end_time
                )
            ]
    else:
        # Otherwise, the stock will make decision without the stock tradable info
        def get_first_n(l, n):
            return list(l)[:n]
        def get_last_n(l, n):
            return list(l)[-n:]
        def filter_stock(l):
            return l
    current_temp = copy.deepcopy(self.trade_position)
    # generate order list for this adjust date
    sell_order_list = []
    buy_order_list = []
    # load score
    cash = current_temp.get_cash()
    current_stock_list = current_temp.get_stock_list()
    # last position (sorted by score)
    last = pred_score.reindex(current_stock_list).sort_values(ascending=False).index
    # The new stocks today want to buy **at most**
    if self.method_buy == "top":
        today = get_first_n(
            pred_score[~pred_score.index.isin(last)].sort_values(ascending=False).index,
            self.n_drop + self.topk - len(last),
        )
    elif self.method_buy == "random":
        topk_candi = get_first_n(pred_score.sort_values(ascending=False).index, self.topk)
        candi = list(filter(lambda x: x not in last, topk_candi))
        n = self.n_drop + self.topk - len(last)
        try:
            today = np.random.choice(candi, n, replace=False)
        except ValueError:
            today = candi
    else:
        raise NotImplementedError(f"This type of input is not supported")
    # combine(new stocks + last stocks), we will drop stocks from this list
    # In case of dropping higher score stock and buying lower score stock.
    comb = pred_score.reindex(last.union(pd.Index(today))).sort_values(ascending=False).index
    # Get the stock list we really want to sell (After filtering the case that we sell high and buy low)
    if self.method_sell == "bottom":
        sell = last[last.isin(get_last_n(comb, self.n_drop))]
    elif self.method_sell == "random":
        candi = filter_stock(last)
        try:
            sell = pd.Index(np.random.choice(candi, self.n_drop, replace=False) if len(last) else [])
        except ValueError:  # No enough candidates
            sell = candi
    else:
        raise NotImplementedError(f"This type of input is not supported")
    for code in current_stock_list:
        if not self.trade_exchange.is_stock_tradable(
            stock_id=code, start_time=trade_start_time, end_time=trade_end_time
        ):
            continue
        if code in sell:
            # check hold limit
            time_per_step = self.trade_calendar.get_freq()
            if current_temp.get_stock_count(code, bar=time_per_step) < self.hold_thresh:
                continue
            # sell order
            sell_amount = current_temp.get_stock_amount(code=code)
            factor = self.trade_exchange.get_factor(
                stock_id=code, start_time=trade_start_time, end_time=trade_end_time
            )
            # sell_amount = self.trade_exchange.round_amount_by_trade_unit(sell_amount, factor)
            sell_order = Order(
                stock_id=code,
                amount=sell_amount,
                start_time=trade_start_time,
                end_time=trade_end_time,
                direction=Order.SELL,  # 0 for sell, 1 for buy
            )
            # is order executable
            if self.trade_exchange.check_order(sell_order):
                sell_order_list.append(sell_order)
                trade_val, trade_cost, trade_price = self.trade_exchange.deal_order(
                    sell_order, position=current_temp
                )
                # update cash
                cash += trade_val - trade_cost
    # buy new stock
    # note the current has been changed
    # Get the stock list we really want to buy
    buy = today[: len(sell_order_list) + self.topk - len(last)]
    current_stock_list = current_temp.get_stock_list()
    value = cash * self.risk_degree / len(buy) if len(buy) > 0 else 0
    # open_cost should be considered in the real trading environment, while the backtest in evaluate.py does not
    # consider it as the aim of demo is to accomplish same strategy as evaluate.py, so comment out this line
    # value = value / (1+self.trade_exchange.open_cost) # set open_cost limit
    for code in buy:
        # check is stock suspended
        if not self.trade_exchange.is_stock_tradable(
            stock_id=code, start_time=trade_start_time, end_time=trade_end_time
        ):
            continue
        # buy order
        buy_price = self.trade_exchange.get_deal_price(
            stock_id=code, start_time=trade_start_time, end_time=trade_end_time, direction=OrderDir.BUY
        )
        buy_amount = value / buy_price
        factor = self.trade_exchange.get_factor(stock_id=code, start_time=trade_start_time, end_time=trade_end_time)
        buy_amount = self.trade_exchange.round_amount_by_trade_unit(buy_amount, factor)
        buy_order = Order(
            stock_id=code,
            amount=buy_amount,
            start_time=trade_start_time,
            end_time=trade_end_time,
            direction=Order.BUY,  # 1 for buy
        )
        buy_order_list.append(buy_order)
    return TradeDecisionWO(sell_order_list + buy_order_list, self)
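For readers who want to follow the selection step without the exchange and order handling, the "top" buy and "bottom" sell branches above can be approximated with plain pandas. This is only a sketch that ignores tradability, hold limits, and cash. `select_trades` and the sample scores are hypothetical, and the `max(..., 0)` clamps are additions of this sketch.

```python
import pandas as pd


def select_trades(scores: pd.Series, holdings: list, topk: int, n_drop: int):
    """Return (sell, buy) lists for method_buy='top' and method_sell='bottom'."""
    # Current holdings ordered by score, best first.
    last = scores.reindex(holdings).sort_values(ascending=False).index
    # Best-scoring instruments that are not held yet (at most n_drop + topk - len(last)).
    n_new = max(n_drop + topk - len(last), 0)
    new = scores[~scores.index.isin(last)].sort_values(ascending=False).index[:n_new]
    # Rank holdings and candidates together; only the n_drop worst may be sold.
    comb = scores.reindex(last.union(new)).sort_values(ascending=False).index
    worst = comb[-n_drop:] if n_drop > 0 else comb[:0]
    sell = last[last.isin(worst)]
    # Buy only as many names as were sold, plus any shortfall against topk.
    n_buy = max(len(sell) + topk - len(last), 0)
    buy = new[:n_buy]
    return list(sell), list(buy)


# Hypothetical prediction scores for six instruments.
scores = pd.Series({"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6, "E": 0.5, "F": 0.4})

swap = select_trades(scores, holdings=["C", "E", "F"], topk=3, n_drop=1)
hold = select_trades(scores, holdings=["A", "B", "C"], topk=3, n_drop=1)
print(swap)  # worst holding is replaced by the best new candidate
print(hold)  # nothing is sold when every holding outranks the candidates
```

The second call shows why the combined ranking exists: a holding is never dropped in favor of a lower-scored candidate.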
I have checked that the holding count stays at top k by printing the position information from the backtest results:
pos_dicts = dict([(key, value.position) for key, value in positions.items()])
pos_lens = dict([(key, len(value.keys())) for key, value in pos_dicts.items()])
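A slightly more defensive version of this check is sketched below on plain dictionaries. It assumes each daily position dict may also contain bookkeeping entries (taken here to be `cash` and `now_account_value`), which would otherwise inflate the count. The helper name and the sample data are hypothetical.

```python
# Keys assumed (not verified here) to be bookkeeping entries rather than instruments.
NON_INSTRUMENT_KEYS = {"cash", "now_account_value"}


def holding_counts(pos_dicts: dict) -> dict:
    """Map each date to the number of instruments held on that date."""
    return {
        day: sum(1 for key in pos if key not in NON_INSTRUMENT_KEYS)
        for day, pos in pos_dicts.items()
    }


# Hypothetical daily positions, shaped like the pos_dicts built above.
pos_dicts = {
    "2020-01-02": {"cash": 1000.0, "now_account_value": 5000.0,
                   "SH600000": {"amount": 100}, "SH600519": {"amount": 10}},
    "2020-01-03": {"cash": 3000.0, "now_account_value": 5000.0,
                   "SH600000": {"amount": 100}},
}

topk = 2
counts = holding_counts(pos_dicts)
# Days on which the portfolio size differs from topk.
off_target = [day for day, n in counts.items() if n != topk]
print(counts)
print(off_target)
```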
At present, the default dataset provided by qlib does not have the change information.
Therefore, I think the tradable check on the price limit effectively fails.
The best solution may be for users to supply another dataset that includes the change information.
Alternatively, we could update qlib.backtest.exchange.py by inserting the code:
close_column = "$close"
change = self.quote_df[close_column].groupby("instrument").apply(
    lambda price: price / price.shift(1) - 1.0).fillna(0.0)
self.quote_df["$change"] = change
self._update_limit(self.limit_threshold)
after line 210.
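The same calculation can be tried outside Qlib on a small frame. The sketch below uses `groupby(...).shift(1)` instead of `apply`, which keeps the result aligned with the original index. The instruments, prices, and the 0.095 threshold are made up for illustration, and the last two lines only imitate a price-limit flag; they are not Qlib's `_update_limit`.

```python
import pandas as pd

# Hypothetical quote frame indexed by (instrument, datetime), as in the snippet above.
index = pd.MultiIndex.from_product(
    [["SH600000", "SH600519"], pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"])],
    names=["instrument", "datetime"],
)
quote_df = pd.DataFrame({"$close": [10.0, 11.0, 9.9, 100.0, 100.0, 110.0]}, index=index)

close = quote_df["$close"]
# Previous close within each instrument; the first row of each instrument is NaN.
prev_close = close.groupby(level="instrument").shift(1)
quote_df["$change"] = (close / prev_close - 1.0).fillna(0.0)

# Illustrative price-limit flags with an assumed threshold of 9.5%.
limit_threshold = 0.095
quote_df["limit_up"] = quote_df["$change"] >= limit_threshold
quote_df["limit_down"] = quote_df["$change"] <= -limit_threshold

print(quote_df)
```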
I am not sure about the data leakage in the backtest strategy, since we have shifted the prediction score to match the trading days.
pred_start_time, pred_end_time = self.trade_calendar.get_step_time(trade_step, shift=1)
pred_score = self.signal.get_signal(start_time=pred_start_time, end_time=pred_end_time)
If the deal price is the close price, the price-limit tradable check and the instrument ranking list are consistent on the same day.
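A simplified picture of that alignment, assuming one score per trading day and a one-step shift: the decision made on a trade date uses the score from the previous calendar step. This only models the idea with pandas; it does not call Qlib's calendar API, and the dates and scores are invented.

```python
import pandas as pd

# Hypothetical trading calendar and one prediction score per day.
dates = pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"])
scores = pd.Series([0.3, -0.1, 0.5], index=dates, name="score")

# With a shift of one step, each trade date is paired with the previous date's score.
aligned = pd.DataFrame(
    {
        "trade_date": dates[1:],
        "signal_date": dates[:-1],
        "score_used": scores.values[:-1],
    }
)
print(aligned)
```

Under this model the score itself never comes from the trade date, so any remaining leakage question is about the tradable information rather than the score.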
@TowardSun I think your update about the strategy LGTM. Could you send a PR to merge it?
"At present, the default dataset provided by qlib does not have the change information."
Qlib's default dataset does provide the change field; it is stored in paths like ~/.qlib/qlib_data/cn_data/features/sh600519/change.day.bin
Hi @TowardSun
I have a question about the existing implementation. It seems that a sell and a buy can happen on the same day. Actually, cash would be back before market close. i.e. there's no cash to buy new stocks. Do you think this is a problem?
**cash += trade_val - trade_cost**
# buy new stock
# note the current has been changed
# current_stock_list = current_temp.get_stock_list()
**value = cash * self.risk_degree / len(buy) if len(buy) > 0 else 0**
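The effect of that choice on the buy budget can be shown with simple arithmetic. The sketch below compares the per-stock budget when same-day sell proceeds are counted as available cash against the budget when they are not. `per_stock_budget` and all numbers are hypothetical.

```python
def per_stock_budget(cash: float, sell_proceeds: float, n_buy: int,
                     risk_degree: float, reuse_proceeds: bool = True) -> float:
    """Cash allotted to each buy order, optionally counting same-day sell proceeds."""
    if n_buy <= 0:
        return 0.0
    available = cash + sell_proceeds if reuse_proceeds else cash
    return available * risk_degree / n_buy


# Hypothetical state: 1,000 in cash before trading, 4,000 raised by today's sells.
with_proceeds = per_stock_budget(1000.0, 4000.0, n_buy=2, risk_degree=0.95)
without_proceeds = per_stock_budget(1000.0, 4000.0, n_buy=2, risk_degree=0.95,
                                    reuse_proceeds=False)
print(with_proceeds, without_proceeds)
```

If sell proceeds were not usable until the next day, the buys on the rebalancing day would be far smaller than the strategy intends.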
The bug in this strategy seems to be this specific conditional check for holding threshold.
if current_temp.get_stock_count(code, bar=time_per_step) < self.hold_thresh:
    continue
In real life, you would execute sell purely based on scores and not worry about the holding threshold. This would also prune the bottom scores regularly and keep a strict boundary on topK.
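To see how the holding threshold interacts with the buy count used in the strategy (sells executed plus top k minus current holdings), here is a small sketch. `plan_trades`, the tickers, and the holding periods are hypothetical, and the clamp to zero is an addition of this sketch.

```python
def plan_trades(sell_candidates, days_held, hold_thresh, topk, n_held):
    """Return the sells that pass the hold threshold and the resulting number of buys."""
    # A candidate is sold only if it has been held for at least hold_thresh steps.
    sells = [c for c in sell_candidates if days_held.get(c, 0) >= hold_thresh]
    # Buys replace executed sells and fill any shortfall against topk.
    n_buy = max(len(sells) + topk - n_held, 0)
    return sells, n_buy


# Hypothetical portfolio of three names with two sell candidates.
days_held = {"X": 1, "Y": 3}
strict = plan_trades(["X", "Y"], days_held, hold_thresh=2, topk=3, n_held=3)
loose = plan_trades(["X", "Y"], days_held, hold_thresh=1, topk=3, n_held=3)
print(strict)  # X is kept because it has not been held long enough
print(loose)   # both candidates are sold and replaced
```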
Hello, thanks for the great effort for the qlib project.
My issue
I found some weird behavior when using the TopkDropoutStrategy strategy. I expected the number of portfolio instruments on each day to equal the top k number. However, due to the tradable check in the current implementation, the number of instruments in the portfolio changes from day to day.
Possible causes:
The get_first_n and get_last_n functions and the dealing process. Even when we set only_tradable to False, we still check whether the instruments are tradable.