microsoft / qlib

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and RL.
https://qlib.readthedocs.io/en/latest/
MIT License
15.49k stars 2.64k forks

$CHANGE is calculated based on close instead of adj_close in the YahooCollector #1563

Open xu-li opened 1 year ago

xu-li commented 1 year ago

🐛 Bug Description

$CHANGE is calculated based on close instead of adj_close in the YahooCollector

To Reproduce

Steps to reproduce the behavior:

  1. Download the data from Yahoo Finance. This example uses 301303.SZ only because its dataset is small.
  2. Notice the following data in the downloaded csv (removed unnecessary columns for demonstration purpose):

       Date,Close,Adj Close
       2023-06-05,23.120001,22.893938
       2023-06-06,22.500000,22.280001
       2023-06-07,22.240000,22.240000

  3. Normalize the downloaded csv file using data_collector.yahoo.collector.YahooNormalizeCN1d.
  4. Check the normalized data. The change on 2023-06-06 is -0.026816608996539815, which is correct, but the change on 2023-06-07 is -0.011555555555555652, which is wrong: it equals 22.240000/22.500000 - 1, so it was computed from Close rather than Adj Close.

Expected Behavior

The change on 2023-06-07 should be computed from Adj Close: 22.240000/22.280001 - 1 = -0.0017953769391662044.
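
The arithmetic can be checked without Qlib. The following is a minimal sketch in plain Python that parses the three rows quoted in the reproduction steps and computes the day-over-day change from both columns. The helper name `pct_change` and the variable names are illustrative only and are not part of the YahooCollector code.

```python
import csv
import io

# Rows copied from the issue (Yahoo Finance download for 301303.SZ).
RAW = """Date,Close,Adj Close
2023-06-05,23.120001,22.893938
2023-06-06,22.500000,22.280001
2023-06-07,22.240000,22.240000
"""


def pct_change(values):
    """Day-over-day fractional change; the first day has no predecessor."""
    return [None] + [cur / prev - 1 for prev, cur in zip(values, values[1:])]


rows = list(csv.DictReader(io.StringIO(RAW)))
dates = [r["Date"] for r in rows]
close = [float(r["Close"]) for r in rows]
adj_close = [float(r["Adj Close"]) for r in rows]

change_from_close = pct_change(close)    # what the normalized data appears to contain
change_from_adj = pct_change(adj_close)  # what the issue expects

for d, c, a in zip(dates, change_from_close, change_from_adj):
    print(d, c, a)
```

On 2023-06-06 the two columns give nearly the same change because both prices differ by the same adjustment factor. On 2023-06-07 they diverge: the Close-based value is about -0.0116, while the Adj Close-based value is about -0.0018, in line with the figures reported above.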

Environment

Linux x86_64
Python version: 3.8.16
Qlib version: 0.9.1

wenwenzju commented 1 year ago

Did you solve this problem? I have the same question. $change is used in the backtest to determine how much a stock has risen or fallen, and therefore whether it can be traded. Calculating it from the unadjusted $close is unreasonable: a rights issue, split, or similar corporate action inevitably causes a large jump in $close, and hence a large spurious value of $change. The backtest then treats the stock as limit up or limit down even though no such price move occurred.
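
To make the concern concrete, here is a small sketch with hypothetical prices around a 2-for-1 split. The prices, the 9.5% threshold, and the `hits_limit` helper are assumptions chosen for illustration; this is not Qlib's actual backtest logic.

```python
# Hypothetical prices around a 2-for-1 split; not real market data.
LIMIT_THRESHOLD = 0.095  # assumed price-limit threshold, for illustration only


def pct_change(prev, cur):
    """Fractional change from the previous price to the current one."""
    return cur / prev - 1


def hits_limit(change, threshold=LIMIT_THRESHOLD):
    """True when the move reaches the limit in either direction."""
    return abs(change) >= threshold


prev_close, cur_close = 20.0, 10.1  # raw close roughly halves on the ex-date
prev_adj, cur_adj = 10.0, 10.1      # adjusted close removes the split effect

raw_change = pct_change(prev_close, cur_close)
adj_change = pct_change(prev_adj, cur_adj)

print("raw:", raw_change, "flagged as limit:", hits_limit(raw_change))
print("adjusted:", adj_change, "flagged as limit:", hits_limit(adj_change))
```

With these numbers the raw change is close to -50% and would be flagged as limit down, while the adjusted change is about +1% and would not be flagged.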

xu-li commented 1 year ago

No, I didn't.

I use the data from https://github.com/chenditc/investment_data/.