microsoft / qlib

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and RL.
https://qlib.readthedocs.io/en/latest/
MIT License

Questions about 'get_feature_config' function in class HighFreqGeneralHandler #1786

Open caozhenxiang-kouji opened 1 month ago

caozhenxiang-kouji commented 1 month ago

When I run the following command:

python scripts/gen_pickle_data_hsm.py -c scripts/pickle_data_config.yml

and print the result of self.get_feature_config() in class HighFreqGeneralHandler, I get:

[
  'Cut(FFillNan(If(IsNull($open), $close, $open)/DayLast(Ref(FFillNan($close), 480))), 480, None)',
  'Cut(FFillNan(If(IsNull($high), $close, $high)/DayLast(Ref(FFillNan($close), 480))), 480, None)',
  'Cut(FFillNan(If(IsNull($low), $close, $low)/DayLast(Ref(FFillNan($close), 480))), 480, None)',
  'Cut(FFillNan(If(IsNull($close), $close, $close)/DayLast(Ref(FFillNan($close), 480))), 480, None)',
  'Cut(FFillNan(Ref(If(IsNull($open), $close, $open), 240)/DayLast(Ref(FFillNan($close), 240))), 480, None)',
  'Cut(FFillNan(Ref(If(IsNull($high), $close, $high), 240)/DayLast(Ref(FFillNan($close), 240))), 480, None)',
  'Cut(FFillNan(Ref(If(IsNull($low), $close, $low), 240)/DayLast(Ref(FFillNan($close), 240))), 480, None)',
  'Cut(FFillNan(Ref(If(IsNull($close), $close, $close), 240)/DayLast(Ref(FFillNan($close), 240))), 480, None)',
  'Cut(If(IsNull($volume/Ref(DayLast(Mean($volume, 7200)), 240)), 0, $volume/Ref(DayLast(Mean($volume, 7200)), 240)), 480, None)',
  'Cut(If(IsNull(Ref($volume, 240)/Ref(DayLast(Mean($volume, 7200)), 240)), 0, Ref($volume, 240)/Ref(DayLast(Mean($volume, 7200)), 240)), 480, None)'
],
['$open', '$high', '$low', '$close', '$open_1', '$high_1', '$low_1', '$close_1', '$volume', '$volume_1']

I have two questions here.

First, I believe the variable 'day length' in class HighFreqGeneralHandler should be divided by freq. In my case, freq is 5 min, so a day has a length of only 48 bars. The Ref($close, 480) here therefore refers to the close price 10 days ago.

Second, the actual behavior of the function 'get_normalized_price_feature' is inconsistent with its annotations. It seems that today's open, close, high, and low are divided by the close price from 2 days ago, while yesterday's values are divided by yesterday's close price (and then Ref is applied again? I don't know why).

I'm a little confused here and am looking forward to a more detailed explanation.
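The arithmetic behind the first question can be sketched as follows. This is a minimal illustration, not qlib code: the helper names are hypothetical, and the 240-minute trading day is an assumption chosen to match the 240 and 480 offsets in the output above.

```python
# Hypothetical helpers for illustration only; none of this is part of qlib.

# Assumption: one trading day contains 240 minutes of trading.
TRADING_MINUTES_PER_DAY = 240


def bars_per_day(freq_minutes: int,
                 trading_minutes: int = TRADING_MINUTES_PER_DAY) -> int:
    """Number of bars in one trading day for a given bar frequency."""
    if freq_minutes <= 0 or trading_minutes % freq_minutes != 0:
        raise ValueError("freq_minutes must be positive and divide the trading day evenly")
    return trading_minutes // freq_minutes


def offset_in_days(offset_bars: int, freq_minutes: int) -> float:
    """How many trading days a Ref(..., offset_bars) shift spans."""
    return offset_bars / bars_per_day(freq_minutes)


print(bars_per_day(5))         # 48 bars per day at 5-minute frequency
print(offset_in_days(480, 5))  # 10.0 days: what Ref(..., 480) spans on 5-minute bars
print(offset_in_days(480, 1))  # 2.0 days: the same offset on 1-minute bars
```

Under this assumption, an offset that spans two days on 1-minute bars spans ten days on 5-minute bars, so scaling the offset by the bar frequency (480 / 5 = 96 bars) would keep the two-day span.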