microsoft / qlib

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and RL.
https://qlib.readthedocs.io/en/latest/
MIT License
15.26k stars, 2.61k forks

[DDG-DA] Will proxy model use future info? #1774

Closed: ZhongHaoAustin closed this issue 6 months ago

ZhongHaoAustin commented 6 months ago

❓ Questions and Assistance

Thank you for your excellent work on DDG-DA. While reviewing the code, I had a question: does the data dump for the proxy model include future information?

In the file qlib/contrib/rolling/ddgda.py at line 173, in the function _dump_data_for_proxy_model, you normalize feature_selected using df.mean() and df.std(). Both statistics may depend on data that only becomes known in the future.

l0ngc commented 6 months ago

I think this data is retrieved from the train part of the dataset at line 166 of ddgda.py, which is already known, so no future data is used.

The data drawn from the train part is used to train the meta-model, and all of it is historical.

l0ngc commented 6 months ago

By the way, did you run into the NaN loss bug (#1771) when training the model? I'm stuck on it.

ZhongHaoAustin commented 6 months ago

> By the way, did you run into the NaN loss bug (#1771) when training the model? I'm stuck on it.

I've replied in #1771 (https://github.com/microsoft/qlib/issues/1771). Do you have any other questions?

ZhongHaoAustin commented 6 months ago

Apologies for the confusion. In qlib/contrib/rolling/ddgda.py at line 173, each DataFrame corresponds to a single date, so df.mean() and df.std() are computed only over that date's data. No information about future events or data points is included.
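To make the distinction concrete, here is a minimal sketch (not the actual ddgda.py code) comparing per-date normalization with normalization over the whole sample. The helper names `make_features`, `zscore_per_date`, and `zscore_global`, the toy data, and the grouping on a `datetime` index level are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def make_features(n_dates=6, n_inst=4, seed=0):
    """Build a toy (datetime, instrument) feature frame (hypothetical data)."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2020-01-01", periods=n_dates, freq="D")
    index = pd.MultiIndex.from_product(
        [dates, [f"S{i}" for i in range(n_inst)]],
        names=["datetime", "instrument"],
    )
    return pd.DataFrame(
        rng.normal(size=(len(index), 2)), index=index, columns=["f0", "f1"]
    )


def zscore_per_date(features):
    """Normalize each date separately: every df passed to the lambda is one date."""
    return features.groupby(level="datetime", group_keys=False).apply(
        lambda df: (df - df.mean()).div(df.std())
    )


def zscore_global(features):
    """Normalize with statistics of the whole sample, later dates included."""
    return (features - features.mean()).div(features.std())


full = make_features()
dates = full.index.get_level_values("datetime")
cutoff = dates.unique()[2]
past = full[dates <= cutoff]  # what would be known at the cutoff date

# Per-date values for past dates do not change when later dates are appended.
per_date_stable = np.allclose(
    zscore_per_date(full).reindex(past.index).to_numpy(),
    zscore_per_date(past).to_numpy(),
)

# Whole-sample values for past dates do change, which is the look-ahead concern.
global_stable = np.allclose(
    zscore_global(full).reindex(past.index).to_numpy(),
    zscore_global(past).to_numpy(),
)

print("per-date stable:", per_date_stable)
print("global stable:", global_stable)
```

If the normalization is applied per date as described above, the values for any past date are the same whether or not later dates exist in the frame, which is why no future information enters.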