microsoft / qlib

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and RL.
https://qlib.readthedocs.io/en/latest/
MIT License
14.54k stars 2.53k forks

Qlib stores per-stock feature field data on the client as binary files written with NumPy. Why not use HDF5 to handle the data? #1807

Open mrvegazhou opened 2 weeks ago

mrvegazhou commented 2 weeks ago

❓ Questions and Help

I would really like to know why the Qlib engineers chose to store data in the bin format rather than HDF5.

jimrok commented 1 week ago

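Because the NumPy data structure is simple: it is just a serialized sequence of floats with a fixed element size, so for any given day you can compute the byte offset and read the data directly. In theory, HDF5 or any other library cannot be faster than that. The downside is that this structure is very hard to maintain. If the underlying data contains even small errors, they are difficult to fix, and the project does not provide a complete repair tool. Once you understand the source code, though, you can repair the files yourself.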
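
To illustrate the offset arithmetic described above, here is a minimal sketch of a fixed-width float file. It assumes a headerless file holding one little-endian float32 per trading day, which is a simplification for illustration and not necessarily Qlib's exact on-disk layout. The function names `write_feature`, `read_range`, `patch_day` and the file name `close.day.bin` are hypothetical.

```python
import os
import tempfile

import numpy as np

# Assumption: every value is a little-endian float32, so 4 bytes per day.
ITEM_SIZE = np.dtype("<f4").itemsize


def write_feature(path, values):
    """Serialize one value per trading day as raw float32, with no header."""
    np.asarray(values, dtype="<f4").tofile(path)


def read_range(path, start_day, end_day):
    """Read days [start_day, end_day) by jumping straight to the byte offset."""
    return np.fromfile(
        path,
        dtype="<f4",
        count=end_day - start_day,
        offset=start_day * ITEM_SIZE,
    )


def patch_day(path, day, value):
    """Overwrite a single day's value in place, e.g. to repair a bad record."""
    with open(path, "r+b") as f:
        f.seek(day * ITEM_SIZE)
        f.write(np.array([value], dtype="<f4").tobytes())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "close.day.bin")  # hypothetical file name
        write_feature(path, [10.0, 10.5, 11.0, 10.75, 12.0])
        print(read_range(path, 1, 4))  # days 1..3 only, no full-file scan
        patch_day(path, 2, 11.25)      # manual repair of one day
        print(read_range(path, 2, 3))
```

Because every record has the same width, both the read and the repair reduce to a single seek. The trade-off is the one mentioned above: the file carries no metadata, so any fix requires knowing the day index and the layout by heart.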