microsoft / qlib

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and RL.
https://qlib.readthedocs.io/en/latest/
MIT License
15.35k stars 2.63k forks

Get original data from dataset instance #364

Closed fzc621 closed 3 years ago

fzc621 commented 3 years ago

Hi,

I am working on creating my own RecordTemplates, which need some original data from the dataset, say "vwap" or "volume". So I have to save "vwap.pkl" or "volume.pkl" within my own SignalRecorder. Is it possible to save those objects, similar to the "label.pkl" saved by Qlib's SignalRecord? I was able to do it in a "dirty" way:

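from qlib.data import D  # provides D.features and D.instruments
# Load the raw $vwap field, then reorder the index to (datetime, instrument).
vwap_df = D.features(D.instruments(market="all"), ["$vwap"], start_time="2019-07-01", end_time="2019-12-31")
vwap = vwap_df.swaplevel().sort_index()
# Called inside the custom record template, where self.recorder is available.
self.recorder.save_objects(**{"vwap.pkl": vwap})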

Is there a better way to do this, perhaps using the dataset class? I got stuck trying to retrieve the original data via the dataset class.

Thanks

you-n-g commented 3 years ago

@fzc621
This is really a great question. I think minor modifications are required if you want to record something about the database.

  1. Create a base class DatasetRecord for storing info from the database.

  2. Change the workflow of the trainer.

    if issubclass(recorder, DatasetRecord):
        pass the dataset when initializing the recorder

    The example and functions below may be helpful.

  3. Create your own subclass of DatasetRecord and save your info in the recorder.
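
The three steps above can be sketched roughly as follows. This is only a minimal, self-contained illustration and not Qlib's actual code: InMemoryRecorder and RecordTemp are hypothetical stand-ins that only mimic the save_objects call and the record-template shape used in this thread, build_record is an assumed name for the trainer-side hook, VwapRecord is an example subclass, and the dataset is a plain dict instead of a real Qlib dataset.

```python
class InMemoryRecorder:
    """Hypothetical stand-in for a recorder; only mimics save_objects(**kwargs)."""

    def __init__(self):
        self.objects = {}

    def save_objects(self, **kwargs):
        # Store each artifact under its file name, e.g. "vwap.pkl".
        self.objects.update(kwargs)


class RecordTemp:
    """Hypothetical stand-in for a record-template base class."""

    def __init__(self, recorder):
        self.recorder = recorder

    def generate(self):
        raise NotImplementedError


class DatasetRecord(RecordTemp):
    """Step 1: base class for records that need access to the dataset."""

    def __init__(self, recorder, dataset):
        super().__init__(recorder)
        self.dataset = dataset


class VwapRecord(DatasetRecord):
    """Step 3: example subclass that saves one field taken from the dataset."""

    def generate(self):
        # Assumption: the dataset exposes raw fields by name (a dict here).
        self.recorder.save_objects(**{"vwap.pkl": self.dataset["vwap"]})


def build_record(record_cls, recorder, dataset):
    """Step 2: pass the dataset only to DatasetRecord subclasses."""
    if issubclass(record_cls, DatasetRecord):
        return record_cls(recorder, dataset)
    return record_cls(recorder)


recorder = InMemoryRecorder()
dataset = {"vwap": [10.1, 10.3], "volume": [1200, 900]}  # toy data

record = build_record(VwapRecord, recorder, dataset)
record.generate()
plain = build_record(RecordTemp, recorder, dataset)  # gets no dataset
```

Keeping the issubclass check in one place means existing record templates that do not need the dataset keep their current constructor, and only the new subclasses receive it.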

Your PR is welcome! :)

fzc621 commented 3 years ago

Thanks for your reply! I will try to submit a PR!