chenditc / investment_data

Scripts and doc for https://www.dolthub.com/repositories/chenditc/investment_data

Is there something wrong with the data? #9

Closed: u2takey closed this issue 1 year ago

u2takey commented 1 year ago

Using the configuration below with the 2023-04-15 data, the backtest results are abnormally poor. Something seems off.

'The following are analysis results of benchmark return(1day).'
                       risk
mean               0.000884
std                0.008006
annualized_return  0.210426
information_ratio  1.703677
max_drawdown      -0.078120

'The following are analysis results of the excess return without cost(1day).'
                       risk
mean              -0.000896
std                0.004709
annualized_return -0.213222
information_ratio -2.934913
max_drawdown      -0.104559

'The following are analysis results of the excess return with cost(1day).'
                       risk
mean              -0.001283
std                0.004692
annualized_return -0.305263
information_ratio -4.217365
max_drawdown      -0.139517
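
For readers checking the figures above: the three result groups appear internally consistent. Each annualized_return is roughly mean × 238, and each information_ratio is roughly mean / std × √238. The factor 238 is inferred from the reported numbers themselves and is not stated anywhere in this thread, so treat it as an assumption. A minimal sketch of that check:

```python
import math

# Annualization factor inferred from the reported figures.
# This is an assumption; the thread does not state it.
N = 238

# (mean, std, reported annualized_return, reported information_ratio),
# copied from the results in this comment.
reports = {
    "benchmark": (0.000884, 0.008006, 0.210426, 1.703677),
    "excess_without_cost": (-0.000896, 0.004709, -0.213222, -2.934913),
    "excess_with_cost": (-0.001283, 0.004692, -0.305263, -4.217365),
}


def annualize(mean, std, n=N):
    """Return (annualized return, information ratio) from daily mean and std."""
    return mean * n, mean / std * math.sqrt(n)


for name, (mean, std, ann_reported, ir_reported) in reports.items():
    ann, ir = annualize(mean, std)
    print(f"{name}: annualized {ann:.6f} (reported {ann_reported}), "
          f"IR {ir:.6f} (reported {ir_reported})")
```

The small differences that remain come from the daily mean and std being rounded to six decimals in the printed output.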

qlib_init:
    provider_uri: "~/.qlib/qlib_data/cn_data"
    region: cn
market: &market csi500
benchmark: &benchmark SH000905
data_handler_config: &data_handler_config
    start_time: 2018-01-01
    end_time: 2023-04-01
    fit_start_time: 2018-01-01
    fit_end_time: 2021-12-31
    instruments: *market
port_analysis_config: &port_analysis_config
    strategy:
        class: TopkDropoutStrategy
        module_path: qlib.contrib.strategy
        kwargs:
            model: <MODEL> 
            dataset: <DATASET>
            topk: 50
            n_drop: 10
    backtest:
        start_time: 2022-11-01
        end_time: 2023-04-01
        account: 100000000
        benchmark: *benchmark
        exchange_kwargs:
            limit_threshold: 0.095
            deal_price: close
            open_cost: 0.0005
            close_cost: 0.0015
            min_cost: 5
task:
    model:
        class: LGBModel
        module_path: qlib.contrib.model.gbdt
        kwargs:
            loss: mse
            colsample_bytree: 0.9
            learning_rate: 0.1
            subsample: 0.9
            lambda_l1: 205.6999
            lambda_l2: 580.9768
            max_depth: 8
            num_leaves: 250
            num_threads: 20
    dataset:
        class: DatasetH
        module_path: qlib.data.dataset
        kwargs:
            handler:
                class: Alpha158
                module_path: qlib.contrib.data.handler
                kwargs: *data_handler_config
            segments:
                train: [2010-01-01, 2021-12-31]
                valid: [2022-01-01, 2022-10-31]
                test: [2022-11-01, 2023-04-01]
    record: 
        - class: SignalRecord
          module_path: qlib.workflow.record_temp
          kwargs: 
            model: <MODEL>
            dataset: <DATASET>
        - class: SigAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs: 
            ana_long_short: False
            ann_scaler: 252
        - class: PortAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs: 
            config: *port_analysis_config
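
One thing a reader might check in a config like the one above is whether every dataset segment falls inside the handler's start_time and end_time range. The sketch below uses only the dates copied from this config. The helper name `segments_outside` is hypothetical, and whether such a mismatch has any effect on the reported results is not established in this thread.

```python
from datetime import date

# Dates copied from data_handler_config (start_time, end_time).
handler_range = (date(2018, 1, 1), date(2023, 4, 1))

# Dates copied from task.dataset.kwargs.segments.
segments = {
    "train": (date(2010, 1, 1), date(2021, 12, 31)),
    "valid": (date(2022, 1, 1), date(2022, 10, 31)),
    "test": (date(2022, 11, 1), date(2023, 4, 1)),
}


def segments_outside(handler_range, segments):
    """Return names of segments that extend beyond the handler's date range."""
    start, end = handler_range
    return [name for name, (seg_start, seg_end) in segments.items()
            if seg_start < start or seg_end > end]


print(segments_outside(handler_range, segments))
```

Here the train segment starts in 2010 while the handler range starts in 2018, so it is the only segment the check reports.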
chenditc commented 1 year ago

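This project does not directly troubleshoot problems with models or other components.

If you see an error in the data for a specific day, please point out which day and which data is wrong, and which data source you compared it against.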
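
A report of the kind requested here could be produced by comparing values from this dataset against a reference source day by day. The following is a minimal sketch, not part of this project: the function name `find_mismatches`, the tolerance, and all prices are hypothetical and only illustrate the shape of the comparison.

```python
from datetime import date


def find_mismatches(dataset, reference, rel_tol=1e-3):
    """Return (day, dataset_value, reference_value) for each day in the
    reference whose dataset value is missing (None) or differs by more
    than rel_tol relative to the reference value."""
    mismatches = []
    for day in sorted(reference):
        ref = reference[day]
        got = dataset.get(day)
        if got is None or abs(got - ref) > rel_tol * abs(ref):
            mismatches.append((day, got, ref))
    return mismatches


# Hypothetical close prices for one instrument, made up for illustration.
dataset_close = {
    date(2023, 4, 12): 10.00,
    date(2023, 4, 13): 10.50,
    date(2023, 4, 14): 10.20,
}
reference_close = {
    date(2023, 4, 12): 10.00,
    date(2023, 4, 13): 10.05,
    date(2023, 4, 14): 10.20,
}

for day, got, ref in find_mismatches(dataset_close, reference_close):
    print(f"{day}: dataset={got}, reference={ref}")
```

Each line of output names a specific day, the value in the dataset, and the value in the reference source, which is the information the maintainer asks for.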