DataCanvasIO / HyperTS

A Full-Pipeline Automated Time Series (AutoTS) Analysis Toolkit.
https://hyperts.readthedocs.io
Apache License 2.0

How should parameters be set for classification and regression tasks? #87

Closed wangjianqiao111 closed 1 year ago

wangjianqiao111 commented 1 year ago

The parameters are set to {'task': 'regression', 'mode': 'dl', 'timestamp': 'time', 'target': 'wendu', 'covariables': None, 'tf_gpu_usage_strategy': 0, 'tf_memory_limit': 2048.0, 'eval_size': 0.3, 'searcher': 'random', 'random_state': 99, 'cv': True, 'num_folds': 3, 'ensemble_size': None}

I get the following error when running time series regression:

2023-03-31 14:13:15.998 [ERROR] 03-31 14:13:15 E hypernets.m.hyper_model.py 83 - run_trail failed! trail_no=1
2023-03-31 14:13:16.000 [ERROR] 03-31 14:13:16 E hypernets.m.hyper_model.py 85 - Traceback (most recent call last):
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/hypernets/model/hyper_model.py", line 76, in _run_trial
2023-03-31 14:13:16.000 [ERROR] **fit_kwargs)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/hyper_ts.py", line 175, in fit_cross_validation
2023-03-31 14:13:16.000 [ERROR] fold_est.fit(x_train_fold, y_train_fold, **kwargs)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/framework/wrappers/dl_wrappers.py", line 115, in fit
2023-03-31 14:13:16.000 [ERROR] X = self.fit_transform(X)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/framework/wrappers/_base.py", line 152, in fit_transform
2023-03-31 14:13:16.000 [ERROR] transform_X = self.transformers.fit_transform(X)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/pipeline.py", line 367, in fit_transform
2023-03-31 14:13:16.000 [ERROR] Xt = self._fit(X, y, **fit_params_steps)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/pipeline.py", line 296, in _fit
2023-03-31 14:13:16.000 [ERROR] **fit_params_steps[name])
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/joblib/memory.py", line 352, in __call__
2023-03-31 14:13:16.000 [ERROR] return self.func(*args, **kwargs)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/pipeline.py", line 740, in _fit_transform_one
2023-03-31 14:13:16.000 [ERROR] res = transformer.fit_transform(X, y, **fit_params)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/base.py", line 690, in fit_transform
2023-03-31 14:13:16.000 [ERROR] return self.fit(X, **fit_params).transform(X)
2023-03-31 14:13:16.000 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/utils/transformers.py", line 211, in transform
2023-03-31 14:13:16.000 [ERROR] transform_X = (X - self.min) / (self.max - self.min + self.eps)
2023-03-31 14:13:16.000 [ERROR] TypeError: unsupported operand type(s) for +: 'Timedelta' and 'float'

Could you please help find out what is causing this error? Thanks.

wangjianqiao111 commented 1 year ago

When the timestamp column is set to None, the following error appears:

2023-03-31 14:23:07.450 [ERROR] 03-31 14:23:07 E hypernets.m.hyper_model.py 83 - run_trail failed! trail_no=1
2023-03-31 14:23:07.451 [ERROR] 03-31 14:23:07 E hypernets.m.hyper_model.py 85 - Traceback (most recent call last):
2023-03-31 14:23:07.451 [ERROR] File "/usr/local/lib/python3.7/site-packages/hypernets/model/hyper_model.py", line 76, in _run_trial
2023-03-31 14:23:07.451 [ERROR] **fit_kwargs)
2023-03-31 14:23:07.451 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/hyper_ts.py", line 175, in fit_cross_validation
2023-03-31 14:23:07.451 [ERROR] fold_est.fit(x_train_fold, y_train_fold, **kwargs)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/framework/wrappers/dl_wrappers.py", line 115, in fit
2023-03-31 14:23:07.452 [ERROR] X = self.fit_transform(X)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/framework/wrappers/_base.py", line 152, in fit_transform
2023-03-31 14:23:07.452 [ERROR] transform_X = self.transformers.fit_transform(X)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/pipeline.py", line 367, in fit_transform
2023-03-31 14:23:07.452 [ERROR] Xt = self._fit(X, y, **fit_params_steps)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/pipeline.py", line 296, in _fit
2023-03-31 14:23:07.452 [ERROR] **fit_params_steps[name])
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/joblib/memory.py", line 352, in __call__
2023-03-31 14:23:07.452 [ERROR] return self.func(*args, **kwargs)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/pipeline.py", line 740, in _fit_transform_one
2023-03-31 14:23:07.452 [ERROR] res = transformer.fit_transform(X, y, **fit_params)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/sklearn/base.py", line 690, in fit_transform
2023-03-31 14:23:07.452 [ERROR] return self.fit(X, **fit_params).transform(X)
2023-03-31 14:23:07.452 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/utils/transformers.py", line 211, in transform
2023-03-31 14:23:07.452 [ERROR] transform_X = (X - self.min) / (self.max - self.min + self.eps)
2023-03-31 14:23:07.452 [ERROR] TypeError: unsupported operand type(s) for +: 'Timedelta' and 'float'

wangjianqiao111 commented 1 year ago

When mode='stats' is set, the following error appears:

2023-03-31 14:26:25.754 [ERROR] Traceback (most recent call last):
2023-03-31 14:26:25.754 [ERROR] File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
2023-03-31 14:26:25.754 [ERROR] "__main__", mod_spec)
2023-03-31 14:26:25.754 [ERROR] File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
2023-03-31 14:26:25.754 [ERROR] exec(code, run_globals)
2023-03-31 14:26:25.754 [ERROR] File "/opt/pylib/dc_runtime.zip/datacanvas/shell.py", line 134, in
2023-03-31 14:26:25.756 [ERROR] File "/opt/pylib/dc_runtime.zip/datacanvas/shell.py", line 120, in
2023-03-31 14:26:25.757 [ERROR] File "/opt/pylib/dc_runtime.zip/datacanvas/shell.py", line 29, in get_args_func
2023-03-31 14:26:25.758 [ERROR] File "/opt/pylib/dc_runtime.zip/datacanvas/shell.py", line 61, in _execfile
2023-03-31 14:26:25.760 [ERROR] File "main.py", line 138, in
2023-03-31 14:26:25.761 [ERROR] estimator.fit(X=train_df, **params_dict)
2023-03-31 14:26:25.761 [ERROR] File "/opt/aps/code/project/eba707ef-010b-44df-b63a-5ddfb5b9c5e3/40c677ff-737d-47fb-a9a5-f718008ee479/hyperTSEstimator.py", line 42, in fit
2023-03-31 14:26:25.763 [ERROR] experiment = make_experiment(**params_dict)
2023-03-31 14:26:25.763 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/experiment.py", line 625, in make_experiment
2023-03-31 14:26:25.763 [ERROR] search_space = default_search_space(task=task, metrics=reward_metric, covariates=actual_covariates)
2023-03-31 14:26:25.763 [ERROR] File "/usr/local/lib/python3.7/site-packages/hyperts/experiment.py", line 237, in default_search_space
2023-03-31 14:26:25.763 [ERROR] 'STATSRegressionSearchSpace is not implemented yet.'
2023-03-31 14:26:25.763 [ERROR] NotImplementedError: STATSRegressionSearchSpace is not implemented yet.

zhangxjohn commented 1 year ago

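With only the information given above, I can't give you an answer. Please describe in more detail how you are using the library, and try reading the documentation first to learn the intended usage: https://hyperts.readthedocs.io/en/latest/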

zhangxjohn commented 1 year ago

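Time series regression and classification have no timestamp column, so the data you pass in must not contain one. That is why you get the error TypeError: unsupported operand type(s) for +: 'Timedelta' and 'float'. Please take a look at the data format requirements.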
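The TypeError can be reproduced outside HyperTS with plain pandas. The sketch below assumes a min-max scaling step of the same shape as the line shown in the traceback; `min_max_scale`, `EPS`, and the sample data are hypothetical stand-ins for illustration, not HyperTS code.

```python
import pandas as pd

EPS = 1e-8  # hypothetical stand-in for the scaler's self.eps

# Toy data: a datetime column next to a numeric column.
df = pd.DataFrame({
    "time": pd.date_range("2023-03-01", periods=4, freq="D"),
    "wendu": [11.0, 12.5, 13.1, 12.2],
})


def min_max_scale(X):
    """Same arithmetic as the traceback: (X - min) / (max - min + eps)."""
    lo, hi = X.min(), X.max()
    return (X - lo) / (hi - lo + EPS)


# On the datetime column, max - min is a Timedelta, and Timedelta + float fails.
try:
    min_max_scale(df["time"])
    failed = False
except TypeError as err:
    failed = True
    print(err)

# On the numeric column the same formula works.
scaled = min_max_scale(df["wendu"])
```

Removing the timestamp column avoids this particular error, but the data still has to follow the nested format that the documentation requires for regression and classification.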

zhangxjohn commented 1 year ago

When mode='stats' is set, the following error appears: 2023-03-31 14:26

The stats mode does not support regression: NotImplementedError: STATSRegressionSearchSpace is not implemented yet.

wangjianqiao111 commented 1 year ago

Time Series Classification and Regression

Required Format

Differing from the forecasting tasks, the input data for classification and regression tasks are nested DataFrame, which means the variations over a time segment are listed in one cell. See example below.

var_col_0         var_col_1         var_col_2         ...   var_col_n         target
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y
x, x, x, ..., x   x, x, x, ..., x   x, x, x, ..., x   ...   x, x, x, ..., x   y

Every row stands for one sample data, which has n+1 feature variables. The observations x, x, x, ..., x of one variable (var_col_0) over a time period are listed in one cell (the top-left). Target y represents the label of the sample.

Question: can't each cell hold just a single x, as in the format below?

var_col_0   var_col_1   var_col_2   ...   var_col_n   target
x           x           x           ...   x           y
x           x           x           ...   x           y
x           x           x           ...   x           y

zhangxjohn commented 1 year ago

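No. For classification or regression, every cell of the DataFrame must hold a sequence. Otherwise the data does not meet the definition of a time series, and the model cannot learn from it autoregressively.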
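To make the "one sequence per cell" requirement concrete, here is a minimal sketch that packs windowed observations into a nested DataFrame with pandas. It assumes each cell holds a `pd.Series`, which is one common convention for nested time series data; the helper `to_nested`, the column names, and the toy values are hypothetical, and no HyperTS API is called.

```python
import numpy as np
import pandas as pd


def to_nested(variables, target):
    """Pack 2-D arrays into a nested DataFrame.

    variables: mapping of column name -> array of shape (n_samples, n_timesteps).
    target:    one label or value per sample.
    """
    columns = {
        # One pd.Series per sample, so every cell holds a whole sequence.
        name: [pd.Series(row) for row in np.asarray(values)]
        for name, values in variables.items()
    }
    nested = pd.DataFrame(columns)
    nested["target"] = list(target)
    return nested


# Toy data: 3 samples, 2 variables, 4 time steps per sample.
variables = {
    "var_col_0": np.arange(12, dtype=float).reshape(3, 4),
    "var_col_1": np.arange(12, 24, dtype=float).reshape(3, 4),
}
nested = to_nested(variables, target=[0.5, 1.2, 0.9])

first_cell = nested.loc[0, "var_col_0"]
```

Each row of `nested` is one sample: `first_cell` is a 4-step series rather than a single number, which is the difference between the required layout and the flat layout asked about above.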

wangjianqiao111 commented 1 year ago

OK.

wangjianqiao111 commented 1 year ago

from hyperts import make_experiment
experiment = make_experiment(train_data=train_data.copy(),
                             mode='dl',
                             task='regression',
                             max_trials=5,
                             target='y',
                             searcher='random',
                             contamination=0.1)
                             # covariables=['HourSin', 'WeekCos', 'CBWD'])
model = experiment.run()

Traceback (most recent call last):
File "/Applications/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 8, in
contamination=0.1)
File "/Applications/anaconda3/lib/python3.7/site-packages/hyperts/experiment.py", line 530, in make_experiment
metrics=reward_metric, covariates=autual_covariates)
File "/Applications/anaconda3/lib/python3.7/site-packages/hyperts/experiment.py", line 239, in default_search_space
'DLRegressionSearchSpace is not implemented yet.'
NotImplementedError: DLRegressionSearchSpace is not implemented yet.

Does the dl mode not support regression either? If so, which mode can be used for regression and classification?