ramdhan1989 opened this issue 4 years ago
Hi @ramdhan1989
Thanks for your interest in our tool, and forgive me for the long delay.
First of all, before hyperparameter optimization (hereafter called hyperopt), you should perform a time series analysis (ACF/PACF plots, tests of stationarity and heteroscedasticity, etc.). Hyperopt is not a substitute for knowing how your time series data behaves.
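As a minimal, self-contained sketch of one such preliminary check (plain NumPy, no pyFTS required), the sample autocorrelation function can reveal whether a series carries strong linear dependence across lags — a slowly decaying ACF typically suggests differencing before modeling:

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return [np.dot(x[:len(x) - k], x[k:]) / var for k in range(max_lag + 1)]

# A strongly autocorrelated series (random walk) vs. white noise
rng = np.random.default_rng(42)
walk = np.cumsum(rng.normal(size=500))
noise = rng.normal(size=500)

print(acf(walk, 3)[1])   # close to 1: slow decay, differencing likely needed
print(acf(noise, 3)[1])  # close to 0: no strong linear dependence
```

In practice you would use `statsmodels` (ACF/PACF plots, the ADF stationarity test) instead of hand-rolling this, but the diagnostic idea is the same.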
The hyperparameter optimization of FTS is described here, and is called DEHO - Distributed Evolutionary Hyperparameter Optimization, but the library also provides methods other than the evolutionary one. The method returns a dictionary with the best parameters found for forecasting the dataset using the selected FTS method (given in the fts_method parameter).
Below is a list of the implemented methods:
import numpy as np
from pyFTS.hyperparam import GridSearch
from pyFTS.models import hofts
from pyFTS.data import TAIEX
datasetname = 'TAIEX'
dataset = TAIEX.get_data()
#The list of hyperparameters search spaces
hyperparams = {
'order': [1, 2, 3],
'partitions': np.arange(10,100,3),
'partitioner': [1,2], #GridSearch, EntropySearch, ...
'mf': [1, 2, 3], #Triangular, Trapezoidal and Gaussian
'lags': np.arange(2, 7, 1), # The lag indexes
'alpha': np.arange(.0, .5, .05) #Alpha Cut
}
GridSearch.execute(
hyperparams, #A dictionary containing the search spaces for each hyperparameter
datasetname, #Just the name of your dataset
dataset, #Your time series data (list or np.ndarray 1d)
fts_method=hofts.WeightedHighOrderFTS, # the FTS method you want to optimize [only univariate methods]
window_size=10000, #The length of the data window for the Sliding Window Cross Validation method
train_rate=.9, #The proportion of the data window that will be used for training, the remaining will be used for test
increment_rate=.3, #The sliding increment of the Sliding Window Cross Validation method
database_file='hyperopt.db' #A sqlite database that will contain the log of the hyperopt process
)
There is no GridSearch implementation yet for multivariate methods.
import pandas as pd
from pyFTS.hyperparam import mvfts as deho_mv
from pyFTS.models.multivariate import mvfts, wmvfts
from pyFTS.models.seasonal.common import DateTime
from pyFTS.data import Malaysia
datasetname = 'Malaysia'
dataset = Malaysia.get_dataframe()
dataset['time'] = pd.to_datetime(dataset['time'], format='%m/%d/%y %I:%M %p')
explanatory_variables = [
    {'name': 'Temperature', 'data_label': 'temperature', 'type': 'common'},
    {'name': 'Daily', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.minute_of_day, 'npart': 24},
    {'name': 'Weekly', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.day_of_week, 'npart': 7},
    {'name': 'Monthly', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.day_of_month, 'npart': 4},
    {'name': 'Yearly', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.day_of_year, 'npart': 12}
]
target_variable = {'name': 'Load', 'data_label': 'load', 'type': 'common'}
deho_mv.random_search(
    datasetname, #Just the name of your dataset
    dataset, #Your time series data (pd.DataFrame)
    npop=200, #Size of the population of the RS
    mgen=70, #Number of iterations of the RS
    fts_method=wmvfts.WeightedMVFTS, #The multivariate FTS method to optimize
    variables=explanatory_variables, #The list of exogenous/explanatory variables
    target_variable=target_variable, #The endogenous/target variable
    window_size=10000, #The length of the data window for the Sliding Window Cross Validation method
    train_rate=.9, #The proportion of the data window that will be used for training, the remaining will be used for test
    increment_rate=.3, #The sliding increment of the Sliding Window Cross Validation method
)
- **Genetic Algorithm (GA)** is between GS and RS, both in accuracy and computational cost.
from pyFTS.hyperparam import Evolutionary
from pyFTS.models import hofts
from pyFTS.data import TAIEX
datasetname = 'TAIEX'
dataset = TAIEX.get_data()
ret = Evolutionary.execute(
    datasetname, #Just the name of your dataset
    dataset, #Your time series data (list or np.ndarray 1d)
    fts_method=hofts.WeightedHighOrderFTS, #The FTS method you want to optimize [only univariate methods]
    ngen=30, #Number of generations, the number of iterations of the GA
    npop=20, #The size of the population of the GA
    psel=0.6, #Probability of selection of the GA
    pcross=.5, #Probability of crossover of the GA
    pmut=.3, #Probability of mutation of the GA
    window_size=10000, #The length of the data window for the Sliding Window Cross Validation method
    train_rate=.9, #The proportion of the data window that will be used for training, the remaining will be used for test
    increment_rate=.3, #The sliding increment of the Sliding Window Cross Validation method
    experiments=1, #Number of hyperopt experiments to perform
    database_file='hyperopt.db' #A sqlite database that will contain the log of the hyperopt process
)
Please, do not hesitate to get in touch if you have any questions.
Best regards
Thanks, all three of those methods work!
After executing the hyperparameter optimization, is the model fitted automatically using the best params, or do we need to take the values from the output dict and fit the model ourselves?
Would you mind elaborating on the dict? I am confused about which value belongs to which parameter. From your code using the GA:
Experiment 0
Evaluating initial population 1600098526.9596627
GENERATION 0 1600098526.9596627
WITHOUT IMPROVEMENT 1
GENERATION 1 1600098526.9606583
WITHOUT IMPROVEMENT 2
GENERATION 2 1600098526.9626496
WITHOUT IMPROVEMENT 3
GENERATION 3 1600098526.963645
WITHOUT IMPROVEMENT 4
GENERATION 4 1600098526.9656367
WITHOUT IMPROVEMENT 5
GENERATION 5 1600098526.9666321
WITHOUT IMPROVEMENT 6
GENERATION 6 1600098526.9686234
WITHOUT IMPROVEMENT 7
('TAIEX', 'Evolutive', 'hofts', None, 1, 3, 2, 40, 0.5, '[2, 6, 7]', 'rmse', inf)
('TAIEX', 'Evolutive', 'hofts', None, 1, 3, 2, 40, 0.5, '[2, 6, 7]', 'size', inf)
('TAIEX', 'Evolutive', 'hofts', None, 1, 3, 2, 40, 0.5, '[2, 6, 7]', 'time', 0.010952949523925781)
below is the return dict :
{'alpha': 0.5, 'f1': inf, 'f2': inf, 'lags': [2, 6, 7], 'mf': 1, 'npart': 40, 'order': 3, 'partitioner': 2, 'rmse': inf, 'size': inf, 'time': 0.010952949523925781}
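One way to read the returned dict is against the search spaces declared earlier: the integer codes for `mf` and `partitioner` correspond to the comments in the `hyperparams` dictionary. This is a hedged interpretation based only on those comments, not on the pyFTS internals:

```python
# Hypothetical decoding of the coded values in the returned dict,
# based only on the comments in the search-space declaration above.
MF_NAMES = {1: 'Triangular', 2: 'Trapezoidal', 3: 'Gaussian'}
PARTITIONER_NAMES = {1: 'GridSearch', 2: 'EntropySearch'}

best = {'alpha': 0.5, 'lags': [2, 6, 7], 'mf': 1, 'npart': 40,
        'order': 3, 'partitioner': 2}

print(MF_NAMES[best['mf']])                    # membership function family
print(PARTITIONER_NAMES[best['partitioner']])  # partitioning scheme
print(best['order'], best['lags'])             # model order and the lag indexes used
```

The remaining keys (`rmse`, `size`, `time`, and the fitness values `f1`/`f2`) are evaluation results of the best individual, not hyperparameters.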
Hi @ramdhan1989
Using this dictionary you can build a model with this code:
from pyFTS.hyperparam import Evolutionary
model = Evolutionary.phenotype(
dictionary, #the result of the hyperparameter method
train, #The train dataset
fts_method #the FTS method
)
Best regards
Well, thanks a lot @petroniocandido. Does the hyperparameter optimization search for the best data transformation as well? Such as how many lags for differencing, or maybe which kind of transformation is best for the problem?
thank you
Hi @petroniocandido, how can I get a stable prediction using the GA? Every time I run it, it returns different values. Do you have a suggestion?
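The GA is stochastic, so some run-to-run variation is expected. One generic mitigation (a plain Python sketch, not a documented pyFTS feature) is to seed the random number generators before each run, so that every random draw in the search repeats, and/or to run the search several times and keep the best result:

```python
import random
import numpy as np

def seeded_run(seed):
    """Seed the stdlib and NumPy RNGs so a stochastic search is repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    # ... run the evolutionary search here; with the same seed,
    # every random draw (selection, crossover, mutation) repeats.
    # Placeholder draws standing in for the search's randomness:
    return [random.random() for _ in range(3)], np.random.rand(3).tolist()

a = seeded_run(42)
b = seeded_run(42)
print(a == b)  # True: identical seeds give identical draws
```

Note that `Evolutionary.execute` also accepts `experiments=` to repeat the hyperopt process, which helps gauge how much the result varies between runs.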
Hi @petroniocandido, I have come back to try using this package. I just want to clarify several things:
- how can I use the Differential transformation in hyperparameter optimization?
- using Evolutionary, I got an RMSE of "nan". Is that okay?
- is it possible to use another evaluation metric, such as RMSLE (root mean squared log error)?
I would appreciate your answers.
Thank you
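On the RMSLE question above: I am not aware of a built-in RMSLE option in the hyperopt module, but any metric can be computed post-hoc on a fitted model's forecasts. A minimal NumPy sketch (the `actual`/`forecast` arrays here are placeholder data standing in for a test set and the model's predictions):

```python
import numpy as np

def rmsle(actual, forecast):
    """Root mean squared log error; both inputs must be non-negative."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.sqrt(np.mean((np.log1p(actual) - np.log1p(forecast)) ** 2))

# Placeholder values standing in for test data and model output
actual = [100.0, 120.0, 130.0]
forecast = [110.0, 118.0, 125.0]
print(rmsle(actual, forecast))
```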
I am struggling to find guidance on how to use the hyperparam module, such as grid search or evolutionary. Can anyone share some?
thank you