RobinloveCode commented 5 years ago

Hi, I have recently been working on fuzzy time series, and this code has helped me a lot. Thanks.

But I have a question about prediction on the test data. Your example code uses forecasts = model1.predict(dataset[train_split:train_split+1]), which assumes it is acceptable to use the true data of the previous day. However, I think we can only use the previous prediction output as the next input to the model, so I wrote the code below:

prev_forecasts = dataset[train_split-1:train_split]

for n in range(test_length):

    new_forecast = model1.predict(prev_forecasts[n:n+1])

    prev_forecasts = np.append(prev_forecasts,new_forecast)

#forecasts = model1.predict(dataset[train_split:train_split+1])
forecasts = prev_forecasts

Unfortunately, the prediction is almost a straight line.

With your approach on the test data, I think the naive forecast will perform best, since the model always has the most recent true value when making a prediction.

By the way, in which folder of the library is the predict method (model.predict) defined?

Looking forward to your reply. Thank you!

petroniocandido commented 5 years ago

Hi, thanks for getting in touch!

To answer your questions, I need to know which partitioner, how many partitions, and which model (with which parameters) you used in this experiment.

Can you share the code & data for reproduction?

RobinloveCode commented 5 years ago

Thank you for your response.

I use the grid partitioner. The code is shown below:

import matplotlib.pyplot as plt
from pyFTS.partitioners import Grid, Util as pUtil

fig, ax = plt.subplots(nrows=2, ncols=3, figsize=[20,5])

partitioners = {}
partitioners_diff = {}

for count, dataset_name in enumerate(dataset_names):
    dataset = get_dataset(dataset_name)

    partitioner = Grid.GridPartitioner(data=dataset, npart=30)
    partitioners[dataset_name] = partitioner
    partitioner_diff = Grid.GridPartitioner(data=dataset, npart=30, transformation=tdiff)
    partitioners_diff[dataset_name] = partitioner_diff

    pUtil.plot_sets(dataset, [partitioner.sets], titles=[dataset_name], axis=ax[0][count])
    pUtil.plot_sets(dataset, [partitioner_diff.sets], titles=[''], axis=ax[1][count])

Then I train the model with the original data (TAIEX):

from pyFTS.models import chen

for count, dataset_name in enumerate(dataset_names):
    dataset = get_dataset(dataset_name)

    model1 = chen.ConventionalFTS(partitioner=partitioners[dataset_name])
    model1.name = dataset_name
    model1.fit(dataset[:train_split], save_model=True, file_path='model1'+dataset_name, order=1)

After that, I want to make multi-step-ahead predictions by making a one-step prediction first and then using each prediction as the input for the next step. The prediction code is shown below:

import numpy as np
from pyFTS.common import Util as cUtil

fig, ax = plt.subplots(nrows=3, ncols=1, figsize=[20,10])

forecasts = []

for count, dataset_name in enumerate(dataset_names):
    dataset = get_dataset(dataset_name)

    ax[count].plot(dataset[train_split:train_split+200])

    model1 = cUtil.load_obj('model1'+dataset_name)

    # Seed with the last training value, then feed each forecast back in
    prev_forecasts = dataset[train_split-1:train_split]

    for n in range(200):
        new_forecast = model1.predict(prev_forecasts[n:n+1])
        prev_forecasts = np.append(prev_forecasts, new_forecast)

    #forecasts = model1.predict(dataset[train_split:train_split+1])
    forecasts = prev_forecasts

    ax[count].plot(forecasts)
    ax[count].set_title(dataset_name)

plt.tight_layout()

The results are not good.

So in this library, the one-step prediction is based on the true data of the previous day, even during testing? Am I right?

Looking forward to your reply!
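For readers following the thread, the two evaluation modes being contrasted here can be sketched side by side. This is only an illustration and not pyFTS code: `one_step` is a hypothetical stand-in for any fitted first-order model (a simple mean-reverting rule with made-up parameters), and the series values are invented.

```python
def one_step(prev_value, mean=100.0, phi=0.8):
    """Hypothetical stand-in for a fitted first-order model's one-step forecast."""
    return mean + phi * (prev_value - mean)


def rolling_one_step(series, start, horizon):
    """Forecast series[start + k] from the TRUE value series[start + k - 1]."""
    return [one_step(series[start + k - 1]) for k in range(horizon)]


def recursive_multi_step(series, start, horizon):
    """Forecast series[start + k] from the previous FORECAST.

    Only series[start - 1] is observed; every later input is a forecast.
    """
    forecasts = []
    prev = series[start - 1]
    for _ in range(horizon):
        prev = one_step(prev)
        forecasts.append(prev)
    return forecasts


# Invented data; the first five points play the role of the training set.
series = [100, 104, 109, 103, 98, 95, 101, 107, 110, 105]
train_split = 5

rolling = rolling_one_step(series, train_split, 5)
recursive = recursive_multi_step(series, train_split, 5)
print(rolling)
print(recursive)
```

Both lists share the same first forecast, because both start from the last training value. After that, the rolling forecasts keep following the test data, while the recursive ones never see it.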

petroniocandido commented 5 years ago

Some tips:

The Chen method is old and outdated. Try newer methods such as Yu, Cheng, HOFTS, etc.
The correct way to forecast many steps ahead is model.predict(input_data, steps_ahead=n)
Many-steps-ahead forecasting will, for long time horizons, decay to the expected value (or mean) of the time series. The meaning of "long" depends on the stationarity and distribution of the data.
I recommend reading the second part of the tutorial: https://towardsdatascience.com/a-short-tutorial-on-fuzzy-time-series-part-ii-with-an-case-study-on-solar-energy-bda362ecca6d
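The decay toward the mean described in the tips is easy to reproduce with a simple stand-in. The sketch below uses an AR(1)-style mean-reverting rule instead of a fuzzy time series model, purely as an assumption for illustration. The names `ar1_forecast` and `steps_until_flat` and all parameter values are hypothetical and are not part of pyFTS.

```python
def ar1_forecast(last_value, mean, phi, steps_ahead):
    """h-step-ahead forecasts of a mean-reverting AR(1)-style rule.

    Iterating x -> mean + phi * (x - mean) h times gives
    mean + phi**h * (last_value - mean).
    """
    return [mean + phi ** h * (last_value - mean)
            for h in range(1, steps_ahead + 1)]


def steps_until_flat(last_value, mean, phi, tol):
    """Smallest horizon h at which the forecast is within tol of the mean.

    Assumes |phi| < 1; otherwise the forecast never settles and this loops forever.
    """
    h = 1
    while abs(phi ** h * (last_value - mean)) > tol:
        h += 1
    return h


# The same starting gap of 10 units shrinks at very different speeds.
fast = steps_until_flat(last_value=110.0, mean=100.0, phi=0.5, tol=0.1)
slow = steps_until_flat(last_value=110.0, mean=100.0, phi=0.95, tol=0.1)
forecasts = ar1_forecast(110.0, 100.0, 0.5, 5)
print(forecasts)
print(fast, slow)
```

With weak persistence (`phi=0.5`) the forecast is effectively flat after a handful of steps. With strong persistence (`phi=0.95`) it takes far longer, which is one way to read the remark that what counts as "long" depends on the data.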

PYFTS / pyFTS, issue #17: Question about the predict in test set.