PYFTS / pyFTS

An open source library for Fuzzy Time Series in Python
http://pyfts.github.io/pyFTS/
GNU General Public License v3.0

Look ahead bias in performance measure? #25

Open wangtieqiao opened 5 years ago

wangtieqiao commented 5 years ago

I tried running the notebook Chen - ConventionalFTS.ipynb and saw good results, but I am a bit skeptical that they are really that good. Following the logic of the bchmk.sliding_window_benchmarks code, as well as the plots you are showing, I suspect you might be comparing the predicted T+1 time series with the current T time series.

To be more specific: given test data of length L, T[0, 1, ..., L-1], model.predict produces a T+1 value for each value supplied, so the prediction has the same length as the given test data.

However, when you plot or check the performance, you cannot directly compare the test data with the prediction. E.g., in your code (Measures.py line 396) you compute rmse(datum, forecast).

The correct measure should probably be rmse(datum[1:], forecast[:-1]).

Also, in your notebook plot, if you shift the prediction by -1 steps you will see a different picture. It will look like most time series models that end up with prediction = lag(1) + noise, which is exactly what we hope to overcome.
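A minimal sketch (plain NumPy, not pyFTS itself) of what I mean by the shifted comparison. The toy "forecast" here simply echoes the current value, i.e. a lag-1 persistence model; comparing it against the unshifted test data scores misleadingly well:

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error between two equal-length arrays."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.sqrt(np.mean((actual - forecast) ** 2))

# Toy persistence "model": forecast[t] is the prediction issued at time t
# for time t+1, and it just repeats the last observed value.
test = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
forecast = test.copy()

# Unshifted comparison: looks perfect, but it is comparing each forecast
# against the value the model was fed, not against the value it predicts.
naive_score = rmse(test, forecast)            # 0.0 -- look-ahead bias

# Shifted comparison: each forecast is lined up with its T+1 target.
aligned_score = rmse(test[1:], forecast[:-1])  # reveals the real error
```

With the unshifted measure, any model that roughly copies its input looks near-perfect; the shifted measure exposes it.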

Let me know if I have misunderstood the code/logic...


BTW, nice work. I am still trying it out, and I hope I can use it in my project...

petroniocandido commented 5 years ago

Hi!

I will look at your question and the code in detail, but a priori the model evaluation uses rmse(data[model.order:], forecast[:-1]) (for Chen.ConventionalFTS, model.order = 1). If this is not happening, then we have a bug.
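For the order-1 case discussed here, that alignment can be sketched as follows. This is an illustrative helper, not the library's actual internals; the `order` parameter reflects the idea that a model needs `order` past values before it can issue its first forecast:

```python
import numpy as np

def aligned_rmse(test_data, forecast, order=1):
    """RMSE with the forecast shifted to line up with its targets.

    forecast[0] is taken as the prediction for test_data[order]; the final
    forecast (for the point just past the test window) is discarded.
    Illustrative sketch only -- not pyFTS's own evaluation code.
    """
    actual = np.asarray(test_data[order:], dtype=float)
    predicted = np.asarray(forecast[:-1], dtype=float)
    assert len(actual) == len(predicted), "length mismatch for this order"
    return np.sqrt(np.mean((actual - predicted) ** 2))
```

With this alignment, a persistence forecast no longer scores a spurious zero error, while a forecast that genuinely anticipates the next value does.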

We are working to improve this tool, and every contribution (code, bug reports, issues, etc.) is useful to us! Thanks for your appreciation!