wwrechard / pydlm

A python library for Bayesian time series modeling
BSD 3-Clause "New" or "Revised" License

how to use myDLM.predict to estimate predict_mean, predict_var #11

Closed sun137653577 closed 7 years ago

sun137653577 commented 7 years ago

Hi Samuel, DLM is really great work for time series analysis. However, I have some trouble with predict. My code is below; I want to predict the mean and variance on the 11th day:

# use 10 days of data to predict the mean and variance on the 11th day
import numpy as np
from pydlm import dlm, trend, seasonality, dynamic, autoReg, longSeason
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
myDLM = dlm(np.array(y))
myDLM = myDLM + trend(1, discount=0.9) + seasonality(period=4)
myDLM.fit()
(temp_obser, temp_var) = myDLM.predict(date=10)

A 'NameError' is always raised with the message 'Prediction can only be made right after the filtered date', while it works for date=0. I am wondering whether date=9 can be used to predict the 11th day?

Another question: how should I use myDLM.getMean(filterType='predict')? The results are as follows: [0.0, 0.5141700157557093, 1.0505510107948781, 1.6023862753925242, 2.167438960958558, 2.7443287976659794, 3.3319788291099885, 3.9294677809751066, 4.535975127609981, 5.150754959923865], whereas with (temp_mean, temp_var)=myDLM.predict(date=9), temp_mean is 5.773, which differs from the last value, 5.150754959923865, in getMean. Can getMean(filterType='predict') give the same predicted mean as myDLM.predict? Thanks for your kind help.

Jianju

wwrechard commented 7 years ago

Hi sun137653577,

pydlm indexes the data starting from 0 instead of 1, i.e., the first day is day 0 rather than day 1, and the 10th day is day 9. So if you want the predicted value for the 11th day, you should use

(temp_obser, temp_var) = myDLM.predict(date=9)

Similarly for the second question, if you look at

myDLM.predict(date=8)

it should give you 5.15075. So to get the matched value you should do

predicted_mean = [myDLM.predict(date=i)[0][0,0] for i in range(10)]

Notice that in predicted_mean, we don't have the predicted value for the first day (day 0). This is because myDLM.predict(date=i) only predicts the next day given day i, and there is no day -1 to use for predicting day 0. In myDLM.getMean(filterType='predict'), we just use the prior value as the predicted value for day 0.
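The index offset described here can be illustrated with a toy one-step-ahead filter. This is only a sketch: `one_step_predictions` is a hypothetical local-level filter written for illustration, not pydlm's implementation, and its prior, noise, and discount values are assumptions, so its numbers will not match pydlm's output. It only shows how the list of predicted means lines up with the day-by-day forecasts.

```python
def one_step_predictions(y, prior_mean=0.0, prior_var=1e3, obs_var=1.0, discount=0.9):
    """Toy local-level filter (hypothetical, not pydlm's code).

    Returns the one-step-ahead predicted mean for every day in y, plus the
    forecast for the day right after the data ends.
    """
    mean, var = prior_mean, prior_var
    predicted = []
    for obs in y:
        # The prediction for this day only uses data from earlier days.
        pred_var = var / discount  # discounting inflates the state variance
        predicted.append(mean)
        # Standard Kalman update with this day's observation.
        gain = pred_var / (pred_var + obs_var)
        mean = mean + gain * (obs - mean)
        var = (1.0 - gain) * pred_var
    next_day = mean  # forecast for the day after the last observation
    return predicted, next_day


y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
predicted, next_day = one_step_predictions(y)

# predicted[0] is just the prior mean: there is no day -1 to predict from.
# predicted[i + 1] is the forecast made right after day i (0-indexed).
# next_day is the forecast made right after day 9, i.e. for the 11th day.
forecast_after_day_8 = one_step_predictions(y[:9])[1]
print(forecast_after_day_8 == predicted[9])  # True
```

In this toy setup, the last entry of the predicted list is the forecast made after day 8, and the forecast for the 11th day is a separate value, which mirrors the difference between the last getMean value and predict(date=9) discussed above.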

By the way, can you make sure you are using the 0.1.1.8 version? I ran the same code in your first paragraph but got values different from your second paragraph.

sun137653577 commented 7 years ago

Gotcha, many thanks for your patient explanation. Er... that code was run on the master version, which contains the function predictN.

wwrechard commented 7 years ago

I also tried the github version with the following code

from pydlm import dlm, trend, seasonality, dynamic, autoReg, longSeason
y=[1,2,3,4,5,6,7,8,9,10]
myDLM=dlm(y)
myDLM=myDLM+trend(1,discount=0.9)+seasonality(period=4)
myDLM.fit()
myDLM.getMean(filterType='predict')

This is what I got

[0.0, 0.4594594348721708, 0.9737126063727719, 1.500903030432554, 1.2226220155493328, 3.6881416524754216, 4.340895830228238, 4.652402192752849, 5.897245599083117, 7.045145225352235]

sun137653577 commented 7 years ago

I didn't install the master version since I can't install it directly, so I just copied the pydlm files into my code folder.

wwrechard commented 7 years ago

Hi, I also tried this way (download the file and use without installation) but still cannot reproduce your values. Please make sure there isn't anything else wrong with your setup.

sun137653577 commented 7 years ago

Okay. I will double check the installation package. Thanks.

wwrechard commented 7 years ago

Great, I'll close this issue. By the way, if you want to model a linear trend, it is better to set the degree to 2: degree=1 stands for a constant trend, degree=2 for a linear trend, degree=3 for a locally quadratic curve, and so on.
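To make the degree convention concrete, here is a minimal sketch of the noise-free forecast path of a polynomial trend. It assumes the textbook Jordan-block transition matrix (ones on the diagonal and superdiagonal); `trend_transition` and `forecast_path` are hypothetical helpers, and pydlm's internal matrix may be parameterized differently. The degree counts state entries, following the convention in this thread (1 = constant, 2 = linear, 3 = quadratic).

```python
def trend_transition(degree):
    """Jordan-block transition matrix with `degree` states (assumed textbook form)."""
    return [[1.0 if j in (i, i + 1) else 0.0 for j in range(degree)]
            for i in range(degree)]


def forecast_path(state, steps):
    """Propagate the state forward without noise and record the level (first entry)."""
    G = trend_transition(len(state))
    path = []
    for _ in range(steps):
        path.append(state[0])
        # Matrix-vector product G @ state, written out in plain Python.
        state = [sum(g * s for g, s in zip(row, state)) for row in G]
    return path


constant = forecast_path([5.0], 4)             # [5.0, 5.0, 5.0, 5.0]
linear = forecast_path([5.0, 1.0], 4)          # [5.0, 6.0, 7.0, 8.0]
quadratic = forecast_path([0.0, 1.0, 2.0], 4)  # [0.0, 1.0, 4.0, 9.0]
print(constant, linear, quadratic)
```

With a single state the level never moves, which is why a one-state trend cannot follow steadily increasing data like 1, 2, ..., 10 without relying on the discount factor to adapt.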

Thanks.

sun137653577 commented 7 years ago

Does that mean degree=2 is the same as dlmModPoly(order = 1), provided the other parameters are the same?

wwrechard commented 7 years ago

Right. I think I will change the terminology and the indexing in the next version to avoid any confusion.

sun137653577 commented 7 years ago

thanks.

akuppam commented 6 years ago

Hi @wwrechard, I am running pydlm for time series with regressors. I have daily data for about 1000 days (from 2016/1/1 to 2018/9/30). I was able to build the model with regressors and check its performance by predicting with model.plotPredictN(N=n, date=d). However, it looks like (n+d) must be <= (total number of samples).

How can I use the model to predict 200, 400, 800, or 1000 days ahead? That is, from 2018/10/1 onwards through the end of 2019 and 2020.

Thanks.