Modeling - Githubissues

JestonBlu commented 8 years ago

Use this thread to discuss modeling and forecasting

JestonBlu commented 8 years ago

I added a new file to RScripts/ that imports the raw data, centers a few of the variables, and exports an .Rda file to Data/. Whoever is going to be playing with various models should be able to run the command below to import the data set with no missing information (Jan 1993 - Oct 2015). Feel free to do some of your own transformations; I just put it there to get started.

load("Data/Data_Prep.rds")

JestonBlu commented 8 years ago

Travis's proposed model from the presentation discussion:

We can basically follow the steps of Example 3.46 in the text.

I took the second difference (d = 2), as Joseph suggested, then the first seasonal difference (D = 1) with s = 12 (this is common for monthly economic data), as the book did. Below is the plot of the transformed series. It looks pretty stationary (not perfect, but adequate), and we can confirm this with the ADF test (it's cited in other time series texts, but I haven't seen it in ours yet).
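For anyone who wants to reproduce the differencing step, here is a minimal sketch in base R. It runs on a simulated monthly series, not the real unemployment data, and the names `x`, `x_d2` and `x_d2D1` are made up for illustration. An ADF test is not included because it is not in base R; `tseries::adf.test` is one commonly used implementation.

```r
# Simulated monthly series standing in for the unemployment rate
# (hypothetical data; the real series comes from the Data/ file above).
set.seed(626)
n <- 274  # Jan 1993 - Oct 2015 is 274 months
x <- ts(5 + cumsum(cumsum(rnorm(n, sd = 0.05))) +
          0.3 * sin(2 * pi * seq_len(n) / 12),
        start = c(1993, 1), frequency = 12)

# d = 2: two ordinary differences.
x_d2 <- diff(x, differences = 2)

# D = 1 with s = 12: one seasonal difference at lag 12.
x_d2D1 <- diff(x_d2, lag = 12)

# Each ordinary difference drops 1 observation; the seasonal one drops 12.
length(x_d2D1)

# Quick visual check of the transformed series.
plot(x_d2D1, main = "Second difference plus seasonal difference (s = 12)")
```

The order of the two `diff()` calls does not matter, since the differencing operators commute.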

After that, the book suggests that you examine the ACF and PACF plots.

First, the book says to look at the seasonal changes in ACF and PACF (h = 12, 24, 36, ...). These seem to indicate that the ACF trails off, and the PACF cuts off after one year (h = 12). This suggests that we let P = 1 and Q = 0.

Next, the book says to look at the ACF and PACF within only the first season (h = 1, 2, ..., 12). The PACF declines slowly, but the ACF cuts off after 1, suggesting we let p = 0, and q = 1.
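The two-step reading of the ACF and PACF described above can be sketched as follows. This is only an illustration on a simulated MA(1) series (an assumption, not our data); `band` is the usual approximate 95% white-noise band, and all variable names are hypothetical.

```r
set.seed(1)
# Hypothetical stationary series standing in for the differenced data.
y <- as.numeric(arima.sim(model = list(ma = -0.8), n = 300))

a <- acf(y,  lag.max = 48, plot = FALSE)
p <- pacf(y, lag.max = 48, plot = FALSE)

acf_vals  <- drop(a$acf)[-1]  # acf() includes lag 0, so drop it
pacf_vals <- drop(p$acf)      # pacf() starts at lag 1

seasonal <- seq(12, 48, by = 12)  # h = 12, 24, 36, 48
within   <- 1:11                  # lags inside the first season
band     <- 1.96 / sqrt(length(y))

# Seasonal lags: look for tailing off versus cutting off.
data.frame(lag = seasonal,
           acf = round(acf_vals[seasonal], 3),
           pacf = round(pacf_vals[seasonal], 3))

# Within-season lags that fall outside the band.
which(abs(acf_vals[within]) > band)
which(abs(pacf_vals[within]) > band)
```

For a real MA(1) process the ACF should leave the band only at lag 1, while the PACF decays slowly, which is the pattern that led to q = 1 here.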

When we put this all together, we get a (S)ARIMA(0, 2, 1) x (1, 1, 0) with s = 12 model. I fit that model, and here are the diagnostic plots:

And here are the parameter estimates:

Coefficients:
          ma1     sar1  constant
      -0.8322  -0.4868    0.0348
s.e.   0.0355   0.0536    6.7411

sigma^2 estimated as 0.03703: log likelihood = 54.87, aic = -101.75

Overall, I think the diagnostics look good. The standardized residual plot isn't great, but it isn't terrible. The normal Q-Q and ACF of residuals look pretty good. The p-values for Ljung-Box are not amazing, but they are at least above the line until H = 20, which I believe the book says is a decent cutoff.

Also, here is the mathematical representation of our model, which we will probably need at some point.
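A sketch of that representation, reconstructed from the fitted orders and the estimates reported above. The notation is an assumption: x_t is the unemployment series, w_t is white noise, B is the backshift operator, and the signs follow R's `arima` convention (AR polynomial 1 - ΦB^12, MA polynomial 1 + θB). The constant is left out, since its estimate (0.0348, s.e. 6.7411) is not distinguishable from zero.

```latex
% General form of SARIMA(0,2,1) x (1,1,0)_{12}
(1 - \Phi_1 B^{12})\,(1 - B)^2\,(1 - B^{12})\, x_t = (1 + \theta_1 B)\, w_t

% With the reported estimates \hat\Phi_1 = -0.4868, \hat\theta_1 = -0.8322
(1 + 0.4868\, B^{12})\,(1 - B)^2\,(1 - B^{12})\, x_t = (1 - 0.8322\, B)\, w_t,
\qquad \hat\sigma_w^2 = 0.03703
```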

Please check everything I did. I am sure I messed up somewhere. I will post my R code and plots so everyone can check. Also, we should come up with a couple of other models to test, so if you interpret the ACF and PACF differently, that is great. I am not intending this to be the final model, just a starting point. Hopefully, some of this can go toward next week's presentation.

I am happy to add things to Overleaf, but I don't know how, and I am only familiar with basic LaTeX in the context of Word's Equation Editor.

bopangpsy commented 8 years ago

Travis, thank you for building this model, which can be the foundation for our further work.

I built two additional models, and here are the results.

Looking at the diagnostic results from the model Travis proposed, I feel there might still be some non-stationarity after second-order differencing. So I also tried third-order differencing, but it didn't work well since it led to more variability. Thus I also stuck with second-order differencing.

I then looked at the ACF and PACF plots to build models. First, we can look at the seasonal pattern. The ACF seems to tail off, and the PACF cuts off at either 1 or 3. Together, the plots suggest a seasonal AR(1) or AR(3).

We can then inspect the plots at the within-season lags, h = 1, ..., 11. One reading is that the ACF cuts off after lag 1 and the PACF tails off, which indicates MA(1). Another reading is that the ACF cuts off after lag 1 and the PACF cuts off after lag 4. In this situation, the book suggests building a SARMA of orders p = 4 and q = 1. However, our professor pointed out that this is bad reasoning. I am still tempted to try this model, since I lean toward the view that the PACF cuts off after lag 4 rather than tailing off (some subjective judgment goes in here). Right now we don't know how to handle the situation where both the ACF and PACF cut off at a certain lag, other than the approach mentioned in the book. So I tried this model. In the diagnostics, as you will see, it does better on some criteria. Altogether, I propose two additional models and compare them to the one proposed by Travis.

model1 <- sarima(unem, p = 0, d = 2, q = 1, P = 1, D = 1, Q = 0, S = 12)
model2 <- sarima(unem, p = 0, d = 2, q = 1, P = 3, D = 1, Q = 0, S = 12)
model3 <- sarima(unem, p = 4, d = 2, q = 1, P = 3, D = 1, Q = 0, S = 12)

model1$AIC; model1$BIC
[1] -2.283108
[1] -3.244063

model2$AIC; model2$BIC
[1] -2.444084
[1] -3.379008

model3$AIC; model3$BIC
[1] -2.44496
[1] -3.327824

Judging by the AIC and BIC values, Models 2 and 3 perform quite similarly, and both show slight evidence of outperforming Model 1.
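If it helps with compiling everything into one script, here is a rough sketch of the same three-way comparison using only base R's `arima()`. It is fitted to the built-in `AirPassengers` data as a stand-in, not to our unemployment series, so the numbers mean nothing for the project. The `specs`, `fits` and `comparison` names are hypothetical, and `method = "ML"` is my choice here, not something the models above used. Also note that `AIC()` and `BIC()` from base R may be on a different scale than the values `sarima` reports, so only compare values that come from the same function.

```r
# Stand-in monthly series (built-in data, NOT the unemployment series).
x <- log(AirPassengers)

# Candidate orders from the thread: (p, d, q) x (P, D, Q) with s = 12.
specs <- list(
  model1 = list(order = c(0, 2, 1), seasonal = c(1, 1, 0)),
  model2 = list(order = c(0, 2, 1), seasonal = c(3, 1, 0)),
  model3 = list(order = c(4, 2, 1), seasonal = c(3, 1, 0))
)

# Fit every candidate with the same data and the same estimation method.
fits <- lapply(specs, function(s) {
  arima(x, order = s$order,
        seasonal = list(order = s$seasonal, period = 12),
        method = "ML")
})

# Collect the information criteria in one table; smaller is better.
comparison <- data.frame(
  model = names(fits),
  AIC   = sapply(fits, AIC),
  BIC   = sapply(fits, BIC),
  row.names = NULL
)
comparison[order(comparison$AIC), ]
```

BIC penalizes the extra parameters more heavily than AIC, which is why the two criteria can rank the larger models differently.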

Then we can compare the three models based on the diagnostic plots. The standardized residuals of all three models show some evidence of departing from white noise. ARMA models do not model changing variability. We will have a few lectures on this topic; there is not much we can do about it now.

The ACF of the residuals of Model 1 shows a spike at lag 24. The other two models do not show such a spike.

The normal Q-Q plots from the three models are fairly similar.

On the Q-statistic (Ljung-Box statistic), Models 1 and 2 have similar results. Model 1 seems to perform better at the first few lags, but Model 2 does better after lag 15. Model 3 clearly performs better than the other two on the Q-statistic. Since Model 3 is based on reasoning our professor does not like, we may not present it. However, it at least tells us that some models based on the idea that both the ACF and PACF cut off at certain lags might fit our data better. I am not quite sure how to handle this situation. Any thoughts on this would be highly appreciated!
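The Ljung-Box panel in the diagnostic plot can also be tabulated, which makes it easier to say at which lags a model fails. Below is a minimal sketch with base R's `Box.test()`. It again uses `AirPassengers` as a stand-in series, fits the Model 1 orders with `method = "ML"` (my assumption), and the names `fit`, `k`, `lags`, `pvals` and `lb` are made up for illustration.

```r
# Stand-in series and a fit with the Model 1 orders (not our actual data).
x <- log(AirPassengers)
fit <- arima(x, order = c(0, 2, 1),
             seasonal = list(order = c(1, 1, 0), period = 12),
             method = "ML")

k    <- length(coef(fit))  # number of estimated ARMA coefficients
lags <- (k + 1):35         # lag must exceed fitdf for positive degrees of freedom

# Ljung-Box p-value at each lag H, adjusting the degrees of freedom by k.
pvals <- sapply(lags, function(h) {
  Box.test(residuals(fit), lag = h, type = "Ljung-Box", fitdf = k)$p.value
})

lb <- data.frame(lag = lags, p.value = round(pvals, 4))

# Lags where the white-noise hypothesis is rejected at the 5% level.
subset(lb, p.value < 0.05)
```

Running the same loop for each candidate model would give a side-by-side table of the lags at which each one drops below the line.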

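Model 1 [image: image] https://cloud.githubusercontent.com/assets/10681978/16538220/1c8b6b6e-3fe3-11e6-9f3b-83f6704863b7.png

Model 2 [image: image] https://cloud.githubusercontent.com/assets/10681978/16538221/2a40d064-3fe3-11e6-8a2b-3b6f80660f1f.png

Model 3 [image: image] https://cloud.githubusercontent.com/assets/10681978/16538225/42dbdaf6-3fe3-11e6-89fc-e5f994ab5301.png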

pakarshan commented 8 years ago

After going through a bunch of models, the following model seems most appropriate, as everyone has noted:

sarima(econ[,2],0,2,1,1,1,0,12)

with the following diagnostics:

[image: Inline image 2]

The ADF test also suggests stationarity, as follows:

[image: Inline image 3]

Also, I am working on other predictor variables to develop a preliminary regression model.

Regards, AP

pakarshan commented 8 years ago

Hi,

I am trying to fit the regression model here and was wondering whether I should use the stationary data for the fit. Any help on this would be appreciated.

Thanks

JestonBlu commented 8 years ago

I believe you would need to use the stationary non-seasonal unemployment rate as your response variable. I do not believe your predictor variables have to be stationary, but it would probably make sense to at least take the seasonality out of them.

JestonBlu commented 8 years ago

Also, can you guys commit your code? If you don't know how, let me know and I will help you through it. Once everyone starts posting code with their models, I will start compiling all of them into a single script so we can easily compare them with some graphs.

pakarshan commented 8 years ago

I can provide my code by tonight or tomorrow morning, if that's okay?

JestonBlu commented 8 years ago

Yeah, no problem.

sheltonmath commented 8 years ago

Sounds good.

SZRoberson commented 8 years ago

Checking both residual plots shows that there is a drastic drop that is likely indicative of the 2008 recession. Can we add weighting or something to fix this?

JestonBlu commented 8 years ago

In the political dataset I have a monthly recession indicator variable; we could try using that.
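One way this could look is a regression with ARMA errors, passing the indicator through the `xreg` argument of base R's `arima()`. The sketch below uses simulated data only. The dummy is hypothetical (1 for Dec 2007 through Jun 2009, the NBER dates, which may not match how the indicator in the political dataset is coded), and the effect size of 2 and the AR(1) errors are arbitrary choices for the simulation.

```r
set.seed(2008)
n <- 274  # Jan 1993 - Oct 2015

# Hypothetical recession dummy: months 180 to 198 are Dec 2007 - Jun 2009.
recession <- as.numeric(seq_len(n) %in% 180:198)

# Simulated response: level 5, a shift of 2 during the recession, AR(1) errors.
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = 0.2))
y <- ts(5 + 2 * recession + noise, start = c(1993, 1), frequency = 12)

# Regression with AR(1) errors; the dummy enters as an external regressor.
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(recession = recession))

est <- coef(fit)["recession"]
se  <- sqrt(diag(fit$var.coef))["recession"]
c(estimate = unname(est), std.error = unname(se))
```

If the fitted model also includes differencing, `arima()` differences the regressors along with the response, so the dummy's coefficient is still interpreted on the original scale of the series.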

pakarshan commented 8 years ago

PFA

Regards, Akki

sheltonmath commented 8 years ago

I'll go ahead and start putting together what we have so far on Overleaf this evening, if that's OK with everyone.

pakarshan commented 8 years ago

Sounds good. Let me know if you require the model parameter details and the corresponding AIC values. I did not include those in the attachment, but I provided the code.

Regards, Akki

JestonBlu commented 8 years ago

I am still unclear about the expectations for this presentation. Does anyone know if we are actually supposed to present a model this week? It looks to me like, unless there is a class Thursday, we won't present until next Monday.

JestonBlu commented 8 years ago

@bopangpsy @pakarshan can you commit the code for your models to the /RScripts folder?

pakarshan commented 8 years ago

Sure, the code is in the Word doc I attached. I'll put it in the GitHub folder too.

JestonBlu commented 8 years ago

@bopangpsy @pakarshan okay, I'm guessing you dropped it into your local folder but didn't actually commit and push your changes. Do you know how to do this?

bopangpsy commented 8 years ago

Sure, I'll add that soon.

Regarding the expectations for the presentation, the professor has not mentioned them yet. However, in the last two lectures (14 and 15), he talked a lot about applied examples of model building. I'd assume that our presentation would be similar to what he covered in those two lectures. Basically, it's the model-building process. How do we preprocess our data to obtain a stationary process (second-order differencing plus first-order seasonal differencing in our case)? How do we identify the model (based on the ACF and PACF)? What is the set of candidate models? How do we choose the best one (AIC, BIC, diagnostics)? I guess we might not need to present a regression model at this stage, since he hasn't talked much about it. What do you guys think?

SZRoberson commented 8 years ago

"Best model," as far as I know, is pretty ambiguous right now. With what I have done before, I checked AIC and BIC (not really thinking about using R-squared for the time being). The model identification from P/ACF is outlined in the text by checking out the tail behavior to see if it decays asymptotically or cuts off. We should be checking inside the band for "cutoff" behavior.

I actually missed today's live lecture since I had an engineering final to take; I'll relay other questions to him tomorrow.

bopangpsy commented 8 years ago

Sure, it's always hard to call a model "Best". I think in presentations, we may present several potential candidate models, and compare them from several perspectives. Hopefully, one model will gain relatively more evidence.

I just uploaded my code for these preliminary models I played with.

sheltonmath commented 8 years ago

The code is good enough. Thank you.

JestonBlu commented 8 years ago

I would like to propose an additional model. I have gone through the same exercise as @trlilley12 and @bopangpsy, except that I used the seasonally adjusted unemployment rate. It looks like the performance is definitely comparable to the seasonal models. I used sarima for the nice diagnostic plot it creates, but I left the seasonal parameters out.

I committed a script here: RScripts/seasonally_adjusted.R

I get an AICc = -2.672 and BIC = -3.565

non-seasonal-diagnostic

#### Model Comparison
## 
## Model 1: {AIC: -2.617} {BIC: -3.578} *** Best BIC
## Model 2: {AIC: -2.613} {BIC: -3.495}
## Model 3: {AIC: -2.672} {BIC: -3.565} *** Best AIC
##
#### Model 3 Pvalues
##
##                             Estimate     SE  t.value Pvalue
## ar1                          -0.2176 0.0672  -3.2387   .001 ***
## ma1                          -0.8835 0.0411 -21.4938  <.001 ***
## intercept                     0.0001 0.0009   0.1447   .886
## industrial_production_sa     -0.0500 0.0132  -3.7763  <.001 ***
## manufacturers_new_orders_sa  -0.0005 0.0007  -0.6516   .523
## house_price_sa               -0.0413 0.0122  -3.3765  <.001 ***
## construction_spend_sa         0.0120 0.0067   1.7902   .091 
## retail_sales_sa               0.0027 0.0013   2.1645   .044 ***

bopangpsy commented 8 years ago

Cool, Joseph! This model is simple and performs pretty well in terms of both fitting indices and diagnostics.

JestonBlu commented 8 years ago

Thanks. One thing I am wondering about in the preliminary models you guys created is the differencing. What is the impact of taking one or two ordinary differences and then a lag-12 difference? That may make interpretation a little difficult. Do you have any references or thoughts on differencing that way? Did you try doing the lag difference first? Maybe like this:

diff(diff(unem, lag = 12), differences = 2)
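
On the ordering question: the ordinary difference and the seasonal difference are both linear filters in the backshift operator, so they commute and either order should give the same series. A minimal sketch to check this numerically, using a simulated series as a stand-in for `unem` (the simulated data is an assumption, not the project data):

```r
# Simulated monthly series standing in for the unemployment rate (assumption).
set.seed(626)
x <- ts(cumsum(cumsum(rnorm(274, sd = 0.1))), start = c(1993, 1), frequency = 12)

# Seasonal (lag-12) difference first, then two ordinary differences.
seasonal_first <- diff(diff(x, lag = 12), differences = 2)

# Two ordinary differences first, then the seasonal difference.
ordinary_first <- diff(diff(x, differences = 2), lag = 12)

# Both orderings drop 14 observations and agree up to floating-point error.
length(seasonal_first) == length(x) - 14
all.equal(as.numeric(seasonal_first), as.numeric(ordinary_first))
```

So the order of the two `diff()` calls affects readability of the code, not the transformed series.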

pakarshan commented 8 years ago

I have also used the seasonally adjusted unemployment rate while considering the models. I have also posted my script on GitHub.

JestonBlu commented 8 years ago

@pakarshan I'm not seeing your code. You have to commit and push for it to show up for the rest of us. If you have done that, can you specify which script file you are referring to?

pakarshan commented 8 years ago

Model Fitting.R is the name. I put it online using the upload feature.

JestonBlu commented 8 years ago

Okay, I see it. Can you explain the thought behind fitting a seasonal parameter to the seasonally adjusted data? I just switched it to 0, but it looks like it doesn't make a difference in the output. Also, using the additional variables looks like it does improve the model slightly. Did you try playing with the lags to see if any of the explanatory variables can be used as leading variables? As a side note, in my last commit I added a recession indicator. If you use load("Data/data_prep.rda") then you shouldn't have to do all of the data prep in your first several steps.

JestonBlu commented 8 years ago

I have created a script RScripts/All_Final_Models.R to combine everyone's currently proposed models into a single place. I grouped them by seasonal vs seasonally adjusted data and created this table to show the model differences and relative performance. I also have the LaTeX equivalent pasted below in case we want to put that into beamer (hopefully it's compatible). I would still like to see us play with the additional variables a bit and see if we can find the appropriate lags to improve the models further, since sarima allows you to easily include them.

Please take a look and let me know what you think. There are a few plots in the code which we can use for the presentation, but feel free to add more if you think we are missing something. We do probably need a few more.

Data	Model	Order	Seasonal.Order	Xregs	AIC	BIC
Unem	Mdl.1	0,2,1	1,1,0	N	-2.274336	-3.234984
Unem	Mdl.2	0,2,1	3,1,0	N	-2.435558	-3.369972
Unem	Mdl.3	4,2,1	3,1,0	N	-2.437146	-3.319090
Unem.sa	Mdl.4	0,2,1	1,0,0	N	-2.606460	-3.580226
Unem.sa	Mdl.5	1,2,1	NA	N	-2.625197	-3.598962
Unem.sa	Mdl.6	0,2,1	1,0,0	Y	-2.576905	-3.485083
Unem.sa	Mdl.7	1,2,1	NA	Y	-2.595392	-3.490453

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Thu Jul  7 17:22:39 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllllrr}
  \hline
 & Data & Model & Order & Seasonal.Order & Xregs & AIC & BIC \\ 
  \hline
1 & Unem & Mdl.1 & 0,2,1 & 1,1,0 & N & -2.27 & -3.23 \\ 
  2 & Unem & Mdl.2 & 0,2,1 & 3,1,0 & N & -2.44 & -3.37 \\ 
  3 & Unem & Mdl.3 & 4,2,1 & 3,1,0 & N & -2.44 & -3.32 \\ 
  4 & Unem.sa & Mdl.4 & 0,2,1 & 1,0,0 & N & -2.61 & -3.58 \\ 
  5 & Unem.sa & Mdl.5 & 1,2,1 &  & N & -2.63 & -3.60 \\ 
  6 & Unem.sa & Mdl.6 & 0,2,1 & 1,0,0 & Y & -2.58 & -3.49 \\ 
  7 & Unem.sa & Mdl.7 & 1,2,1 &  & Y & -2.60 & -3.49 \\ 
   \hline
\end{tabular}
\end{table}
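
For anyone rebuilding a comparison table like this one, here is a minimal sketch using base R's `arima()` on the built-in `AirPassengers` data rather than the project data. The model names, the candidate orders, and the column layout are illustrative assumptions. Note that `AIC()` and `BIC()` on an `arima` fit may not be on the same scale as the values reported by `sarima`, so the numbers are only comparable within one table.

```r
# Built-in monthly series used as a stand-in for the project data (assumption).
y <- log(AirPassengers)

# Candidate specifications: non-seasonal order and seasonal order (period 12).
specs <- list(
  Mdl.A = list(order = c(0, 1, 1), seasonal = c(0, 1, 1)),
  Mdl.B = list(order = c(1, 1, 0), seasonal = c(0, 1, 1)),
  Mdl.C = list(order = c(1, 1, 1), seasonal = c(0, 1, 0))
)

# Fit each model and collect its information criteria in one row.
rows <- lapply(names(specs), function(nm) {
  s <- specs[[nm]]
  fit <- arima(y, order = s$order,
               seasonal = list(order = s$seasonal, period = 12))
  data.frame(Model = nm,
             Order = paste(s$order, collapse = ","),
             Seasonal.Order = paste(s$seasonal, collapse = ","),
             AIC = AIC(fit),
             BIC = BIC(fit))
})
comparison <- do.call(rbind, rows)

# Sort so the lowest AIC is listed first.
comparison <- comparison[order(comparison$AIC), ]
print(comparison, row.names = FALSE)
```

The resulting data frame can be passed to `xtable` in the same way as the table above if a LaTeX version is needed.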

SZRoberson commented 8 years ago

We can probably stop here for now. None of the groups has considered seasonal adjustments yet, and 7 models seems a bit much at this stage.

I think I'll probably talk about the three best based on AIC, but I want to be sure that's okay before I write presentation notes for myself.

pakarshan commented 8 years ago

Looks great, guys!

bopangpsy commented 8 years ago

This looks great! Thank you for putting this together!

sheltonmath commented 8 years ago

I put everything done so far into a minimal pdf but with all 7 models. Please let me know what you want changed. https://www.overleaf.com/5646560qcxtqg

Here is a newer version. I like this layout better: https://www.overleaf.com/5654811dmsqbs

sheltonmath commented 8 years ago

I have been messing around with the conversations from these discussions and trying to put them into a document. I will put it into editable format for everyone else as soon as I can. (I have somewhere to be this evening or I would do it now.) In the meantime, you can send me things to add into the document in any format - word, email, text....

I don't mind compiling anything you want included.

main.article.pdf

JestonBlu commented 7 years ago

I have been working a bit more on fitting an ARIMA model with regressors to the seasonally adjusted data. I believe I fixed the issue we were having with the xregs (they needed to be stationary as well). I also lagged the xregs based on the cross-correlation and lag plots, and the model looks improved by the AIC measure. It also looks like a few of the xregs are leading indicators of unemployment. The code is in RScripts/multivariate if you want to play with it. I think I will add this one to the All_Final_Models.r script soon if no one makes improvements on it.

multivariate_sarima

#### Model Comparison
## 
## Model 1: {AIC: -2.617} {BIC: -3.578} *** Best BIC
## Model 2: {AIC: -2.613} {BIC: -3.495}
## Model 3: {AIC: -2.672} {BIC: -3.565} *** Best AIC
##
#### Model 3 Pvalues
##
##                             Estimate     SE  t.value Pvalue
## ar1                          -0.2176 0.0672  -3.2387   .001 ***
## ma1                          -0.8835 0.0411 -21.4938  <.001 ***
## intercept                     0.0001 0.0009   0.1447   .886
## industrial_production_sa     -0.0500 0.0132  -3.7763  <.001 ***
## manufacturers_new_orders_sa  -0.0005 0.0007  -0.6516   .523
## house_price_sa               -0.0413 0.0122  -3.3765  <.001 ***
## construction_spend_sa         0.0120 0.0067   1.7902   .091 
## retail_sales_sa               0.0027 0.0013   2.1645   .044 ***
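
The lag-selection step described in this comment can be sketched as follows. This is a simulated example, not the code in RScripts/multivariate: the series `x` and `y`, the 3-period lead, and the AR(1) error structure are all assumptions chosen for illustration. `ccf(x, y)` estimates the correlation between `x[t + k]` and `y[t]`, so a peak at a negative `k` indicates that `x` leads `y`. With real data, both series would first be differenced to stationarity.

```r
set.seed(626)
n <- 300

# Hypothetical stationary regressor and a response that reacts to it 3 periods later.
x <- rnorm(n)
noise <- as.numeric(arima.sim(list(ar = 0.5), n = n, sd = 0.5))
y <- numeric(n)
y[1:3] <- noise[1:3]
y[4:n] <- 0.8 * x[1:(n - 3)] + noise[4:n]

# Cross-correlation: pick the lag with the largest absolute correlation.
cc <- ccf(x, y, lag.max = 12, plot = FALSE)
best_lag <- cc$lag[which.max(abs(cc$acf))]

# Shift x by the detected lead and drop the rows lost to the shift.
k <- -best_lag  # positive when x leads y
y_aligned <- y[(k + 1):n]
x_lagged <- x[1:(n - k)]

# Regression with AR(1) errors; the xreg coefficient estimates the lagged effect.
fit <- arima(y_aligned, order = c(1, 0, 0), xreg = cbind(x_lagged = x_lagged))
coef(fit)
```

Each regressor can get its own lag this way, which matches the idea of using different lags for different xregs.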

sheltonmath commented 7 years ago

I am working on citations in the writeup and will upload it soon. Do you want me to add this into this week's paper or keep it for next time?

JestonBlu commented 7 years ago

If you have time, let's replace the other seasonally adjusted models with xregs; otherwise we can wait.

sheltonmath commented 7 years ago

I can make it happen.

sheltonmath commented 7 years ago

I emailed what I have so far to everyone. I will proofread in the morning. If all is good I will send this one in and keep revising for the next round. Group4Presentation2.pdf

pakarshan commented 7 years ago

For the xregs, I already used the seasonally adjusted stationary xregs.

JestonBlu commented 7 years ago

Right, but what I just proposed was lagging the xregs, which you did not do. Also, we had some differences in our differencing and model parameter choices. Our model diagnostics are different as well. It looks like a lot of the p-values in your Ljung-Box statistic were significant, suggesting error dependence.
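
For reference, the Ljung-Box check can also be run directly with base R's `Box.test()`. A minimal sketch on a simulated ARMA(1,1) series (the data, the coefficients, and the choice of 20 lags are assumptions for illustration); `fitdf` is set to the number of estimated ARMA coefficients so the degrees of freedom are adjusted accordingly:

```r
set.seed(626)

# Simulated ARMA(1,1) series standing in for a differenced unemployment series.
z <- arima.sim(list(ar = -0.2, ma = -0.6), n = 270)

# Fit the matching ARMA(1,1) and extract the residuals.
fit <- arima(z, order = c(1, 0, 1))
res <- residuals(fit)

# Ljung-Box test at lag 20, with p + q = 2 degrees of freedom removed.
lb <- Box.test(res, lag = 20, type = "Ljung-Box", fitdf = 2)
lb

# A small p-value would point to autocorrelation left in the residuals.
lb$p.value
```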

trlilley12 commented 7 years ago

Of the models we have discussed so far, I think the ARIMA(1, 2, 1) is best. It had the best diagnostics and the lowest AIC.

I added some predictors to the ARIMA(1, 2, 1), and only retail seemed significant. However, its coefficient is so small that I argue we don't need it.

I then did some forecasting for the ARIMA(1, 2, 1) as well as two ARIMA(1, 2, 1) models with predictors. I then compared our predicted values for 2016 unemployment with the actual values:

Jan 2016: actual = 5.3, predicted = 5.0
Feb 2016: actual = 5.2, predicted = 5.0
Mar 2016: actual = 5.1, predicted = 4.9
Apr 2016: actual = 4.7, predicted = 4.9
May 2016: actual = 4.5, predicted = 4.9
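
To put a number on how close these forecasts are, the five actual/predicted pairs can be summarized with a few standard error measures. This is a small worked example on the values as posted; the variable names are illustrative.

```r
# Actual and predicted 2016 unemployment rates, Jan through May, as listed above.
actual    <- c(Jan = 5.3, Feb = 5.2, Mar = 5.1, Apr = 4.7, May = 4.5)
predicted <- c(Jan = 5.0, Feb = 5.0, Mar = 4.9, Apr = 4.9, May = 4.9)

# Forecast errors (actual minus predicted).
err <- actual - predicted

# Mean error (bias), mean absolute error, and root mean squared error.
me   <- mean(err)
mae  <- mean(abs(err))
rmse <- sqrt(mean(err^2))

round(c(ME = me, MAE = mae, RMSE = rmse), 3)
```

The mean absolute error works out to 0.26 percentage points. The errors are positive for January through March and negative for April and May, so the forecast runs low early and high later.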

Overall, I think the ARIMA(1, 2, 1) is very good.

I uploaded all of my code as "forecasting 7_21_16".

JestonBlu commented 7 years ago

@trlilley12 did you see the model I posted that was also an ARIMA(1,2,1)? I added some xregs with different lags, and in addition to retail, industrial production and house price also measure as significant. The script is in Rscripts/multivariate.R.

trlilley12 commented 7 years ago

Oh, okay that looks like it lowers the AIC. Did you try the ARIMA(1, 2, 1) with different lags for retail, ipi, and house price (excluding the others)? The AIC might be even lower.

If your model is better, we can definitely go with it.

JestonBlu commented 7 years ago

@trlilley12 yeah, I experimented with different lags and that seemed to be what worked best... when they were all the same lag it didn't look as good.

trlilley12 commented 7 years ago

Okay, I prefer the simpler ARIMA(1, 2, 1) with no predictors, since Dr. P prefers simpler models. It had the lowest BIC as well. Can everyone vote on it?

sheltonmath commented 7 years ago

Sounds good to me!

JestonBlu commented 7 years ago

@trlilley12 would you mind adding the AIC/BIC results to your model script as comments? I'm going to add this to the all_final_models.r script and I want to make sure the diagnostics match up when I run it.
