JestonBlu / Unemployment

Masters Project: Forecasting Unemployment

Modeling #3

Closed JestonBlu closed 7 years ago

JestonBlu commented 8 years ago

Use this thread to discuss modeling and forecasting

bopangpsy commented 8 years ago

ARIMA(1,2,1) looks good to me. Many people actually prefer BIC over AIC.

By the way, do we need to look around for other candidate models? I plan to do that tomorrow night, since I have another final tomorrow afternoon. Sorry for being late on this issue.

On Thu, Jul 21, 2016 at 2:23 PM, Joseph Blubaugh notifications@github.com wrote:

@trlilley12 would you mind adding the AIC/BIC results to your model script as comments? I'm going to add this to the all_final_models.r script and I want to make sure the diagnostics match up when I run it.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/JestonBlu/STAT626_PROJECT/issues/3#issuecomment-234356823, or mute the thread https://github.com/notifications/unsubscribe-auth/AKL-ejQ1j4FDsqFqMdQqCXe8smcFys2cks5qX8c1gaJpZM4Izj8R .

JestonBlu commented 8 years ago

Yes, I think we should look at one or two models outside of the current ARIMA set, maybe VAR or fractional ARIMA. Also, @trlilley12, I am unable to run your code without it erroring out, so I cannot verify your results.

bopangpsy commented 8 years ago

Sounds good. I'll try to develop a few other models.

JestonBlu commented 8 years ago

Would everyone be in favor of ditching the seasonally unadjusted data/models to help reduce the number of models we talk about in the writeup? I think it may be a little confusing going through the same steps for the 2 series. So far, from what everyone has posted, the seasonally adjusted data seems to be performing better than the unadjusted. I vote that we focus our attention on the seasonally adjusted data to try and consolidate all of our iterations. Any thoughts?

SZRoberson commented 8 years ago

I second this motion. We should keep the best models.

sheltonmath commented 8 years ago

I agree. Thank you.

bopangpsy commented 8 years ago

Sounds good to me

pakarshan commented 8 years ago

I am good too.

trlilley12 commented 8 years ago

I am glad to start doing some forecasting. I did some with the ARIMA(1, 2, 1) seasonally adjusted, no predictors. It's in the RScript "forecasting."

What other potential models are we considering? My only concern is that if we choose a model with predictors, we will have to forecast those predictors before we forecast the unemployment rate.

sheltonmath commented 8 years ago

I think that we were focusing on the model without predictors, so you are great.

sheltonmath commented 8 years ago

By the way, the comments in the scripts are very helpful.

trlilley12 commented 8 years ago

In case we go with the ARIMA(1, 2, 1) model for the seasonally adjusted data with no predictors, here are some forecast plots. I uploaded them in the Plots folder, too.

The graphs are for the h = 5, 12, and 24 step ahead forecasts. The first three were generated by sarima( ), and the last three by Arima( ). Personally, I think the last three look better. I think it's good to have a picture of the forecast in the context of all the data. I will play around with sarima( ) to see if I can adjust the default axes to accommodate all past data.

[images: sarima h 5, sarima h 12, sarima h 24, arima h 5, arima h 12, arima h 24]
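For anyone who wants to reproduce this kind of plot without the project data, here is a minimal sketch. It assumes a simulated stand-in series and uses base R's arima() and predict() rather than the sarima( ) and Arima( ) calls mentioned above; the seed, the simulation parameters, and the object names (`y`, `fit`, `forecasts`) are all illustrative.

```r
# Simulated stand-in for the seasonally adjusted unemployment series
# (hypothetical data; the project's series is not reproduced here).
set.seed(626)
sim <- arima.sim(list(order = c(1, 2, 1), ar = 0.6, ma = 0.3), n = 240, sd = 0.02)
y <- ts(5 + as.numeric(sim), start = c(1996, 1), frequency = 12)

# Fit ARIMA(1,2,1) with base R's arima(), using full maximum likelihood.
fit <- arima(y, order = c(1, 2, 1), method = "ML")

# Point forecasts and standard errors for h = 5, 12 and 24 steps ahead.
horizons <- c(5, 12, 24)
forecasts <- lapply(horizons, function(h) predict(fit, n.ahead = h))
names(forecasts) <- paste0("h", horizons)

# Plot the 24-step forecast together with all past data,
# with approximate 95% limits (forecast +/- 1.96 standard errors).
fc <- forecasts$h24
ts.plot(y, fc$pred, fc$pred + 1.96 * fc$se, fc$pred - 1.96 * fc$se,
        col = c("black", "blue", "grey", "grey"), lty = c(1, 1, 2, 2))
```

Because ts.plot() accepts series with different time spans, the history and the forecast share one axis, which gives the "forecast in the context of all the data" view.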

trlilley12 commented 8 years ago

And here is a plot of the first five forecasted values (red) along with the actual observed values (black) from 2016.

[image: sarima h 5 predicted and actual values]

trlilley12 commented 8 years ago

I looked at the FRED website where we got our data, and it looks like the unemployment rate for June 2016 has been posted at 5.1%. We could compare that to our prediction for June 2016 as well.

Here is a plot from Arima( ) that shows the predicted values through June 2016 (blue) and the observed values (black).

I put all the code for my plots in the RScripts folder and named it "forecasting plots."

[image: arima h 6 predicted and actual]
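The comparison of predicted and observed values can also be summarised numerically. The following is only a sketch on simulated data: the last 6 points of a made-up series play the role of the newly posted 2016 observations, and the names `train`, `actual`, `rmse`, and `mae` are illustrative.

```r
# Hypothetical series standing in for the unemployment data.
set.seed(1)
sim <- arima.sim(list(order = c(1, 2, 1), ar = 0.6, ma = 0.3), n = 200, sd = 0.02)
y <- ts(5 + as.numeric(sim), start = c(2000, 1), frequency = 12)

# Hold out the last 6 observations as the "actual" values the model never saw.
n <- length(y)
train <- ts(y[1:(n - 6)], start = start(y), frequency = 12)
actual <- as.numeric(y[(n - 5):n])

# Fit on the training part only and forecast 6 steps ahead.
fit <- arima(train, order = c(1, 2, 1), method = "ML")
pred <- as.numeric(predict(fit, n.ahead = 6)$pred)

# Forecast errors and two common summary measures.
errors <- actual - pred
rmse <- sqrt(mean(errors^2))
mae <- mean(abs(errors))
print(data.frame(step = 1:6, actual = actual, predicted = pred, error = errors))
print(c(RMSE = rmse, MAE = mae))
```

The same two numbers computed for each candidate model would give a simple out-of-sample comparison alongside AIC and BIC.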

sheltonmath commented 8 years ago

Very nice

pakarshan commented 7 years ago

For models without predictors, I had also attached a word document containing the forecasts. I will look into a couple more.

sheltonmath commented 7 years ago

I added Word documents into a new folder with model output. Please let me know your opinions on the final models so I can get them into the write-up.

Also, there is a lot in the literature about VAR models, so I like the idea of comparing the VAR models to the ARIMA ones.

Also, a lot of the literature discusses using new unemployment claims as a predictor for the unemployment rate, but I think that may be a little too close to the actual data. What do you think?

What about presidential cycles, that is, how unemployment has risen and dropped across administrations? Do we want to look at that too? The literature also mentions that there tend to be cycles where unemployment rises sharply and then recovers slowly. Visually, that seems to correspond with presidential changes. Do we want to see if we can model that mathematically, or is that too much?

JestonBlu commented 7 years ago

I have built a few VAR models that we can use to compare against the currently favored ARIMA models. I have also cleaned up the All_Final_Models.r script and removed all of the seasonally unadjusted data and models. I will post about that next, but here is what I have found for the VAR model. First, I think it was very fun to play with the vars package. It has a lot of functionality and many different plots that can be called.

I ended up fitting 6 models in total: VAR(1), VAR(2), and VAR(3) with no lags, and then again with all of the "xRegs" lagged at various h (see Multivariate.r for how I determined which lags to use). There is a lot of output that comes with each model, so I am only going to post one so you get the idea. You should be able to run the VAR.r script without incident if the data folder is a subdirectory of your current R working directory.

I decided to run up to a VAR(3) so that I could try to eliminate as much residual variance as possible. Sometimes in the residual ACF plots you can see significant values at lag 12 even though we are using seasonally adjusted data. You don't see this in the unemployment rate ACF plots, which is good since that's what we are most interested in. You could probably argue that VAR(1) is good enough if you only wanted to look at unemployment.

[image: var_unem_resid]

Here is a plot of the unemployment series in the best performing model by AIC: VAR(2) with lagged xregs.

[image: var2_lag_diag]

There is also forecasting functionality in the package, which is nice because, unlike an ARIMA model with xregs, you don't have to forecast the xregs separately. vars will do that for you, since all of the equations are essentially AR(p) models that only use lagged values to forecast.

[image: var2_fcst]
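To illustrate why no separate forecast of the predictors is needed, here is a minimal hand-rolled VAR(1) sketch. It does not use the vars package; it assumes two simulated series with made-up names (`unem_rate`, `retail_sales`) and fits each equation by ordinary least squares, then feeds each forecast back in as the next lagged value.

```r
# Simulate a stationary two-variable VAR(1) (hypothetical data and coefficients).
set.seed(42)
n <- 200
A_true <- matrix(c(0.7, 0.1,
                   0.2, 0.5), nrow = 2, byrow = TRUE)
Y <- matrix(0, n, 2, dimnames = list(NULL, c("unem_rate", "retail_sales")))
for (t in 2:n) Y[t, ] <- A_true %*% Y[t - 1, ] + rnorm(2, sd = 0.1)

# VAR(1): regress Y_t on an intercept and Y_{t-1}, one OLS fit per equation.
X <- cbind(const = 1, Y[-n, ])               # intercept and lagged values
Z <- Y[-1, ]                                 # current values
B <- solve(crossprod(X), crossprod(X, Z))    # rows: regressors, columns: equations

# Iterate forward: every variable is forecast jointly, and each forecast
# becomes the lagged input for the next step.
h <- 12
fc <- matrix(NA_real_, h, 2, dimnames = list(NULL, colnames(Y)))
last <- Y[n, ]
for (i in seq_len(h)) {
  last <- drop(c(1, last) %*% B)
  fc[i, ] <- last
}
print(round(fc, 3))
```

Every column of `fc` is produced by the same recursion, so the "predictors" are forecast as a by-product of forecasting unemployment.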

bopangpsy commented 7 years ago

I also built a few VAR models. Using VARselect, BIC suggests VAR(1) and HQ suggests VAR(2). The VAR(1) results show that, besides unem_rate_sa.l1, only retail_sales_sa.l1 and recession_ind.l1 were significant predictors. I checked the correlations among these predictors and found that the variables industrial_production, manufacturers_new_orders, house_price_sa, construction_spend, and retail_sales are highly correlated.

[image]

It might be reasonable to leave out some highly correlated variables. Thus, I then fitted two models with only unem_rate, retail_sales, and recession_ind. Here are the AICs and BICs.
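A quick way to screen for highly correlated predictors is to list the pairs above some cutoff. This is a sketch on made-up data: the column names echo the variables above, but the values, the shared trend, and the 0.9 threshold are all assumptions chosen for illustration.

```r
# Hypothetical predictors that share a common trend, plus a 0/1 indicator.
set.seed(11)
n <- 120
trend <- seq(0, 10, length.out = n)
X <- data.frame(
  industrial_production = trend + rnorm(n, sd = 0.5),
  retail_sales          = trend + rnorm(n, sd = 0.5),
  construction_spend    = trend + rnorm(n, sd = 0.5),
  recession_ind         = as.integer(seq_len(n) %in% 40:55)
)

# Correlation matrix, then the pairs (upper triangle only) with |r| > 0.9.
R <- cor(X)
high <- which(abs(R) > 0.9 & upper.tri(R), arr.ind = TRUE)
pairs_tbl <- data.frame(
  var1 = rownames(R)[high[, "row"]],
  var2 = colnames(R)[high[, "col"]],
  r    = R[high]
)
print(pairs_tbl)
```

Dropping one variable from each flagged pair is one simple way to arrive at a reduced model like the three-variable one described above.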

AIC(M1$varresult$unem_rate_sa) # -253.317
AIC(M2$varresult$unem_rate_sa) # -252.6457
AIC(M3$varresult$unem_rate_sa) # -247.1147
AIC(M4$varresult$unem_rate_sa) # -251.6351

BIC(M1$varresult$unem_rate_sa) # -217.1493
BIC(M2$varresult$unem_rate_sa) # -191.2225
BIC(M3$varresult$unem_rate_sa) # -225.414
BIC(M4$varresult$unem_rate_sa) # -219.117

AICs suggest the original VAR(1) model. The BICs suggest the VAR(1) with only three variables.
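The disagreement between AIC and BIC comes from the size of the penalty per parameter: 2 for AIC and log(n) for BIC. The sketch below shows this on simulated data, treating a single VAR equation as an ordinary lm() fit, which is an assumption about what sits behind objects like `M1$varresult$unem_rate_sa`; the variable names and coefficients are made up.

```r
# Hypothetical data: two useful lagged predictors and four irrelevant ones.
set.seed(7)
n <- 150
d <- data.frame(unem_lag = rnorm(n), retail_lag = rnorm(n),
                x3 = rnorm(n), x4 = rnorm(n), x5 = rnorm(n), x6 = rnorm(n))
d$unem <- 0.8 * d$unem_lag - 0.3 * d$retail_lag + rnorm(n, sd = 0.5)

small <- lm(unem ~ unem_lag + retail_lag, data = d)   # 3 coefficients
large <- lm(unem ~ ., data = d)                       # 7 coefficients

# AIC and BIC by hand for the small model; k counts the error variance too.
k <- attr(logLik(small), "df")
ll <- as.numeric(logLik(small))
manual_aic <- -2 * ll + 2 * k
manual_bic <- -2 * ll + log(n) * k

comparison <- data.frame(
  model = c("small", "large"),
  AIC = c(AIC(small), AIC(large)),
  BIC = c(BIC(small), BIC(large))
)
print(comparison)
```

With n = 150, log(n) is about 5, so each extra parameter costs roughly 2.5 times more under BIC than under AIC, which is why BIC leans toward the smaller model.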

[image]

The CCF plots look pretty reasonable. The ACFs for retail sales and the recession indicator signal some issues. Because of the ACF problems, the formal test is rejected, as expected.

I noticed the models that @JestonBlu built did not include the recession index. It is a dummy variable, so @JestonBlu may be right that it is not appropriate to put it into a VAR model. But including the variable improves the model fit quite a bit. Any suggestions on this point?

bopangpsy commented 7 years ago

I put the code I worked with under RScripts/var_2.

JestonBlu commented 7 years ago

Yeah, I am not sure how appropriate it is to include the recession indicator, but it is very interesting that it improved AIC that much. I will add it to my version as well, since I am probably using different lags for all of the variables, and we will see how it shakes out. Either way, I will add what you have done to All_Final_Models.r and then we can decide as a group which to mention in the write-up. I'm finalizing some tables right now that compare all of the best performing models everyone has submitted. I will post the results for discussion shortly.

JestonBlu commented 7 years ago

One point, though, that I read about: since VARs do not require the data to be stationary, maybe it is okay to include it. Has anyone come across anything in the literature that might have looked at this?

bopangpsy commented 7 years ago

Thanks! This issue might need some discussion. By the way, I actually prefer model 3 among the set I proposed. It has the smallest BIC and is really simple (two leading variables and 1 lag).

I also saw some problems in the ACF plots. I tried fitting stationary data by differencing, but that didn't help much and hurt the model fit in terms of AIC and BIC. Any suggestions for exploring this issue further would be appreciated.
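As a small illustration of what differencing does to the ACF, here is a sketch on a simulated random walk (not the project data; the names are illustrative). One caveat that may be relevant here: AIC and BIC from a model fitted to differenced data are not directly comparable with those from a model fitted to levels, because the two likelihoods describe different responses.

```r
# Hypothetical non-stationary series: a random walk.
set.seed(3)
y <- ts(cumsum(rnorm(200)))
dy <- diff(y)                      # first difference

# Sample autocorrelations at lags 1..12, without drawing the plots.
acf_levels <- acf(y, lag.max = 12, plot = FALSE)$acf[-1]
acf_diff   <- acf(dy, lag.max = 12, plot = FALSE)$acf[-1]

# Approximate 95% bound for a white-noise ACF.
bound <- 1.96 / sqrt(length(dy))

print(round(rbind(levels = acf_levels, differenced = acf_diff), 2))
cat("Lags outside the bound after differencing:", sum(abs(acf_diff) > bound), "\n")
```

The same check applied to the residuals of each VAR equation would show whether the remaining autocorrelation sits at a few specific lags or is spread across many.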

JestonBlu commented 7 years ago

Okay, I have compiled all of the models we have considered into the All_Final_Models.r script. So far we have 2 model types: ARIMA and VAR. I do not think we should actually talk about or show diagnostic plots for all of these models. Maybe we should just focus on the top 2 in the 3rd table, but I do think we should perhaps show tables of all of the models we considered.

Model Comparisons

Comparing ARIMA Models

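| Model | Order | Xregs | Lag.Xregs | AIC | BIC | Best |
|---|---|---|---|---|---|---|
| Mdl.1 | 1,2,1 | | | -212.2957 | -201.4563 | Best BIC |
| Mdl.2 | 2,2,2 | | | -211.8094 | -193.7438 | |
| Mdl.3 | 3,2,3 | | | -215.4772 | -190.1853 | |
| Mdl.4 | 1,2,1 | Y | | -211.5564 | -182.6514 | |
| Mdl.5 | 2,2,2 | Y | | -209.8342 | -177.3160 | |
| Mdl.6 | 3,2,3 | Y | | -215.0983 | -171.7408 | |
| Mdl.7 | 1,2,1 | | Y | -222.4520 | -193.6943 | Best AIC |
| Mdl.8 | 2,2,2 | | Y | -220.7001 | -188.3477 | |
| Mdl.9 | 3,2,3 | | Y | -217.8920 | -174.7555 | |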

Comparing VAR Models

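| Model | P | Lag.Xregs | Recession.Ind | AIC | BIC | Best |
|---|---|---|---|---|---|---|
| Mdl.1 | 1 | | | -226.3472 | -193.7962 | |
| Mdl.2 | 2 | | | -219.2293 | -165.0324 | |
| Mdl.3 | 1 | | Y | -253.3170 | -217.1493 | Best BIC/AIC |
| Mdl.4 | 1 | Y | | -220.6678 | -188.2820 | |
| Mdl.5 | 2 | Y | | -235.4387 | -181.5180 | |
| Mdl.6 | 1 | Y | Y | -243.8699 | -207.8857 | |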

Best Models from both model sets

| Model | Lag.XRegs | Reccession | AIC | BIC | Best |
|---|---|---|---|---|---|
| ARIMA(1,2,1) | | | -212.29 | -201.45 | |
| ARIMA(1,2,1) | Y | | -222.45 | -193.69 | |
| VAR(1) | | Y | -253.31 | -217.15 | Best AIC/BIC |

Forecast Plots

The code for the plots is also saved in the All_Final_Models.r script.

5 Month Forecasts for the 2 best Models

Since we decomposed and adjusted the seasonal data ourselves, it differs slightly from what you would see on the BLS website so I applied the same seasonal adjustment to the first 5 months of unemployment that came with the original data set. Overall the two plots are very similar.

[image: forecast_arima_var]
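One way to apply an existing seasonal adjustment to new months is to reuse the estimated monthly effects. The sketch below uses decompose() from base R on simulated data; whether the project used decompose() or another method is not stated here, so treat the function choice, the series, and the five "new" values as assumptions.

```r
# Hypothetical unadjusted monthly series, January 2006 to December 2015.
set.seed(5)
seasonal_true <- c(0.4, 0.3, 0.1, -0.2, -0.3, 0.0, 0.2, 0.1, -0.1, -0.3, -0.2, 0.0)
raw <- ts(6 + 0.01 * (1:120) + rep(seasonal_true, 10) + rnorm(120, sd = 0.05),
          start = c(2006, 1), frequency = 12)

# Additive decomposition; the adjusted series removes the seasonal component.
dec <- decompose(raw, type = "additive")
adjusted <- raw - dec$seasonal

# dec$figure holds the 12 estimated monthly effects, in the order of the
# series' first year (January first, because the series starts in January).
# Made-up unadjusted values for January to May of the following year:
new_raw <- c(7.65, 7.52, 7.35, 7.06, 6.95)
new_adjusted <- new_raw - dec$figure[1:5]
print(round(new_adjusted, 3))
```

Adjusting the hold-out months with the same monthly effects keeps them on the same scale as the series the models were fitted to.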

It also looks like the VAR model produced a slightly better forecast over this period; however, the confidence intervals of the two models overlap substantially.

[image: forecast_arima_var_together]

The forecasts start to look significantly different when you look at the longer term forecasts. This plot shows a 36 month forecast for the two best models. We can see how the confidence interval of the ARIMA model quickly explodes, perhaps indicating that it is not a good choice for long term forecasts.

[image: forecast_longterm2]
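A likely reason for the rapidly widening ARIMA interval is the second differencing: forecast errors of a twice-integrated model accumulate, so the standard error keeps growing with the horizon, while a stationary model's standard error levels off. A minimal sketch on simulated data (the series, parameters, and names are illustrative, and the stationary AR(1) is only a contrast case, not the project's VAR):

```r
set.seed(9)

# Twice-integrated series and an ARIMA(1,2,1) fit.
sim <- arima.sim(list(order = c(1, 2, 1), ar = 0.6, ma = 0.3), n = 240, sd = 0.02)
y <- ts(as.numeric(sim), frequency = 12)
fit_i2 <- arima(y, order = c(1, 2, 1), method = "ML")
se_i2 <- as.numeric(predict(fit_i2, n.ahead = 36)$se)

# Stationary comparison: an AR(1) series and fit.
x <- arima.sim(list(ar = 0.6), n = 240, sd = 0.02)
fit_ar <- arima(x, order = c(1, 0, 0), method = "ML")
se_ar <- as.numeric(predict(fit_ar, n.ahead = 36)$se)

# Forecast standard errors at 1, 12 and 36 steps ahead.
steps <- c(1, 12, 36)
print(round(cbind(h = steps, ARIMA_1_2_1 = se_i2[steps], AR_1 = se_ar[steps]), 4))
```

The interval half-width is about 1.96 times these standard errors, so the ARIMA(1,2,1) band at 36 months is many times wider than at 1 month.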

Note on best VAR

For the best VAR model shown in the 2nd table, all of the variables are present. The inclusion of the recession indicator significantly improves the overall fit as well as the look of the forecast plot. There are a few variables in the VAR model that do not measure as being significant. When those parameters are taken out, the long-term forecast looks a bit more aggressive, though the AIC and BIC both improve by a couple of points. I can strip them back out depending on what everyone thinks we should do. Here is a plot with the insignificant variables removed.

[image: forecast_longterm]

As far as model choice goes, I tend to favor the VAR rather than the ARIMA based on the model fit and forecast plots. The ARIMA(1,2,1) has 2 parameters, and the VAR(1) has 9 parameters (7 if we remove the insignificant variables). The inclusion of the recession indicator really helps the fit. So far I have not seen anything online that says it's inappropriate to use an indicator variable in a VAR model.

Please, everyone, weigh in on the model selection. If we elect not to use the recession indicator, then in the second table Mdl.1 has the best BIC and Mdl.5 has the best AIC. If we only use the significant variables, then the Mdl.1 VAR(1) becomes the best model, with an AIC of -225 and a BIC of -200, which is right there with the ARIMA(1,2,1), and it would have 6 parameters.

Latex of generated tables

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & Order & Xregs & Lag.Xregs & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1,2,1 &  &  & -212.30 & -201.46 & Best BIC \\ 
  2 & Mdl.2 & 2,2,2 &  &  & -211.81 & -193.74 &  \\ 
  3 & Mdl.3 & 3,2,3 &  &  & -215.48 & -190.19 &  \\ 
  4 & Mdl.4 & 1,2,1 & Y &  & -211.56 & -182.65 &  \\ 
  5 & Mdl.5 & 2,2,2 & Y &  & -209.83 & -177.32 &  \\ 
  6 & Mdl.6 & 3,2,3 & Y &  & -215.10 & -171.74 &  \\ 
  7 & Mdl.7 & 1,2,1 &  & Y & -222.45 & -193.69 & Best AIC \\ 
  8 & Mdl.8 & 2,2,2 &  & Y & -220.70 & -188.35 &  \\ 
  9 & Mdl.9 & 3,2,3 &  & Y & -217.89 & -174.76 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & P & Lag.Xregs & Recession.Ind & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1 &  &  & -226.35 & -193.80 &  \\ 
  2 & Mdl.2 & 2 &  &  & -219.23 & -165.03 &  \\ 
  3 & Mdl.3 & 1 &  & Y & -253.32 & -217.15 & Best BIC/AIC \\ 
  4 & Mdl.4 & 1 & Y &  & -220.67 & -188.28 &  \\ 
  5 & Mdl.5 & 2 & Y &  & -235.44 & -181.52 &  \\ 
  6 & Mdl.6 & 1 & Y & Y & -243.87 & -207.89 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllrrl}
  \hline
 & Model & Lag.XRegs & Reccession & AIC & BIC & Best \\ 
  \hline
1 & ARIMA(1,2,1) &  &  & -212.29 & -201.45 &  \\ 
  2 & ARIMA(1,2,1) & Y &  & -222.45 & -193.69 &  \\ 
  3 & VAR(1) &  & Y & -253.31 & -217.15 & Best AIC/BIC \\ 
   \hline
\end{tabular}
\end{table}

sheltonmath commented 7 years ago

This is very nice. I like the recession indicator. I think it is consistent with the literature. It is a way of dealing with the fact that we would expect unemployment to increase more rapidly during a recession than at other times.

From: (Montgomery et al., 1998)

"Evidently the unemployment rate has a strong tendency to move countercyclically, upward in general business slowdowns and contractions and downward in speedups and expansions.

...univariate linear models are not able to accurately represent these asymmetric cycles.

...the contraction phases in the U.S. economy tend to be shorter than the expansion phases.

It should also be noted that forecasting unemployment is much more difficult during periods when it is rapidly increasing than during more stable periods."

On Sat, Jul 23, 2016 at 1:23 PM, Joseph Blubaugh notifications@github.com wrote:

Okay, I have compiled all of the models we have considered into the All_Final_Models.r script... so far we have 2 model types ARIMA and VAR. I do not think we should actually talk about or show diagnostic plots on all of these models. Maybe just focus on the top 2 in the 3rd table, but I do think we should perhaps show tables of all of the models we considered. Model Comparisons

Comparing ARIMA Models Model Order Xregs Lag.Xregs AIC BIC Best Mdl.1 1,2,1 -212.2957 -201.4563 Best BIC Mdl.2 2,2,2 -211.8094 -193.7438 Mdl.3 3,2,3 -215.4772 -190.1853 Mdl.4 1,2,1 Y -211.5564 -182.6514 Mdl.5 2,2,2 Y -209.8342 -177.3160 Mdl.6 3,2,3 Y -215.0983 -171.7408 Mdl.7 1,2,1 Y -222.4520 -193.6943 Best AIC Mdl.8 2,2,2 Y -220.7001 -188.3477 Mdl.9 3,2,3 Y -217.8920 -174.7555

Comparing VAR Models Model P Lag.Xregs Recession.Ind AIC BIC Best Mdl.1 1 -226.3472 -193.7962 Mdl.2 2 -219.2293 -165.0324 Mdl.3 1 Y -253.3170 -217.1493 Best BIC/AIC Mdl.4 1 Y -220.6678 -188.2820 Mdl.5 2 Y -235.4387 -181.5180 Mdl.6 1 Y Y -243.8699 -207.8857

Best Models from both model sets Model Lag.XRegs Reccession AIC BIC Best ARIMA(1,2,1) -212.29 -201.45 ARIMA(1,2,1) Y -222.45 -193.69 VAR(1) Y -253.31 -217.15 Best AIC/BIC Forecast Plots

The code for the plot are also saved in the All Final_Models.r script.

5 Month Forecasts for the 2 best Models

Since we decomposed and adjusted the seasonal data ourselves, it differs slightly from what you would see on the BLS website so I applied the same seasonal adjustment to the first 5 months of unemployment that came with the original data set. Overall the two plots are very similar.

[image: forecast_arima_var] https://urldefense.proofpoint.com/v2/url?u=https-3A__cloud.githubusercontent.com_assets_3339909_17079989_7f50eb7a-2D50e6-2D11e6-2D86f9-2D2be94512cf50.png&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=dBombbLWrTfMsnz-PMDDwPElw1Pkbz0FWwrCqmhbgJA&m=8bcHnrd5FLnE-CzqsVttKx9F_djM2ndeP3rVOYcamXw&s=0XTm_2Te9TUnhhb5fHSjFAqJsPjtKoTu5iifFcyDd-g&e=

It also looks like the VAR model produced a slightly better forecast over this period, however the confidence intervals of the models overlap substantially.

[image: forecast_arima_var_together] https://urldefense.proofpoint.com/v2/url?u=https-3A__cloud.githubusercontent.com_assets_3339909_17079990_842d8874-2D50e6-2D11e6-2D99e7-2Db5836976f427.png&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=dBombbLWrTfMsnz-PMDDwPElw1Pkbz0FWwrCqmhbgJA&m=8bcHnrd5FLnE-CzqsVttKx9F_djM2ndeP3rVOYcamXw&s=KrUR8pLYwo6g9POhxKFE7Qv4exnzTK0HcF90NqpPvXE&e=

The forecasts start to look significantly different when you look at the longer term forecasts. This plot shows a 36 month forecast for the two best models. We can see how the confidence interval of the ARIMA model quickly explodes, perhaps indicating that it is not a good choice for long term forecasts.

[image: forecast_longterm2] https://urldefense.proofpoint.com/v2/url?u=https-3A__cloud.githubusercontent.com_assets_3339909_17080016_121c769a-2D50e7-2D11e6-2D9e43-2D4c45cccc7cd7.png&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=dBombbLWrTfMsnz-PMDDwPElw1Pkbz0FWwrCqmhbgJA&m=8bcHnrd5FLnE-CzqsVttKx9F_djM2ndeP3rVOYcamXw&s=B1l5nLrzVoIC3J49EAm5xqRaRMHOC-NrLepiW7SREVA&e=

Note on best VAR For the best VAR model shown in the 2nd table, all of the variables are present. The inclusion of the recession indicator significantly improves the overall fit as well as the look of the forecast plot. There are a few variables in the VAR model that do not measure as being significant. When taking those parameters out the longterm forecast looks a bit more aggressive. The AIC and BIC are both a couple points improved if you remove the insignificant variables though. I can strip them back out depending on what everyone thinks we should do. Here is a plot with the insignificant variables removed.

[image: forecast_longterm] https://urldefense.proofpoint.com/v2/url?u=https-3A__cloud.githubusercontent.com_assets_3339909_17079992_8931df00-2D50e6-2D11e6-2D99e4-2Db19ea9d08ffa.png&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=dBombbLWrTfMsnz-PMDDwPElw1Pkbz0FWwrCqmhbgJA&m=8bcHnrd5FLnE-CzqsVttKx9F_djM2ndeP3rVOYcamXw&s=5SsLViEJ5VCZo8G5GWo2fDCpYpTEg_XAi8B_NkZGz7Q&e=

As far as model choice goes, i tend to favor the VAR rather than the ARIMA based on the model fit and forecast plots. The ARIMA(1,2,1) has 2 parameters, and the VAR(1) has 9 parameters (7 if we remove the insignificant variables). The inclusion of the recession indicator really helps the fit. So far I have not seen anything online that says its inappropriate to use an indicator variable in a VAR model.

Please everyone weigh in on the model selection. If we elect not to use recession indicator then on the second table, mdl.1 is the best BIC and model 5 is the best AIC. If we only use the significant variables then the mdl.1 VAR(1) becomes the best model with an AIC of -225 and BIC of -200 which is right there with the ARIMA(1,2,1) and it would have 6 parameters.

Latex of generated tables

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & Order & Xregs & Lag.Xregs & AIC & BIC & Best \\
  \hline
1 & Mdl.1 & 1,2,1 &  &  & -212.30 & -201.46 & Best BIC \\
  2 & Mdl.2 & 2,2,2 &  &  & -211.81 & -193.74 &  \\
  3 & Mdl.3 & 3,2,3 &  &  & -215.48 & -190.19 &  \\
  4 & Mdl.4 & 1,2,1 & Y &  & -211.56 & -182.65 &  \\
  5 & Mdl.5 & 2,2,2 & Y &  & -209.83 & -177.32 &  \\
  6 & Mdl.6 & 3,2,3 & Y &  & -215.10 & -171.74 &  \\
  7 & Mdl.7 & 1,2,1 &  & Y & -222.45 & -193.69 & Best AIC \\
  8 & Mdl.8 & 2,2,2 &  & Y & -220.70 & -188.35 &  \\
  9 & Mdl.9 & 3,2,3 &  & Y & -217.89 & -174.76 &  \\
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & P & Lag.Xregs & Recession.Ind & AIC & BIC & Best \\
  \hline
1 & Mdl.1 & 1 &  &  & -226.35 & -193.80 &  \\
  2 & Mdl.2 & 2 &  &  & -219.23 & -165.03 &  \\
  3 & Mdl.3 & 1 &  & Y & -253.32 & -217.15 & Best BIC/AIC \\
  4 & Mdl.4 & 1 & Y &  & -220.67 & -188.28 &  \\
  5 & Mdl.5 & 2 & Y &  & -235.44 & -181.52 &  \\
  6 & Mdl.6 & 1 & Y & Y & -243.87 & -207.89 &  \\
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllrrl}
  \hline
 & Model & Lag.XRegs & Reccession & AIC & BIC & Best \\
  \hline
1 & ARIMA(1,2,1) &  &  & -212.29 & -201.45 &  \\
  2 & ARIMA(1,2,1) & Y &  & -222.45 & -193.69 &  \\
  3 & VAR(1) &  & Y & -253.31 & -217.15 & Best AIC/BIC \\
   \hline
\end{tabular}
\end{table}


JestonBlu commented 7 years ago

Those are good points. I like that you found some supporting references.

JestonBlu commented 7 years ago

Here are the two equations without the insignificant variables. I'm in favor of dropping the insignificant variables (IndustrialProduction, ManufacturersNewOrders, HomePrices) even though it changes the long-term forecast picture. If no one has a problem, I'm going to drop them in the code and rerun the tables. Looks to me like the VAR(1) is the way to go.

VAR(1): Unemployment_t = .935 + .0041 t + .975 Unemployment_{t-1} + .004 ConstructionSpend_{t-1} - .005 RetailSales_{t-1} + .19 RecessionIndicator_{t-1} + w_t
AIC: -256, BIC: -231

ARIMA(1,2,1): Unemployment_t = -.2021 Unemployment_{t-1} - .8078 w_{t-1} + w_t
AIC: -212, BIC: -201

SZRoberson commented 7 years ago

Even though there are more parameters, VAR(1) does seem the best. It incorporates some of our original ideas and beats everything else in AIC.

On the other hand, RetailSales and ConstructionSpend have small coefficients; do they really add much to the model?


JestonBlu commented 7 years ago

Yeah. Keep in mind they are on different scales.
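A quick way to see why a small coefficient need not mean a small effect: the coefficient's size depends on the units of the predictor. The sketch below is only illustrative. The 0.004 is the ConstructionSpend coefficient from the VAR(1) equation in this thread, while the predictor value and the change of units are hypothetical.

```python
# Only the 0.004 coefficient comes from the VAR(1) equation in this thread.
# The predictor value and the unit change below are hypothetical.
coef_per_unit = 0.004      # change in unemployment per 1 unit of the predictor
x_in_units = 250.0         # hypothetical predictor value in its original units

# Express the same predictor in thousands of units instead.
x_in_thousands = x_in_units / 1000.0
coef_per_thousand = coef_per_unit * 1000.0   # coefficient grows by the same factor

# The contribution to the fitted value is the same either way.
contribution_units = coef_per_unit * x_in_units
contribution_thousands = coef_per_thousand * x_in_thousands
print(contribution_units, contribution_thousands)
```

Comparing coefficient times a typical movement of each predictor (for example its standard deviation) is a fairer gauge of importance than comparing raw coefficients.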


trlilley12 commented 7 years ago

I think the VAR(1) is good, too. For our final discussion, do we want to just focus on one model, or were we going to discuss both? I think it might be easier just to stick with one.

SZRoberson commented 7 years ago

I remember now, yes. Now I just need to gather some talking points.


JestonBlu commented 7 years ago

I think we want to present one model ultimately, but I also think that part of the process is how we went about selecting the model we chose. Maybe mention it more in the write-up than in the final presentation. I don't know.


sheltonmath commented 7 years ago

I am working on the final write-up right now - because I think it will be easier to build the final presentation from that. I have questions about some of the data. Where did the recession indicator data come from?

Also, the draft introduction that I have so far is:

Unemployment has been a topic of concern throughout the United States in recent years. The Great Recession of 2007 was accompanied by the worst unemployment crisis seen since the 1930s (Wanberg, 2012). The results have been enduring: in 2010 the US job deficit was estimated to be over 10 million (Katz, 2010). Graduate and undergraduate college students alike are concerned about their employment prospects, wondering if their degrees will be enough to gain them a job after graduation. These worries are well-founded, as full recovery of college graduate employment rates and earnings is expected to be a slow process (Carnevale and Cheah, 2015). In these times of economic uncertainty, obtaining an income-generating position is not the guarantee it has seemed to be in generations past.

Unemployment has far-reaching consequences that extend beyond financial security. Unemployment is linked to psychological difficulties, including depression and suicide, and even physical deterioration (Wanberg, 2012; Kim and von dem Knesebeck, 2015; DeFina and Hannon, 2015). A study of Greek students found a relationship between parental unemployment and PTSD symptoms related to bullying (Kanellopoulos et al., 2014). In Nigeria, unemployment has been linked to insurgency and terrorism (Akanni, 2014). Given the impact that unemployment has on fiscal, mental, and physical health, research into unemployment patterns is an important part of developing policies to improve the welfare of the local, national, and global populace.

1.1 Goal

The purpose of our project is to examine trends in unemployment in the United States. We will focus on the years surrounding the Great Recession of 2007, 1992 to 2015. Our goal is to forecast unemployment into 2016.

I am trying to finalize the data section right now but I thought I'd share this. Am I missing anything important from the beginning or goal?

sheltonmath commented 7 years ago

I like the VAR(1) model too, but we definitely need to talk about all of them in the write-up. I have gathered all the conversations from these discussions into a file and am trying to work the process through into a more logical order.

JestonBlu commented 7 years ago

The indicator came from the National Bureau of Economic Research. Here is the citation link: http://www.nber.org/cycles/sept2010.html


sheltonmath commented 7 years ago

The VAR models in the literature have been outperforming the ARIMA models significantly. Although some of the more recent articles are using VAR to model different predictors, I still think it is good justification. For example:

(Barnichon & Garda, 2016) "Finally, the large improvements in forecasting performances were obtained with simple VAR-based forecasts of the worker flows."

(Meyer & Tasci, 2015) "So far our results indicate that the VAR model delivers the most accurate forecasts for up to 2 quarters ahead, and the FLOW-UC model presents the most potential for the farther horizons,"

sheltonmath commented 7 years ago

Thank you Joseph.

SZRoberson commented 7 years ago

I'll gather a list of points that I may want to mention and post it tomorrow morning before I return to College Station.


sheltonmath commented 7 years ago

Thank you, that would be helpful. If it helps you, here is a copy of my current draft. I haven't started typing in the model selection information yet. I will keep working on it. main.pdf

JestonBlu commented 7 years ago

Updated the VAR to not include the insignificant variables I mentioned. The plots in All_Final_Models.r will reflect this. Here are the updated tables now that those variables have been dropped. This matches the VAR equation I posted yesterday.

ARIMA Compare (no changes)

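| Model | Order | Xregs | Lag.Xregs | AIC | BIC | Best |
|---|---|---|---|---|---|---|
| Mdl.1 | 1,2,1 |  |  | -212.2957 | -201.4563 | Best BIC |
| Mdl.2 | 2,2,2 |  |  | -211.8094 | -193.7438 |  |
| Mdl.3 | 3,2,3 |  |  | -215.4772 | -190.1853 |  |
| Mdl.4 | 1,2,1 | Y |  | -211.5564 | -182.6514 |  |
| Mdl.5 | 2,2,2 | Y |  | -209.8342 | -177.3160 |  |
| Mdl.6 | 3,2,3 | Y |  | -215.0983 | -171.7408 |  |
| Mdl.7 | 1,2,1 |  | Y | -222.4520 | -193.6943 | Best AIC |
| Mdl.8 | 2,2,2 |  | Y | -220.7001 | -188.3477 |  |
| Mdl.9 | 3,2,3 |  | Y | -217.8920 | -174.7555 |  |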

Compare VAR

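| Model | P | Lag.Xregs | Recession.Ind | AIC | BIC | Best |
|---|---|---|---|---|---|---|
| Mdl.1 | 1 |  |  | -223.6686 | -201.9680 |  |
| Mdl.2 | 2 |  |  | -217.8281 | -185.3099 |  |
| Mdl.3 | 1 |  | Y | -256.7669 | -231.4495 | Best BIC/AIC |
| Mdl.4 | 1 | Y |  | -216.6464 | -195.0558 |  |
| Mdl.5 | 2 | Y |  | -212.5259 | -180.1735 |  |
| Mdl.6 | 1 | Y | Y | -245.7239 | -220.5349 |  |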

Compare Best Models

| Model | Lag.XRegs | Reccession | AIC | BIC | Best |
|---|---|---|---|---|---|
| ARIMA(1,2,1) |  |  | -212.29 | -201.45 |  |
| ARIMA(1,2,1) | Y |  | -222.45 | -193.69 |  |
| VAR(1) |  | Y | -256.76 | -231.45 | Best AIC/BIC |

Latex of the tables above

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sun Jul 24 08:35:48 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & Order & Xregs & Lag.Xregs & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1,2,1 &  &  & -212.30 & -201.46 & Best BIC \\ 
  2 & Mdl.2 & 2,2,2 &  &  & -211.81 & -193.74 &  \\ 
  3 & Mdl.3 & 3,2,3 &  &  & -215.48 & -190.19 &  \\ 
  4 & Mdl.4 & 1,2,1 & Y &  & -211.56 & -182.65 &  \\ 
  5 & Mdl.5 & 2,2,2 & Y &  & -209.83 & -177.32 &  \\ 
  6 & Mdl.6 & 3,2,3 & Y &  & -215.10 & -171.74 &  \\ 
  7 & Mdl.7 & 1,2,1 &  & Y & -222.45 & -193.69 & Best AIC \\ 
  8 & Mdl.8 & 2,2,2 &  & Y & -220.70 & -188.35 &  \\ 
  9 & Mdl.9 & 3,2,3 &  & Y & -217.89 & -174.76 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sun Jul 24 08:35:48 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & P & Lag.Xregs & Recession.Ind & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1 &  &  & -223.67 & -201.97 &  \\ 
  2 & Mdl.2 & 2 &  &  & -217.83 & -185.31 &  \\ 
  3 & Mdl.3 & 1 &  & Y & -256.77 & -231.45 & Best BIC/AIC \\ 
  4 & Mdl.4 & 1 & Y &  & -216.65 & -195.06 &  \\ 
  5 & Mdl.5 & 2 & Y &  & -212.53 & -180.17 &  \\ 
  6 & Mdl.6 & 1 & Y & Y & -245.72 & -220.53 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sun Jul 24 08:35:48 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllrrl}
  \hline
 & Model & Lag.XRegs & Reccession & AIC & BIC & Best \\ 
  \hline
1 & ARIMA(1,2,1) &  &  & -212.29 & -201.45 &  \\ 
  2 & ARIMA(1,2,1) & Y &  & -222.45 & -193.69 &  \\ 
  3 & VAR(1) &  & Y & -256.76 & -231.45 & Best AIC/BIC \\ 
   \hline
\end{tabular}
\end{table}

bopangpsy commented 7 years ago

Just want to add a bit.

The professor seems to like the idea of splitting the data into training and validation sets. We didn't split the data, but luckily we have the 5 new months of data as a validation set. From the plots it is hard to distinguish the performance of the two models, so I computed the forecast mean squared error of the two best models: 0.01505823 for ARIMA(1,2,1) and 0.009663836 for VAR(1). This quantitative measure also supports the VAR(1) model. Hope this helps a bit when we are comparing the two models.
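For reference, the forecast mean squared error is just the average squared gap between the held-out observations and the forecasts. A minimal sketch in Python follows (the project code is in R). The five "actual" values and both forecast series are hypothetical placeholders, not the project's data, and the function name is made up for the example.

```python
def mean_squared_error(actual, forecast):
    """Average of squared forecast errors over the hold-out period."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical five-month hold-out and two competing forecasts (not real data).
actual = [5.0, 4.9, 4.9, 5.0, 4.7]
forecast_a = [4.9, 4.9, 5.0, 5.0, 4.8]
forecast_b = [4.8, 4.9, 5.1, 5.0, 4.9]

mse_a = mean_squared_error(actual, forecast_a)
mse_b = mean_squared_error(actual, forecast_b)
print(mse_a, mse_b)  # the smaller value indicates the closer forecast
```

With only five validation points the comparison is noisy, so it is best read as supporting evidence alongside AIC/BIC rather than as a decisive test.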

sheltonmath commented 7 years ago

I think we are all agreed on the VAR model.

I am working on wording our online discussions and putting it into the writeup.

I am working on taking my notes and the online discussions and putting them together offline. But, here is the version that has all the group discussion notes in the appendix.

The introduction is relatively fleshed out (please let me know if you want me to add anything or if I made any mistakes.)

Draft.pdf

bopangpsy commented 7 years ago

Sure, we all like the VAR model. I was just trying to add a side note when we talk about the comparison between the ARIMA and VAR models.

Thank you for putting all together. It looks good. We just need to refine it.

sheltonmath commented 7 years ago

Still working on it.

And, I appreciate what you said about the two models. I just want to make sure that we are all on the same page. The side notes are very helpful and I added what you said into my document.


trlilley12 commented 7 years ago

This is a really good graph.

image

The ARIMA(1, 2, 1) predicts that unemployment will continue to decrease indefinitely, which we know can't be true. The VAR(1) model shows a much more accurate picture in the long run.
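The never-ending decline follows from the d = 2 differencing: once the AR and MA effects die out, the forecast of the second difference is zero, so the level forecast settles onto a straight line with whatever slope the series had at the end of the sample. The sketch below illustrates this in Python. It assumes the posted ARIMA(1,2,1) coefficients apply to the twice-differenced series with the sign convention z_t = phi z_{t-1} + theta w_{t-1} + w_t; the last three unemployment values and the last residual are hypothetical.

```python
def arima121_forecast(last_three, last_residual, phi, theta, horizon):
    """Point forecasts of the level x when the second difference z follows
    z_t = phi * z_{t-1} + theta * w_{t-1} + w_t (assumed sign convention)."""
    x_nm2, x_nm1, x_n = last_three
    slope = x_n - x_nm1                           # last first difference
    z_last = (x_n - x_nm1) - (x_nm1 - x_nm2)      # last second difference
    z = phi * z_last + theta * last_residual      # one-step forecast of z
    level = x_n
    forecasts = []
    for _ in range(horizon):
        slope += z            # integrate once: forecast of the first difference
        level += slope        # integrate twice: forecast of the level
        forecasts.append(level)
        z *= phi              # future shocks have mean zero, only the AR term remains
    return forecasts


# Coefficients are the ones posted in this thread; the starting values are hypothetical.
fc = arima121_forecast((5.2, 5.1, 5.0), 0.05, phi=-0.2021, theta=-0.8078, horizon=60)
slopes = [b - a for a, b in zip(fc, fc[1:])]
print(round(slopes[-1], 4), round(fc[-1], 2))
```

With a negative ending slope the forecast line eventually crosses zero, which is impossible for an unemployment rate, so the ARIMA(1,2,1) is better suited to short horizons.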

sheltonmath commented 7 years ago

Nice point.


sheltonmath commented 7 years ago

What everyone has done so far is great.

I have the write-up fairly complete through the discussion of the VAR models (I have not fixed the tables yet). I am working on writing up the forecasting section and just stopped by here to see if anyone had made any additional comments.

Once I get a complete first draft I will move it to overleaf in case anyone wants to make tweaks to it there. In the meantime, if you have changes that you want to make yourself, feel free to adjust them directly in the file uploaded to github.

Here is the current rendition of the first draft. draft2.pdf

SZRoberson commented 7 years ago

I take it the last section of this write up can be used for further refinements that can be done? If so, I feel that another interpretation is to include an indicator isElection or something.


JestonBlu commented 7 years ago

It's looking good so far. I remember that the professor sent out a note about needing to know the specifics of what everyone worked on. Should we just list that in the appendix or something? Not sure where that fits.

SZRoberson commented 7 years ago

Appendix, maybe.

My only worry about this last presentation is if I'll do the work justice.