Open cbrnr opened 8 years ago
What I don't like about these measures is that you have to choose one :) For example, I have absolutely no idea when I should use AIC and when BIC instead. Of course, you could argue that using CV is also a choice, but I arrogantly believe that it is most directly related to the problem. (Btw, are we really using GCV, and not simply CV?)
Nevertheless, providing alternative measures makes sense for the sake of choice. Performance is likely to benefit from these, too. Which measure should be the default?
The method names could indeed be improved.
In the beginning, there was only VAR.optimize. Then there was a need for an additional optimize_order. For consistency, I renamed optimize to optimize_delta_bisection, but kept the old name around to be backwards compatible. At least, that's how I believe I remember things :)
What I don't like about these measures is that you have to choose one :) For example, I have absolutely no idea when I should use AIC and when BIC instead. Of course, you could argue that using CV is also a choice, but I arrogantly believe that it is most directly related to the problem.
Right, I totally agree. But we don't have to decide which measure to pick, we just provide the options.
(Btw, are we really using GCV, and not simply CV?)
No idea - you implemented it!
Nevertheless, providing alternative measures makes sense for the sake of choice. Performance is likely to benefit from these, too. Which measure should be the default?
That's a good question (and relates to the method names). Like in the case of the whiteness test, where we actually have three tests, I'd prefer to have three separate functions/methods (remember the Unix philosophy?). For convenience, we could then have one wrapper function/method that calls all related methods, possibly with one default (and the rest optional). I don't know. Or we could have one method that accepts the measure as an argument.
Do we need to choose a default model order criterion? I guess this depends on how we implement it. If we need a default, I'd go with AIC, probably because it's very popular (which does not mean it's the best).
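For reference, the four criteria discussed here (AIC, BIC, HQ, FPE) have closed-form expressions based on the residual covariance of a fitted VAR model, following Lütkepohl. A rough sketch; the function name and signature are made up for illustration, not scot's actual API:

```python
import numpy as np

def var_criteria(resid_cov, n_obs, order, n_channels):
    """Model order selection criteria for a VAR model.

    resid_cov  : estimated residual covariance matrix (K x K)
    n_obs      : number of observations T
    order      : model order p
    n_channels : number of channels/variables K
    """
    logdet = np.log(np.linalg.det(resid_cov))
    n_params = order * n_channels**2  # number of free VAR coefficients
    aic = logdet + 2.0 / n_obs * n_params
    bic = logdet + np.log(n_obs) / n_obs * n_params
    hq = logdet + 2.0 * np.log(np.log(n_obs)) / n_obs * n_params
    fpe = ((n_obs + order * n_channels + 1) /
           (n_obs - order * n_channels - 1)) ** n_channels \
          * np.linalg.det(resid_cov)
    return {'aic': aic, 'bic': bic, 'hq': hq, 'fpe': fpe}
```

Note that all four only need a single model fit per candidate order, which is why they should be much faster than cross-validation.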
No idea - you implemented it!
Did I call it GCV?
I'd prefer to have three separate functions/methods
I agree.
Do we need to choose a default model order criterion?
In some way we do. Even if we just provide different functions we have to document when/how to use them.
Did I call it GCV?
Oops. Getting confused with all the "generalized" stuff out there (MSGE).
I'd prefer to have three separate functions/methods I agree.
And a wrapper method to wrap them all?
In some way we do. Even if we just provide different functions we have to document when/how to use them.
Right. We can list some properties of the criteria, which could provide the basis for our recommendation of a default criterion. Lütkepohl should have all that stuff, but I don't have the book anymore.
However, referring to #131, it looks like even if we implement AIC/BIC/FPE..., we still need to do this within a CV, right? So these procedures wouldn't be any faster. Or are they designed to work on the MSE (of the whole data) because they include a penalty term for the number of parameters?
And a wrapper method to wrap them all?
Wrapping is good, but then we have to choose a default :)
Or are they designed to work on the MSE (of the whole data) because they include a penalty term for the number of parameters?
Exactly, but for the same reason they do not work well with regularization.
Wrapping is good, but then we have to choose a default :)
Or the wrapper computes all criteria and returns a list/dict of results.
Sounds hellish - at least if performance is an issue :) I would at least provide an option to select one or some criteria.
optimize_order(bic=True, aic=True, fpe=True, hq=True)? Or optimize_order(['bic', 'aic', 'fpe', 'hq'])? I don't know which is better.
Btw, I would not call this function optimize_order if it does not actively change the model order, which it can't do because it doesn't know which of the multiple measures to use.
or optimize_order(['bic', 'aic', 'fpe', 'hq'])? I don't know which is better.
I kind of like the variant with several arguments better than the one with a list. I don't trust functions that require mutable input arguments :-).
Do you have a suggestion for a better name? I don't think the name implies that anything is changed. It just optimizes the model order with different criteria.
I kind of like the variant with several arguments better than the one with a list. I don't trust functions that require mutable input arguments :-)
Tuple! :p
I think I prefer optimize_order(bic=True, aic=True, fpe=True, hq=True) because you can also pass a **dict. It would be nice if it was possible to just pass the arguments without the =True to enable them (R-style)... but never mind, it's Python and that's not possible.
Do you have a suggestion for a better name? I don't think the name implies that anything is changed. It just optimizes the model order with different criteria.
At least in the current API this is implied by the name because all optimize_* functions behave like that. I would like to keep it that way, because writing var.optimize(data) translates to literally "optimize this model for the data". In contrast, the proposed function would literally do a "give me the selection criteria for this model". What about something like get_criteria?
Tuple! :p
Well played :-).
I think I prefer optimize_order(bic=True, aic=True, fpe=True, hq=True) because you can also pass a **dict. It would be nice if it was possible to just pass the arguments without the =True to enable them (R-style)... but never mind, it's Python and that's not possible.
OK, so let's go for this interface then. Regarding R-style, in theory you could do
bic, aic, fpe, hq = (True,) * 4
optimize_order(bic, aic, fpe, hq)
There you go! :-)
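For illustration, the **dict variant mentioned above could look like this; the function body is just a made-up stub that reports which criteria are enabled:

```python
def optimize_order(bic=True, aic=True, fpe=True, hq=True):
    # hypothetical interface sketch: return the names of enabled criteria
    flags = [('bic', bic), ('aic', aic), ('fpe', fpe), ('hq', hq)]
    return {name: flag for name, flag in flags if flag}

# enable only a subset via keyword arguments ...
optimize_order(bic=True, aic=False, fpe=False, hq=False)

# ... or equivalently by unpacking a dict
settings = {'bic': True, 'aic': False, 'fpe': False, 'hq': False}
optimize_order(**settings)
```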
What about something like get_criteria?
I don't like getters, they are so Java-style. What about model_order_selection? Or order_selection, select_order, find_optimal_order?
I don't like getters, they are so Java-style. What about model_order_selection? Or order_selection, select_order, find_optimal_order?
What is the function going to do? If it returns the criteria for varying model orders, something like model_order_selection would fit. If, instead, it returns the optimal p, I'd go for find_optimal_order.
Good question. I guess it would be best if the function returned the values for a range of model orders, because usually there is no clear optimal value. That way, users could plot the various criteria over the model order and decide for themselves which p to pick.
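A minimal sketch of such a function, assuming (channels, samples) data and a plain least-squares VAR fit; the name and signature are hypothetical, not scot's actual API:

```python
import numpy as np

def criteria_over_orders(data, max_order):
    """Return AIC and BIC for model orders 1..max_order, so the user can
    plot them over p and decide which order to pick themselves.

    data : array of shape (channels, samples)
    """
    k, t = data.shape
    results = {'aic': [], 'bic': []}
    for p in range(1, max_order + 1):
        y = data[:, p:]  # targets: samples p..t-1
        # stack lagged copies of the data as predictors, shape (k*p, t-p)
        x = np.vstack([data[:, p - i:t - i] for i in range(1, p + 1)])
        coef = y @ np.linalg.pinv(x)          # least-squares VAR coefficients
        resid = y - coef @ x
        n = t - p
        sigma = resid @ resid.T / n           # residual covariance
        logdet = np.log(np.linalg.det(sigma))
        results['aic'].append(logdet + 2 * p * k**2 / n)
        results['bic'].append(logdet + np.log(n) * p * k**2 / n)
    return results
```

The returned lists can then be plotted over p (e.g. with matplotlib) so the user can see where each criterion levels off.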
I agree. What would be a good name for such a function?
X_model_order - we just need to find a good X.
select? choose? There must be a good word for what the function is doing...
Hmm.. selection or choice are up to the user.
What the function is doing is calculate_model_order_criteria, but that is a bit too long :)
True. Can we skip the calculate_ part?
Yes. Although, something in the back of my mind keeps nagging that method names should start with a verb...
Yes, you're right. But we can't have such a long name. show_model_order_criteria - every word is important, or is it? It would be nice to get rid of the word criteria.
evaluate_model_order? list_model_order? compare_model_order? assess_model_order?
I'd say model is the least important word, but from the above list I'd prefer evaluate_model_order.
But I think "model" and "order" belong together, so we can't drop model. So let's go for evaluate_model_order(bic=True, aic=True, fpe=True, hq=True) then.
Do we call the wrapped methods evaluate_bic, evaluate_aic, evaluate_fpe, and evaluate_hq then?
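That wrapper-plus-per-criterion layout could be sketched roughly like this; the evaluate_* names follow the suggestion above, but the signatures (log-determinant of the residual covariance, observation count, parameter count) are made up for illustration:

```python
import numpy as np

# hypothetical per-criterion functions
def evaluate_aic(logdet, n_obs, n_params):
    return logdet + 2.0 * n_params / n_obs

def evaluate_bic(logdet, n_obs, n_params):
    return logdet + np.log(n_obs) * n_params / n_obs

def evaluate_hq(logdet, n_obs, n_params):
    return logdet + 2.0 * np.log(np.log(n_obs)) * n_params / n_obs

def evaluate_model_order(logdet, n_obs, n_params,
                         bic=True, aic=True, hq=True):
    """Wrapper returning a dict with one entry per enabled criterion."""
    funcs = {'aic': (aic, evaluate_aic),
             'bic': (bic, evaluate_bic),
             'hq': (hq, evaluate_hq)}
    return {name: f(logdet, n_obs, n_params)
            for name, (enabled, f) in funcs.items() if enabled}
```

Returning a dict also sidesteps the "which default?" question: the wrapper simply reports everything that was requested.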
I was thinking whether the wrapped functions should be methods of VAR/VARBase or would it make sense to place them in a separate file to avoid cluttering the VAR classes?
Somehow they belong to the VAR class, don't they? Of course we could also make them independent functions and put them in a separate file. I don't know. What's your opinion?
I prefer independent functions if they can be written so that they do not depend on the VAR class. If they need access to a VAR instance, methods are better.
Me too. In this case, all we need is the model order range and the data, right? So we should implement them as functions in a separate module. Any suggestion for the module name? Or do they fit into an existing module?
What about information_criteria.py or model_criteria.py?
Or model_selection.py?
That would work too.
You don't seem to like criteria? I'm not a fan of selection :)
Yes, but I can't really tell you why. model_criteria.py is good enough, people will hopefully not be too surprised to find model order selection criteria in there :-).
Same here, I feel that selection does not quite fit, but cannot tell you why.
I'd say the implementer can choose the name :)
Well, we both agree that the full version should be selection_criteria (but it's too long), so it's only natural that both shorter versions do not quite fit.
Yes, let the implementer choose the name. To avoid duplicate work, let's tell each other if one of us starts working on it, OK :-)?
Sure. We can simply create an empty PR as soon as someone is working on an issue. Or we could use the GitHub feature and set an Assignee :)
Let's use the assignee feature - as soon as someone starts working on an issue, let's make sure to set the assignee to self :-)!
Lütkepohl lists the following criteria (we should implement them all):
In addition, we could add the MDL (Minimum Description Length).
If they are listed by Lütkepohl they should be straightforward to implement. I wish there was time to add some Bayesian estimation techniques, too...
I think there is a section on Bayesian estimation in Lütkepohl. However, I think we should focus on the first four measures mentioned above before implementing anything else.
Yes, I totally agree.
Currently, model order selection is only possible via GCV. I think it would be great to have some alternative procedures available (for example because GCV tends to be very slow). We should have at least AIC, BIC, and FPE. The SIFT manual also lists the Hannan-Quinn (HQ) criterion. I don't know if these measures are faster, but they're commonly used in model selection. What do you think @mbillingr?
Furthermore, is there a reason why VAR.optimize equals VAR.optimize_delta_bisection? I think this is a bit confusing, since it is not clear what people associate with optimizing a VAR model. This also concerns Workspace.optimize_var.