Open cbrnr opened 8 years ago
What I don't like about these measures is that you have to choose one :) For example, I have absolutely no idea when I should use AIC and when BIC instead. Of course, you could argue that using CV is also a choice, but I arrogantly believe that it is most directly related to the problem. (Btw, are we really using GCV, and not simply CV?)
Nevertheless, providing alternative measures makes sense for the sake of choice. Performance is likely to benefit from these, too. Which measure should be the default?
The method names could indeed be improved.
In the beginning, there was only VAR.optimize. Then there was a need for an additional optimize_order. For consistency, I renamed optimize to optimize_delta_bisection, but kept the old name around to be backwards compatible. At least, that's how I believe I remember things :)
What I don't like about these measures is that you have to choose one :) For example, I have absolutely no idea when I should use AIC and when BIC instead. Of course, you could argue that using CV is also a choice, but I arrogantly believe that it is most directly related to the problem.
Right, I totally agree. But we don't have to decide which measure to pick, we just provide the options.
(Btw, are we really using GCV, and not simply CV?)
No idea - you implemented it!
Nevertheless, providing alternative measures makes sense for the sake of choice. Performance is likely to benefit from these, too. Which measure should be the default?
That's a good question (and relates to the method names). Like in the case of the whiteness test, where we actually have three tests, I'd prefer to have three separate functions/methods (remember the Unix philosophy?). For convenience, we could then have one wrapper function/method that calls all related methods, possibly with one default (and the rest optional). I don't know. Or we could have one method that accepts the measure as an argument.
Do we need to choose a default model order criterion? I guess this depends on how we implement it. If we need a default, I'd go with AIC, probably because it's very popular (which does not mean it's the best).
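For reference, the four criteria discussed here (AIC, BIC, HQ, FPE) have closed-form expressions based on the residual covariance of a fitted VAR model, following Lütkepohl. A rough sketch; the function name and signature are made up for illustration, not scot's actual API:

```python
import numpy as np

def var_criteria(resid_cov, n_obs, order, n_channels):
    """Model order selection criteria for a VAR model.

    resid_cov  : estimated residual covariance matrix (K x K)
    n_obs      : number of observations T
    order      : model order p
    n_channels : number of channels/variables K
    """
    logdet = np.log(np.linalg.det(resid_cov))
    n_params = order * n_channels**2  # number of free VAR coefficients
    aic = logdet + 2.0 / n_obs * n_params
    bic = logdet + np.log(n_obs) / n_obs * n_params
    hq = logdet + 2.0 * np.log(np.log(n_obs)) / n_obs * n_params
    fpe = ((n_obs + order * n_channels + 1) /
           (n_obs - order * n_channels - 1)) ** n_channels \
          * np.linalg.det(resid_cov)
    return {'aic': aic, 'bic': bic, 'hq': hq, 'fpe': fpe}
```

Note that all four only need a single model fit per candidate order, which is why they should be much faster than cross-validation.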
No idea - you implemented it!
Did I call it GCV?
I'd prefer to have three separate functions/methods
I agree.
Do we need to choose a default model order criterion?
In some way we do. Even if we just provide different functions we have to document when/how to use them.
Did I call it GCV?
Oops. Getting confused with all the "generalized" stuff out there (MSGE).
I'd prefer to have three separate functions/methods I agree.
And a wrapper method to wrap them all?
In some way we do. Even if we just provide different functions we have to document when/how to use them.
Right. We can list some properties of the criteria, which could provide the basis for our recommendation of a default criterion. Lütkepohl should have all that stuff, but I don't have the book anymore.
However, referring to #131, it looks like even if we implement AIC/BIC/FPE..., we still need to do this within a CV, right? So these procedures wouldn't be any faster. Or are they designed to work on the MSE (of the whole data) because they include a penalty term for the number of parameters?
And a wrapper method to wrap them all?
Wrapping is good, but then we have to choose a default :)
Or are they designed to work on the MSE (of the whole data) because they include a penalty term for the number of parameters?
Exactly, but for the same reason they do not work well with regularization.
Wrapping is good, but then we have to choose a default :)
Or the wrapper computes all criteria and returns a list/dict of results.
Sounds hellish - at least if performance is an issue :) I would at least provide an option to select one or some criteria.
optimize_order(bic=True, aic=True, fpe=True, hq=True)? Or optimize_order(['bic', 'aic', 'fpe', 'hq'])? I don't know which is better.
Btw, I would not call this function optimize_order if it does not actively change the model order, which it can't do because it doesn't know which of the multiple measures to use.
or optimize_order(['bic', 'aic', 'fpe', 'hq'])? I don't know which is better.
I kind of like the variant with several arguments better than the one with a list. I don't trust functions that require mutable input arguments :-).
Do you have a suggestion for a better name? I don't think the name implies that anything is changed. It just optimizes the model order with different criteria.
I kind of like the variant with several arguments better than the one with a list. I don't trust functions that require mutable input arguments :-)
Tuple! :p
I think I prefer optimize_order(bic=True, aic=True, fpe=True, hq=True) because you can also pass a **dict. It would be nice if it was possible to just pass the arguments without the =True to enable them (R-style)... but never mind, it's Python and that's not possible.
Do you have a suggestion for a better name? I don't think the name implies that anything is changed. It just optimizes the model order with different criteria.
At least in the current API this is implied by the name because all optimize_* functions behave like that. I would like to keep it that way, because writing var.optimize(data) translates to literally "optimize this model for the data". In contrast, the proposed function would literally do a "give me the selection criteria for this model". What about something like get_criteria?
Tuple! :p
Well played :-).
I think I prefer optimize_order(bic=True, aic=True, fpe=True, hq=True) because you can also pass a **dict. It would be nice if it was possible to just pass the arguments without the =True to enable them (R-style)... but never mind, it's Python and that's not possible.
OK, so let's go for this interface then. Regarding R-style, in theory you could do
bic, aic, fpe, hq = (True,) * 4
optimize_order(bic, aic, fpe, hq)
There you go! :-)
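For illustration, the **dict variant mentioned above could look like this; the function body is just a made-up stub that reports which criteria are enabled:

```python
def optimize_order(bic=True, aic=True, fpe=True, hq=True):
    # hypothetical interface sketch: return the names of enabled criteria
    flags = [('bic', bic), ('aic', aic), ('fpe', fpe), ('hq', hq)]
    return {name: flag for name, flag in flags if flag}

# enable only a subset via keyword arguments ...
optimize_order(bic=True, aic=False, fpe=False, hq=False)

# ... or equivalently by unpacking a dict
settings = {'bic': True, 'aic': False, 'fpe': False, 'hq': False}
optimize_order(**settings)
```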
What about something like get_criteria?
I don't like getters, they are so Java-style. What about model_order_selection? Or order_selection, select_order, find_optimal_order?
I don't like getters, they are so Java-style. What about model_order_selection? Or order_selection, select_order, find_optimal_order?
What is the function going to do? If it returns the criteria for varying model orders, something like model_order_selection would fit. If, instead, it returns the optimal p, I'd go for find_optimal_order.
Good question. I guess it would be best if the function returned the values for a range of model orders, because usually there is no clear optimal value. That way, users could plot the various criteria over the model order and decide for themselves which p to pick.
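A minimal sketch of such a function, assuming (channels, samples) data and a plain least-squares VAR fit; the name and signature are hypothetical, not scot's actual API:

```python
import numpy as np

def criteria_over_orders(data, max_order):
    """Return AIC and BIC for model orders 1..max_order, so the user can
    plot them over p and decide which order to pick themselves.

    data : array of shape (channels, samples)
    """
    k, t = data.shape
    results = {'aic': [], 'bic': []}
    for p in range(1, max_order + 1):
        y = data[:, p:]  # targets: samples p..t-1
        # stack lagged copies of the data as predictors, shape (k*p, t-p)
        x = np.vstack([data[:, p - i:t - i] for i in range(1, p + 1)])
        coef = y @ np.linalg.pinv(x)          # least-squares VAR coefficients
        resid = y - coef @ x
        n = t - p
        sigma = resid @ resid.T / n           # residual covariance
        logdet = np.log(np.linalg.det(sigma))
        results['aic'].append(logdet + 2 * p * k**2 / n)
        results['bic'].append(logdet + np.log(n) * p * k**2 / n)
    return results
```

The returned lists can then be plotted over p (e.g. with matplotlib) so the user can see where each criterion levels off.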
I agree. What would be a good name for such a function?
X_model_order - we just need to find a good X.
select? choose? There must be a good word for what the function is doing...
Hmm.. selection or choice are up to the user.
What the function is doing is calculate_model_order_criteria, but that is a bit too long :)
True. Can we skip the calculate_ part?
Yes. Although, something in the back of my mind keeps nagging that method names should start with a verb...
Yes, you're right. But we can't have such a long name. show_model_order_criteria - every word is important, or is it? It would be nice to get rid of the word criteria.
evaluate_model_order? list_model_order? compare_model_order? assess_model_order?
I'd say model is the least important word, but from the above list I'd prefer evaluate_model_order.
But I think "model" and "order" belong together, so we can't drop model. So let's go for evaluate_model_order(bic=True, aic=True, fpe=True, hq=True) then.
Do we call the wrapped methods evaluate_bic, evaluate_aic, evaluate_fpe, and evaluate_hq then?
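That wrapper-plus-per-criterion layout could be sketched roughly like this; the evaluate_* names follow the suggestion above, but the signatures (log-determinant of the residual covariance, observation count, parameter count) are made up for illustration:

```python
import numpy as np

# hypothetical per-criterion functions
def evaluate_aic(logdet, n_obs, n_params):
    return logdet + 2.0 * n_params / n_obs

def evaluate_bic(logdet, n_obs, n_params):
    return logdet + np.log(n_obs) * n_params / n_obs

def evaluate_hq(logdet, n_obs, n_params):
    return logdet + 2.0 * np.log(np.log(n_obs)) * n_params / n_obs

def evaluate_model_order(logdet, n_obs, n_params,
                         bic=True, aic=True, hq=True):
    """Wrapper returning a dict with one entry per enabled criterion."""
    funcs = {'aic': (aic, evaluate_aic),
             'bic': (bic, evaluate_bic),
             'hq': (hq, evaluate_hq)}
    return {name: f(logdet, n_obs, n_params)
            for name, (enabled, f) in funcs.items() if enabled}
```

Returning a dict also sidesteps the "which default?" question: the wrapper simply reports everything that was requested.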
I was thinking whether the wrapped functions should be methods of VAR/VARBase or would it make sense to place them in a separate file to avoid cluttering the VAR classes?
Somehow they belong to the VAR class, don't they? Of course we could also make them independent functions and put them in a separate file. I don't know. What's your opinion?
I prefer independent functions if they can be written so that they do not depend on the VAR class. If they need access to a VAR instance, methods are better.
Me too. In this case, all we need is the model order range and the data, right? So we should implement them as functions in a separate module. Any suggestion for the module name? Or do they fit into an existing module?
What about information_criteria.py or model_criteria.py?
Or model_selection.py?
That would work too.
You don't seem to like criteria? I'm not a fan of selection :)
Yes, but I can't really tell you why. model_criteria.py is good enough, people will hopefully not be too surprised to find model order selection criteria in there :-).
Same here, I feel that selection does not quite fit, but cannot tell you why.
I'd say the implementer can choose the name :)
Well, we both agree that the full version should be selection_criteria (but it's too long), so it's only natural that both shorter versions do not quite fit.
Yes, let the implementer choose the name. To avoid duplicate work, let's tell each other if one of us starts working on it, OK :-)?
Sure. We can simply create an empty PR as soon as someone is working on an issue. Or we could use the GitHub feature and set an Assignee :)
Let's use the assignee feature - as soon as someone starts working on an issue, let's make sure to set the assignee to self :-)!
Lütkepohl lists the following criteria (we should implement them all):
In addition, we could add the MDL (Minimum Description Length).
If they are listed by Lütkepohl they should be straightforward to implement. I wish there was time to add some Bayesian estimation techniques, too...
I think there is a section on Bayesian estimation in Lütkepohl. However, I think we should focus on the first four measures mentioned above before implementing anything else.
Yes, I totally agree.
Currently, model order selection is only possible via GCV. I think it would be great to have some alternative procedures available (for example because GCV tends to be very slow). We should have at least AIC, BIC, and FPE. The SIFT manual also lists the Hannan-Quinn (HQ) criterion. I don't know if these measures are faster, but they're commonly used in model selection. What do you think @mbillingr?
Furthermore, is there a reason why VAR.optimize equals VAR.optimize_delta_bisection? I think this is a bit confusing, since it is not clear what people associate with optimizing a VAR model. This also concerns Workspace.optimize_var.