Nested logit needs to avoid local optima in estimation

mattwigway commented 3 months ago

Unlike the multinomial logit, the nested logit log-likelihood function has local optima. The current implementation (on the nested-logit branch) treats the IV terms just like any other term, which means the optimizer can get stuck in a local optimum (and, I expect, this is pretty common, particularly in high dimensions; it's happening with a model I'm working with now).

Daganzo and Kusnic (1993) point out that the likelihood function is concave in all parameters except the IV terms, and argue that exploiting this may lead to better estimation. Maybe we can treat the IV terms specially in the model, e.g. with a systematic search over [0, 1] based on parameters from an MNL. Or estimate the parameters with MNL first, holding the IV terms constant, then use those estimates as starting values, though that might also bias the result toward a local optimum. Maybe we just estimate with MNL, then fix the IV terms at each value in 0:0.1:1.0, re-estimate the remaining parameters, and refine from there? Though that will get sticky with multiple IV terms, since the grid grows exponentially with the number of nests. Someone must have thought about this before.
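
To make the grid idea concrete, here is a minimal sketch in plain Python, not the DiscreteChoiceModels.jl implementation. It assumes a toy model with one coefficient `beta`, three alternatives (alternative 0 alone, alternatives 1 and 2 sharing a nest) and a single IV parameter `lam`. All function names are hypothetical, the data are simulated, and the golden-section search assumes the log-likelihood is unimodal in `beta` once `lam` is fixed. The grid starts at 0.1 because the formula divides by `lam`.

```python
import math
import random


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def nl_loglik(beta, lam, data):
    """Log-likelihood of the toy nested logit (hypothetical helper).

    Alternative 0 is a degenerate nest; alternatives 1 and 2 share a nest
    with IV parameter lam. Each observation is (x, choice), x has 3 entries.
    """
    total = 0.0
    for x, choice in data:
        v = [beta * xi for xi in x]
        iv = logsumexp([v[1] / lam, v[2] / lam])  # inclusive value of the nest
        denom = logsumexp([v[0], lam * iv])
        if choice == 0:
            total += v[0] - denom
        else:
            # log P(nest) + log P(choice | nest)
            total += (lam * iv - denom) + (v[choice] / lam - iv)
    return total


def maximize_1d(f, lo, hi, tol=1e-6):
    """Golden-section search for a maximum; assumes f is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    x = (a + b) / 2
    return x, f(x)


def profile_over_iv(data, grid):
    """For each fixed lam, maximize over beta; return (loglik, lam, beta) tuples."""
    results = []
    for lam in grid:
        beta_hat, ll = maximize_1d(lambda b: nl_loglik(b, lam, data), -10.0, 10.0)
        results.append((ll, lam, beta_hat))
    return results


def simulate(n, beta, lam, seed=0):
    """Draw synthetic choices from the same toy model (illustration only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        v = [beta * xi for xi in x]
        iv = logsumexp([v[1] / lam, v[2] / lam])
        denom = logsumexp([v[0], lam * iv])
        p0 = math.exp(v[0] - denom)
        p1 = math.exp(lam * iv - denom + v[1] / lam - iv)
        u = rng.random()
        choice = 0 if u < p0 else (1 if u < p0 + p1 else 2)
        data.append((x, choice))
    return data


data = simulate(300, beta=1.0, lam=0.5)
grid = [k / 10 for k in range(1, 11)]  # 0.1, 0.2, ..., 1.0
results = profile_over_iv(data, grid)
best_ll, best_lam, best_beta = max(results)
print(f"best grid point: lam={best_lam}, beta={best_beta:.3f}, loglik={best_ll:.3f}")
```

Because `lam = 1.0` is on the grid and reduces the model to an MNL, the best grid point can never have a lower log-likelihood than the MNL fit, which rules out the failure where the NL ends up below the MNL solution. The best point would then serve as the starting value for a full joint estimation.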

mattwigway commented 3 months ago

Ben-Akiva and Lerman describe a sequential estimation procedure followed by a single Newton-Raphson iteration; maybe this is a good approach.
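
For reference, a single Newton-Raphson iteration from the sequential estimates has the generic form below. The notation is assumed here rather than taken from Ben-Akiva and Lerman: $\hat\theta^{(0)}$ stacks the sequential estimates of all parameters (including the IV terms) and $\mathcal{L}$ is the full nested logit log-likelihood.

```latex
\hat\theta^{(1)} = \hat\theta^{(0)}
  - \left[ \nabla^2 \mathcal{L}\big(\hat\theta^{(0)}\big) \right]^{-1}
    \nabla \mathcal{L}\big(\hat\theta^{(0)}\big)
```

Since only one step is taken from an already reasonable starting point, the optimizer has little chance to wander off into a distant local optimum.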

mattwigway commented 2 months ago

Returning to this, I'm not sure how necessary this is. The model I was working with before was ill-specified (though the NL did get stuck in a local optimum where the log-likelihood was lower than the MNL solution). But with a well-specified NL model it might not happen.

mattwigway / DiscreteChoiceModels.jl

Nested logit needs to avoid local optima in estimation #29