timothyb0912 / pylogit

A python package for estimating conditional logit models.
https://pypi.org/project/pylogit/
BSD 3-Clause "New" or "Revised" License

relation to softmax #8

Closed Eh2406 closed 7 years ago

Eh2406 commented 7 years ago

Hi,

Yesterday I reread the paper. I woke up thinking that the logit-type formulation is similar to the softmax normalization. I am a little surprised the comparison was not in the paper.

This is not a call to action, but the start of a conversation. I was going to send you an email, but thought I may as well work in public.

To what degree can the paper be described as:

timothyb0912 commented 7 years ago

Thanks for reading the paper!

The logit-type formulation is similar to softmax. The main difference is that softmax models typically don't allow for variables that are alternative-specific (e.g. travel_time_bike being only in the bicycle utility and travel_time_car being only in the car utility). I don't mention this in the paper, but it's in the beginning of the pylogit introduction/computation document.
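To make the alternative-specific point concrete, here is a minimal NumPy sketch (not pylogit's API). It assumes a single observation in "long" format, with one design row per alternative, so that travel_time_bike enters only the bicycle utility and travel_time_car only the car utility. The travel times, the column layout, and the coefficient values are all made up for illustration.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax over a 1-D array of utilities."""
    v = np.asarray(v, dtype=float)
    e = np.exp(v - v.max())
    return e / e.sum()

# Hypothetical travel times (minutes) for one decision maker.
travel_time_bike, travel_time_car = 30.0, 12.0

# Long-format design: one row per alternative, one column per coefficient.
# Columns (assumed layout): [asc_car, beta_time_bike, beta_time_car]
design = np.array([
    [0.0, travel_time_bike, 0.0],  # bicycle: only travel_time_bike is non-zero
    [1.0, 0.0, travel_time_car],   # car: intercept and travel_time_car
])

# Made-up coefficients, for illustration only.
beta = np.array([0.5, -0.05, -0.08])

utilities = design @ beta    # one index value per alternative
probs = softmax(utilities)   # choice probabilities over {bicycle, car}
```

The normalization is ordinary softmax; the difference lies in how the index for each alternative is built. A typical softmax regression would instead multiply one shared feature vector by a separate weight vector per class.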

Also, the issue of calling S(data) a systematic utility is subtle. It is similar to the way all squares are rectangles but not all rectangles are squares. To be concrete, the terminology of systematic utilities is often used in cases where one's random utility, U, can be additively decomposed into a "systematic utility", V, and a random error that is Gumbel distributed.

For sure, one can assume that one's random utility follows a Gumbel distribution and that the systematic utility V = S(data). You'll then arrive at the logit-type models that I introduce. However, just because one uses a logit-type model doesn't mean that one's systematic utility is S(data). Mogens Fosgerau and Michel Bierlaire have a great example where the random utility is U = V*epsilon, with epsilon being Weibull distributed and the systematic utility being V = X*Beta. Here one still ends up with a logit-type model, even though S(data) is not the systematic utility.
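The multiplicative-error case can be checked by simulation. The sketch below is an illustration under stated assumptions, not a reproduction of Fosgerau and Bierlaire's setup: it assumes every V is strictly negative (e.g. a negative cost), that epsilon is i.i.d. Weibull with shape k and unit scale, and that the decision maker picks the alternative with the largest U = V*epsilon. Under those assumptions, taking logs of -U turns the multiplicative error into an additive Gumbel one, which gives softmax probabilities with index S = -k * ln(-V). The values of k, V, and the number of draws are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

shape_k = 2.0                     # assumed Weibull shape parameter
v = np.array([-1.0, -2.0, -4.0])  # assumed (negative) systematic utilities
n_draws = 200_000

# Simulate U = V * epsilon and pick the utility-maximizing alternative.
eps = rng.weibull(shape_k, size=(n_draws, v.size))
choices = np.argmax(v * eps, axis=1)
simulated = np.bincount(choices, minlength=v.size) / n_draws

# Closed form under the assumptions above: softmax of S = -k * ln(-V).
index = -shape_k * np.log(-v)
closed_form = np.exp(index) / np.exp(index).sum()

print("simulated  :", simulated)
print("closed form:", closed_form)
```

In this sketch the argument to softmax is -k * ln(-V), not V itself, which illustrates why S(data) need not be the systematic utility.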

Lastly, S can be linear if one wants. You'll just end up with the typical MNL model.

Eh2406 commented 7 years ago

The introduction to pylogit_computation is very helpful. So, if I understand correctly:

It uses a softmax function to normalize some model, but the term "softmax regression" connotes a model with restrictions on the parameters that logit-type models do not imply. Similarly, calling the argument to softmax a utility connotes an additive error, which S(data) does not imply.

I wish we did not have to invent new names for the same formulas every time we want to imply different things, but that is life. Thanks for the explanation.

timothyb0912 commented 7 years ago

No worries. Thanks for engaging with the topic! I think your last summary is totally on point.

In any case, I also wish we didn't have so much terminology for the same (or very similar) models. C'est la vie.