mattwigway / DiscreteChoiceModels.jl

Discrete choice/random utility models in Julia
MIT License

fixed effects in formula? #24

Open Gkreindler opened 1 year ago

Gkreindler commented 1 year ago

Thank you for this great package! Is there a way to include a fixed effect term in utility?

In my application, I have neighborhood fixed effects, and there are many possibilities (>50) so I would prefer to not add them in the formula by hand.

Is there a way to construct the utility formula programmatically? For example, in FixedEffectModels it's possible to write sum([term(Symbol("somevar" * string(i))) for i = 1:N]).

Gkreindler commented 1 year ago

A note: I can't figure out how to pass an expression to the @utility macro, and I don't think that can work with macros in Julia, since a macro only has access to the "text" of its arguments (the unevaluated syntax), not their contents (the runtime values).

I am currently "hacking" around this by:

  1. redefining utility as a function, utility(ex::Expr), instead of a macro
  2. defining a utility formula myu programmatically
  3. passing eval(utility(myu)) as an argument to multinomial_logit

I think this is probably terrible, but it works for me... for now.

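
A minimal sketch of the behavior described above, using a toy macro. `@received` is hypothetical and stands in for `@utility`; nothing here is part of DiscreteChoiceModels.jl. It shows that a macro handed a variable sees only the variable's name, and that splicing the expression into a quoted macro call before calling `eval` is one way to get a runtime-built expression into a macro without rewriting the macro as a function.

```julia
# Toy macro (hypothetical): reports what it was handed at expansion time.
macro received(ex)
    return string(typeof(ex), ": ", ex)
end

# A utility expression built at runtime (the names inside are placeholders).
myu = :(αcar * cost + βtime * time)

seen_variable = @received(myu)    # the macro sees only the symbol `myu`
seen_literal  = @received(a + b)  # the macro sees the expression `a + b`

# Workaround: interpolate the expression into a quoted macro call, then
# evaluate the call, so the macro receives the expression stored in `myu`.
seen_spliced = eval(:(@received($myu)))
```

Like the `eval(utility(myu))` approach, this runs `eval` at global scope, so it shares the same drawbacks.
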
mattwigway commented 1 year ago

There's currently no way to do this, and I think you're correct about the macros.

mattwigway commented 1 year ago

Thinking about this more, I think there are two ways to go about this:

  1. As you say, the macro doesn't have access to the data. But what the macro is generating is actually code that is executed at runtime, so that could theoretically have access to the data and create the fixed effects that way.
  2. You could change the multinomial logit code to handle fixed effects directly.
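
The first option can be sketched with a toy macro whose generated code inspects the data when it runs. `@level_dummies` is a hypothetical name for illustration only and is not how the package is implemented; it only shows that the levels need not be known at expansion time.

```julia
# Hypothetical macro: expands to code that reads the column at runtime,
# finds its distinct levels, and builds one indicator vector per level.
macro level_dummies(col)
    quote
        let column = $(esc(col))
            levels = sort(unique(column))
            # Map each observed level to a Boolean indicator vector
            Dict(l => (column .== l) for l in levels)
        end
    end
end

year = [1990, 2000, 1990, 2021]
dummies = @level_dummies(year)
```

The macro itself never sees the contents of `year`. Only the code it returns does, once that code is executed.
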
mattwigway commented 10 months ago

@Gkreindler I ran into this in a project I'm working on and kinda-sorta worked around it by using == in the utility function, e.g.

        "carpool" ~
            αcarpool_1990 * (year == 1990) +
            αcarpool_2000 * (year == 2000) +
            αcarpool_2009 * (year == 2009) +
            αcarpool_2014 * (year == 2014) +
            αcarpool_2019 * (year == 2019) +
            αcarpool_2021 * (year == 2021)

It's still rather awkward, and may have performance implications for things that aren't numbers. One option would be to add a macro @fe that basically just generates this code right inside the utility function—so you could write

"carpool" ~ @fe(αcarpool, year, [1990, 2000, 2009, 2014, 2019, 2021])

and have it expanded out like this. You would still have to specify the levels manually, but that's a lot less painful than the current situation.
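
A rough sketch of what such a macro could look like. Both `fe_expr` and this `@fe` are hypothetical; neither exists in the package. The sketch assumes coefficients are named `<prefix>_<level>`, as in the hand-written example above.

```julia
# Build `prefix_l1 * (var == l1) + prefix_l2 * (var == l2) + ...` as an Expr.
function fe_expr(prefix::Symbol, var::Symbol, levels)
    terms = [:($(Symbol(prefix, "_", l)) * ($var == $l)) for l in levels]
    return Expr(:call, :+, terms...)
end

# Macro wrapper: `levels` arrives as the unevaluated literal `[1990, ...]`,
# i.e. an Expr with head :vect, so the levels must be written out in place.
macro fe(prefix, var, levels)
    if !(levels isa Expr && levels.head == :vect)
        error("@fe expects a literal vector of levels")
    end
    # esc so coefficient and variable names resolve in the caller's scope
    return esc(fe_expr(prefix, var, levels.args))
end

# Standalone usage with placeholder values
αcarpool_1990 = 0.5
αcarpool_2000 = -0.25
year = 2000
u = @fe(αcarpool, year, [1990, 2000])
```

Because `fe_expr` is an ordinary function, it could also be called with levels computed from the data, for example the sorted unique values of a neighborhood column. A macro nested inside another macro's arguments is passed to the outer macro unexpanded, so `@utility` would have to expand `@fe` itself (for instance with `macroexpand`) for the proposed syntax to work.
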

Gkreindler commented 10 months ago

Thanks! For my application, I ended up hand-coding the MLE (with analytic gradients) and bootstrap SEs for the simple case I had, a binary logit.
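
For reference, a minimal sketch of a binary logit log-likelihood and its analytic gradient. This is not the code used above. The names `loglik` and `grad_loglik` are hypothetical, `X` is assumed to be an n×k matrix that already includes any fixed-effect dummy columns, and `y` is a 0/1 vector.

```julia
# Logistic function
σ(z) = 1 / (1 + exp(-z))

# Log-likelihood: sum over i of y_i * z_i - log(1 + exp(z_i)), with z = Xβ
function loglik(β, X, y)
    z = X * β
    return sum(y .* z .- log1p.(exp.(z)))
end

# Analytic gradient with respect to β: X' * (y - σ(Xβ))
grad_loglik(β, X, y) = X' * (y .- σ.(X * β))
```

Both functions can be handed to a general-purpose optimizer, and a finite-difference comparison against `grad_loglik` is a cheap way to check the derivation.
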