jeffgortmaker / pyblp

BLP Demand Estimation with Python
https://pyblp.readthedocs.io
MIT License

Support discrete types #21

Open jeffgortmaker opened 5 years ago

jeffgortmaker commented 5 years ago

As in the original Berry, Carnall, and Spiller (1996). This would require optimization over the integration weights (i.e., the type shares).
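
To make the required change concrete: with discrete types, predicted shares become a mixture over type-specific choice probabilities, and the mixing weights are parameters rather than fixed quadrature weights. A minimal plain-NumPy sketch (hypothetical names, not pyblp's API):

```python
import numpy as np

def mixture_shares(type_probabilities, type_weights):
    """Aggregate a (J x T) matrix of type-specific choice probabilities
    into J market shares, weighting by T type shares that sum to one."""
    type_probabilities = np.asarray(type_probabilities)  # shape (J, T)
    type_weights = np.asarray(type_weights)              # shape (T,)
    return type_probabilities @ type_weights             # shape (J,)

# Two products, two types with shares 0.3 and 0.7:
print(mixture_shares([[0.2, 0.5], [0.4, 0.1]], [0.3, 0.7]))  # [0.41 0.19]
```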

bshiller commented 5 years ago

Adding discrete types would be helpful, since it would allow for a bimodal taste distribution. Thanks for all the work you have put into this.

jeffgortmaker commented 5 years ago

Thanks. This seems like a popular request. It's high up on my to-do list -- I'll update this thread when I get around to it.

AvyG commented 1 year ago

Do you have any updates on this? I'm interested in testing a simulation with a two-point distribution for the taste distribution, but I'm unsure how to modify the code accordingly. Any guidance or updates would be much appreciated.

jeffgortmaker commented 1 year ago

No updates yet. The challenge is that implementing this in a flexible way would require a lot of additions. For example:

  1. A new parameter vector, say tau, with visual output.
  2. Replace weights in agent_data with some function of tau and demographics (one possible mapping is sketched after this list). Here, demographic dummies could encode different observed or unobserved types.
  3. Handle the standard issue that unobserved types are often exchangeable, so you want to impose some ordering on tau or the weights that it generates. Otherwise you get multiple (observationally equivalent) global optima. Maybe this is ok.
  4. Gradients with respect to tau.
  5. Make sure optimal IVs work.
  6. Would importance sampling still make sense?
  7. Use the delta method to get interpretable estimates of the weights and their standard errors?
  8. Add simulations with discrete types to the unit tests to be sure everything else works.
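
For items 2-4, here's one hedged possibility -- a plain-NumPy sketch of mapping an unconstrained tau into weights through a softmax, with sorting as a crude way to impose the ordering from item 3. Everything here is hypothetical, not a proposed pyblp API:

```python
import numpy as np

def tau_to_weights(tau):
    """Map an unconstrained (T - 1)-vector tau into T ordered weights.

    The first logit is normalized to zero so the weights sum to one, and
    sorting rules out observationally equivalent permutations of unobserved
    types (item 3).
    """
    logits = np.concatenate([[0.0], np.atleast_1d(tau).astype(float)])
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return np.sort(exp / exp.sum())[::-1]

# Two unobserved types parameterized by a single tau:
print(tau_to_weights([0.8]))  # approximately [0.69 0.31]
```

Since the softmax has a simple analytic Jacobian, the gradients in item 4 would come from the chain rule applied through this mapping.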

I'm not sure if the above is the best way of doing this, but it's one thought.

If you're interested in implementing something like this, the place to start is parameters.py, where abstract parameter classes are defined. A new tau parameter would act a lot like pi and rho, so following where these names show up in the code will point to what else would need to be modified.

chrisconlon commented 1 year ago

As Jeff describes, the general finite mixture case (unknown types, unknown weights) is pretty hard to implement.

For a specific case with two types, you could imagine a demographic variable called "business_travelers" that is in {0, 1}.

If you happened to know the fraction of that type, estimation would be straightforward with the existing PyBLP implementation.

If you had any micro moments that were useful in pinning down either the fraction of types or the coefficients corresponding to each type, that would certainly make life easier.
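
As one concrete setup for this known-fraction case: the sketch below builds two agents per market whose weights are the assumed-known type shares, then estimates only pi, the shift in the price coefficient for the business type. The business_travelers name and the 30% share are made up, and pyblp's bundled Nevo product data is used purely as a stand-in so the sketch runs; in practice you'd substitute your own (e.g., simulated) product data.

```python
import numpy as np
import pandas as pd
import pyblp

# Stand-in product data with the standard pyblp columns (market_ids, prices,
# shares, demand_instruments0, ...), just so this sketch is runnable.
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)

business_share = 0.3  # assumed-known fraction of the business type
markets = pd.unique(product_data['market_ids'])

# Two agents per market, one per type, weighted by the known type shares.
agent_data = pd.DataFrame({
    'market_ids': np.repeat(markets, 2),
    'weights': np.tile([business_share, 1 - business_share], markets.size),
    'business_travelers': np.tile([1.0, 0.0], markets.size),
    'nodes0': 0.0,  # no continuous unobserved heterogeneity in this sketch
})

product_formulations = (
    pyblp.Formulation('1 + prices'),  # linear characteristics (X1)
    pyblp.Formulation('0 + prices'),  # nonlinear characteristics (X2)
)
agent_formulation = pyblp.Formulation('0 + business_travelers')
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data)

# Zeros in sigma are fixed at zero throughout estimation, so the single
# estimated nonlinear parameter is pi: how the price coefficient shifts
# for business travelers.
results = problem.solve(sigma=[[0.0]], pi=[[1.0]])
print(results)
```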

AvyG commented 1 year ago

Thank you for your answers.

Indeed, I was looking to estimate the case where consumers can be one of only two types (A and B). To start, I will not use demographic data, and I will simulate the market data, so I am free to define the consumers' distribution (I think).

I have a question about the proposed parameter, tau. I thought tau would represent the fractions of the two-point distribution, much as sigma represents the variability of consumers' tastes. But you mention that tau would act like pi and rho, so I am no longer sure about that.

Thank you again for your time.

jeffgortmaker commented 1 year ago

Right -- I mentioned that tau would act a lot like pi and rho in the sense that it's a nonlinear parameter (like pi, and sigma too) and, like rho, doesn't just show up in utility.

It won't act the same as them -- that's just a pointer for where to look in the code to see what you'd have to modify to add something like tau.