jeffgortmaker / pyblp

BLP Demand Estimation with Python
https://pyblp.readthedocs.io
MIT License
Calculating Long-Run Diversion Ratio with Multiple Products Removed #161

Open cccccc281 opened 1 week ago

cccccc281 commented 1 week ago

Hello,

I need some help with calculating the long-run diversion ratio when two products are removed from the choice set. The package documentation provides a method for calculating this ratio when only one product is removed, but I'm not sure how to specify the function call when two products are involved.

To address this, I tried to calculate the long-run diversion ratio based on the formulas from Conlon and Mortimer (2018). Here is how I computed the counterfactual shares:

I created a dataset called counterfactual_data and set the shares of the two removed products to 0. I also set their prices to an extremely high value to effectively remove them from consideration, while keeping the prices of the other products unchanged. I then used ProblemResults.compute_shares to calculate the new shares in counterfactual_data at the new prices. However, the calculated share S_{-jkt} (after removing two products) always turned out to be smaller than S_{kt} (the original share), resulting in a negative numerator for the long-run diversion ratio.

I’m wondering if there’s a better way to compute the new shares and the long-run diversion ratio after removing two products. Are there any specific functions or methods in the package that I might have overlooked? Thank you for reading. Any suggestions or guidance would be greatly appreciated!

jeffgortmaker commented 1 week ago

If you create a Simulation with your counterfactual_data (just drop the two products; don't set their shares to zero, which might create numerical issues) and your parameter estimates (including xi and any xi_fe for all but the two dropped products), you can use Simulation.replace_endogenous to calculate shares with the two products removed. You may want to fix prices with a smart choice of initial values and iteration=pyblp.Iteration('return').

Does that help? In general, all but the simplest counterfactuals are going to have to be run with a Simulation, and can't be run with ProblemResults alone. When setting up a Simulation and doing something like this, I usually recommend doing a "unit test" where you first make sure that when you don't drop the two products, the above procedure gives you basically the same shares as those you see in the data (up to numerical error).
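The "unit test" idea can be illustrated outside pyblp with a toy plain logit market. This is only a sketch under strong assumptions (no random coefficients, no nesting, made-up shares, hypothetical function names); it is not the Simulation workflow itself, just the same check in miniature: reproduce the observed shares first, and only then drop products.

```python
import math


def logit_shares(delta):
    """Plain logit shares from mean utilities; the outside good has utility 0."""
    denom = 1.0 + sum(math.exp(d) for d in delta.values())
    return {j: math.exp(d) / denom for j, d in delta.items()}


# Made-up observed shares for one market (illustrative only).
observed = {"a": 0.10, "b": 0.20, "c": 0.25, "d": 0.05}
outside = 1.0 - sum(observed.values())

# Plain logit inversion: delta_j = log(s_j) - log(s_0).
delta = {j: math.log(s) - math.log(outside) for j, s in observed.items()}

# Step 1, the "unit test": with nothing dropped, shares should match the data.
baseline = logit_shares(delta)
max_gap = max(abs(baseline[j] - observed[j]) for j in observed)
assert max_gap < 1e-10, "setup does not reproduce the observed shares"

# Step 2: only then drop the two products and recompute shares.
remaining = {j: d for j, d in delta.items() if j not in {"a", "b"}}
counterfactual = logit_shares(remaining)
```

If step 1 fails, the mismatch comes from the setup (parameters, xi, fixed effects, or row order) rather than from the counterfactual, which makes the problem much easier to locate.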

chrisconlon commented 1 week ago

Are you trying something with or without nesting parameters?

Without nesting I think the formula should be pretty simple assuming I've correctly understood what you're trying to do.
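As a rough illustration of how simple the formula can be, the sketch below assumes plain logit (no nesting and, as an added assumption, no random coefficients), where dropping a set of products scales every remaining share by 1 / (1 - combined removed share). The shares are made up and the function names are hypothetical; with random coefficients the same calculation would have to be done agent by agent and then integrated.

```python
def shares_after_removal(shares, removed):
    """Plain logit shares of the remaining inside goods once `removed` are dropped.

    `shares` maps product id -> observed inside share in one market.
    """
    removed_share = sum(shares[j] for j in removed)
    return {j: s / (1.0 - removed_share) for j, s in shares.items() if j not in removed}


def diversion_after_removal(shares, removed):
    """Fraction of the removed products' combined share gained by each remaining product."""
    removed_share = sum(shares[j] for j in removed)
    new_shares = shares_after_removal(shares, removed)
    return {k: (new_shares[k] - shares[k]) / removed_share for k in new_shares}


# Made-up shares for four inside goods in one market (outside share is 0.4).
observed = {"a": 0.10, "b": 0.20, "c": 0.25, "d": 0.05}
diversion = diversion_after_removal(observed, removed={"a", "b"})

# Whatever does not go to a remaining inside good goes to the outside good.
outside_diversion = 1.0 - sum(diversion.values())
```

In this setting every remaining share rises when products are removed, so each numerator is positive. A negative numerator therefore suggests that the counterfactual shares were not computed on the reduced choice set.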

-Chris

cccccc281 commented 1 week ago

Thank you both for your prompt responses! I've tried what Jeff suggested and computed the counterfactual shares. However, I do not have the supply side information in my dataset, so I couldn't specify my X3 in the simulation. The 'Simulation.replace_endogenous' function requires either the X3 specification in the simulation or information about the cost. Currently, I am assuming the cost is zero. Do you have any suggestions on how to better handle the cost side issue?

Also, I realized it's complicated to specify fixed effects in the simulation part, so I didn't include them. Instead, I only included the FE parameter in my specification of parameter xi. Do you have better ways to handle this?

Thank you so much for your time!

Also, for Chris, thank you for asking. I haven't included nesting parameters, so the formula for the long-run diversion ratio is simple.

jeffgortmaker commented 1 week ago

I recommend setting costs equal to your prices when running Simulation.replace_endogenous. By default, these will be the starting values for prices (see the docs), so if you use iteration=pyblp.Iteration('return'), this will just fix your prices at your observed ones.

If you absorbed FEs during estimation, you can find their values in ProblemResults.xi_fe. You can add this to ProblemResults.xi when specifying xi in your Simulation. See supplementary exercise 4 here for an example.
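One practical detail when combining the two vectors is row alignment: the summed values have to be subset to the remaining products in the same order as the rows of the counterfactual data. The sketch below uses plain lists with made-up numbers standing in for ProblemResults.xi and ProblemResults.xi_fe; the helper name and inputs are hypothetical and not part of pyblp.

```python
def structural_error_for_simulation(product_ids, xi, xi_fe, removed):
    """Add absorbed fixed effects back into xi, then keep only the remaining products.

    All inputs are parallel sequences in the same row order as the estimation data.
    """
    if not (len(product_ids) == len(xi) == len(xi_fe)):
        raise ValueError("product_ids, xi, and xi_fe must have the same length")
    return [x + fe for pid, x, fe in zip(product_ids, xi, xi_fe) if pid not in removed]


# Made-up values standing in for the estimated arrays.
product_ids = ["a", "b", "c", "d"]
xi = [0.3, -0.1, 0.2, 0.0]
xi_fe = [1.0, 1.0, -0.5, -0.5]

# One value per remaining product, in the original row order.
xi_sim = structural_error_for_simulation(product_ids, xi, xi_fe, removed={"a", "b"})
```

The resulting vector has one entry per remaining product and would be the value supplied as xi, provided the counterfactual data keeps the same row order after the two products are dropped.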