jeffgortmaker / pyblp

BLP Demand Estimation with Python
https://pyblp.readthedocs.io
MIT License

Fail to read the pickle file in which the ProblemResults are saved. Which is the best way to save ProblemResults to do simulation #134

Closed QWangPKU closed 1 year ago

QWangPKU commented 1 year ago

Hi,

I ran into an issue when trying to save ProblemResults and read them back later to run a simulation or compute elasticities.

To avoid repeated estimation and save time, I saved the estimation results in a pickle file with "result.to_pickle()". Three months ago, I was able to read the file back and use it to compute elasticities or run simulations. The code I used is as follows:

import pickle

myfile = workingdata + 'week_nl_results2.pickle'
# load the previously pickled ProblemResults from that file
with open(myfile, 'rb') as file:
    results = pickle.load(file)

But when I used the same code to read it a few days ago, it didn't work and gave the error message below.

[image: error message]

I couldn't solve this. According to Google, the error implies that some underlying functions or signatures changed, so the same code no longer works. Weirdly, there have been no updates in the past three months, right?

I tried to work around it by saving the ProblemResults as a dictionary with 'result.to_dict()'. That works fine if you just need to retrieve the estimated parameters or simple statistics. But computing elasticities or running simulations can't be done with the dictionary format.

[image]

I guess my question is: what method do you suggest for saving ProblemResults so that I can read them back later to run a simulation or compute elasticities? to_pickle used to work but no longer does, and to_dict cannot handle complicated post-estimation tasks.

I would highly appreciate any guidance you can provide. Thank you!

jeffgortmaker commented 1 year ago

I released version 1.0.0 last month! See https://github.com/jeffgortmaker/pyblp/releases. Maybe try downgrading to whichever version was current when you saved the file?

If you haven't changed your PyBLP version recently, the issue may also be due to any other packages (e.g. PyBLP's dependencies and their dependencies, etc.) that you may have changed since then. It's hard to tell from your error message. If you're using Anaconda, which I recommend, conda list --revisions could be helpful.
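One way to make this kind of mismatch easier to diagnose later is to record the installed package versions at the moment the results are saved. The following is only a minimal sketch using the standard library; the helper name installed_versions and the default package list are illustrative assumptions, not part of PyBLP.

```python
from importlib import metadata


def installed_versions(packages=("pyblp", "numpy", "scipy", "pandas")):
    """Map each package name to its installed version (None if not installed).

    The default package list is an illustrative assumption; adjust as needed.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Storing the returned dictionary next to the saved results (for example in the same pickle or in a small text file) shows which versions to reinstall if the file later fails to load.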

But yes, generally speaking to_dict methods will be most robust to version changes. You can always re-initialize a Problem with the estimated parameter values from your saved dict, and get its ProblemResults by using optimization=Optimization('return') in Problem.solve.
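The re-initialization route described above might look roughly like the sketch below. It assumes that `saved` is the dictionary returned by to_dict() (loaded back from wherever it was stored) and that it contains the keys "sigma", "pi", "rho", and "W"; it also assumes that `problem` is a Problem rebuilt from the same formulations and data as the original estimation. The helper names starting_values and resolve are hypothetical.

```python
def starting_values(saved, names=("sigma", "pi", "rho")):
    """Collect nonlinear estimates from a saved to_dict() result, skipping absent or empty ones."""
    values = {}
    for name in names:
        value = saved.get(name)
        # Arrays with size 0 (parameters the model does not include) are left out.
        if value is not None and getattr(value, "size", 1) > 0:
            values[name] = value
    return values


def resolve(problem, saved):
    """Rebuild ProblemResults by evaluating the problem at the saved estimates."""
    import pyblp  # imported lazily so starting_values() works without PyBLP installed

    extra = {}
    if saved.get("W") is not None:
        # Assumption: reusing the saved weighting matrix keeps the
        # linear estimates close to the original ones.
        extra["W"] = saved["W"]
    return problem.solve(
        **starting_values(saved),
        **extra,
        method="1s",  # a single GMM step
        optimization=pyblp.Optimization("return"),  # treat starting values as the optimum
    )
```

The object returned by resolve(problem, saved) is a regular ProblemResults, so post-estimation methods such as compute_elasticities should be available on it again. It is worth checking that the re-solved estimates match the saved ones before relying on them.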

QWangPKU commented 1 year ago

Thank you, Jeff! You are a big help.