Closed: kevindanray closed this 3 years ago
To be clear -- you are estimating the cereal data with:
a. a fully dense 4 x 4 matrix of random coefficients (10 parameters in the lower-triangular Cholesky root).
b. no demographic variation or demographic interaction coefficients.
c. the same packaged instruments as in the original Nevo simulated cereal example.
My initial responses are:
-Chris
On Fri, Feb 12, 2021 at 2:29 PM kevindanray notifications@github.com wrote:
I have been trying to use the quadrature rules with both my own project data and the cereal pseudo-data (no agent data), and there are always problems.
With my project data, it basically crashes after one iteration. This may be explained by the issues observed with the cereal pseudo-data, as the observable performance metrics differ by many orders of magnitude between quadrature and the QMC methods.
For example, passing `sigma = np.ones((4, 4))` for the cereal data problem gives a first iteration as follows:

- MC: 2980 fixed point iterations, 9071 contraction evaluations, 6.5e03 objective value
- MLHS: 2955 fixed point iterations, 8986 contraction evaluations, 6.6e03 objective value
- Quad: 136437 fixed point iterations, 409201 contraction evaluations, 8.3e40 objective value
I have tested quadrature "sizes" ranging from 3 to 7, and the problem persists.
Sometimes the quadrature method reports failure to converge, but other times it reports convergence to absurd objective values. For the cereal data, MC, MLHS, and Halton converge to approximately 1.7e02, while quadrature "converges" to 6.0e40 despite the fixed point failing to converge and divide-by-zero errors at each of the 4 iterations.
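For concreteness, here is a minimal sketch of how one might set up this comparison with pyblp's packaged Nevo data. The formulations follow the standard pyblp tutorial; the sizes and seed are illustrative, not necessarily the exact settings used above:

```python
import numpy as np
import pandas as pd
import pyblp

# Packaged Nevo cereal data and the standard tutorial formulations.
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_formulations = (
    pyblp.Formulation('0 + prices', absorb='C(product_ids)'),  # linear characteristics
    pyblp.Formulation('1 + prices + sugar + mushy'),           # random coefficients
)

# Compare simulation-based integration against sparse-grid quadrature.
for label, integration in [
    ('MC', pyblp.Integration('monte_carlo', size=50, specification_options={'seed': 0})),
    ('MLHS', pyblp.Integration('mlhs', size=50, specification_options={'seed': 0})),
    ('Quad', pyblp.Integration('grid', size=7)),
]:
    problem = pyblp.Problem(product_formulations, product_data, integration=integration)
    results = problem.solve(
        sigma=np.ones((4, 4)),  # dense starting values: full lower triangle estimated
        optimization=pyblp.Optimization('l-bfgs-b'),
    )
    print(label, results.objective)
```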
a, b, and c are all correct. I have confirmed that the method DOES work when a diagonal Cholesky root is used. I am using the L-BFGS-B method with the default bounds, so it could perhaps be improved by tighter bounds.
But to reiterate: when I try it on my proprietary data set, it takes about 10 minutes before the first iteration fails (delta fails to converge), which seems to crash the entire operation. It may just be that the grid method and the triangular Cholesky root are incompatible.
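For what it's worth, the two starting configurations being compared here, plus tighter bounds, can be written out like this (a sketch only; the bound values are illustrative, and `problem` is the Problem built as in the earlier snippet):

```python
import numpy as np
import pyblp

# Diagonal Cholesky root: only 4 variance parameters (zeros in sigma stay fixed at zero).
sigma_diagonal = np.eye(4)

# Dense lower-triangular root: all 10 parameters, the configuration that fails under SGI.
sigma_dense = np.tril(np.ones((4, 4)))

# Tighter bounds than the L-BFGS-B defaults, passed element-wise for sigma.
sigma_bounds = (-5 * np.ones((4, 4)), 5 * np.ones((4, 4)))

results = problem.solve(sigma=sigma_dense, sigma_bounds=sigma_bounds,
                        optimization=pyblp.Optimization('l-bfgs-b'))
```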
Divide-by-zero errors in particular make me think that Chris's point 2 might be going on. I'll see if I can reproduce your problem on the cereal data.
Ah, I think you're having issues with sparse grid integration (SGI), not all quadrature? E.g. simple product rules work fine with the cereal data, but I can replicate your problem with SGI (our `grid` option).
Looks like, at least for the cereal example that I can replicate, Chris is right about numerical issues. Specifically, the problem is that SGI often uses negative integration weights. This means that for "bad" parameter guesses (e.g., the initial ones here), we can get negative simulated market shares. This doesn't play nice with the BLP contraction, which requires log(shares), so our "solution" is to just clip shares from below, by default at 1e-300. I've found that in practice this can work well, both in some SGI problems I've seen and more generally for dealing with underflow.
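One can check the negative-weight claim directly with pyblp's `build_integration` helper (a quick sketch; the size and dimensions here are illustrative):

```python
import pyblp

# Sparse grid (SGI): some weights can be negative.
sgi = pyblp.build_integration(pyblp.Integration('grid', size=7), 4)
print((sgi.weights < 0).sum(), 'of', sgi.weights.size, 'SGI weights are negative')

# Product rule: weights are all positive.
product = pyblp.build_integration(pyblp.Integration('product', size=7), 4)
print((product.weights < 0).sum(), 'of', product.weights.size, 'product-rule weights are negative')
```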
But it seems like your data/model (and the possibly non-identified cereal model with a dense sigma and few instruments) are counterexamples. The reason the contraction doesn't converge is that once we clip market shares from below, the update presumably ceases to be a contraction, and it may not converge (or may converge slowly -- who knows). Of course at the solution (the mean utility that equates simulated and observed market shares), there is no clipping, because observed market shares are all positive. But the problem is getting to that solution in the first place.
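To make the clipping point concrete, here is a stylized version of the inner loop (not pyblp's actual implementation; `simulate_shares` is a hypothetical stand-in for the simulated share function):

```python
import numpy as np

def solve_delta(delta, log_observed_shares, simulate_shares, max_iterations=1000, tol=1e-14):
    """Stylized BLP contraction with the share clipping described above."""
    for _ in range(max_iterations):
        simulated = simulate_shares(delta)
        # With negative integration weights, simulated shares can be negative.
        # Clipping keeps log() defined, but the clipped update need not be a
        # contraction, so it may converge slowly or not at all.
        clipped = np.clip(simulated, 1e-300, None)
        new_delta = delta + log_observed_shares - np.log(clipped)
        if np.max(np.abs(new_delta - delta)) < tol:
            return new_delta
        delta = new_delta
    return delta  # possibly unconverged
```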
You have a few options, if you want to use quadrature (see the sketch below):

- different `agent_data`,
- different `fp_type` options (`'nonlinear'` doesn't require log(shares), so it may be more robust to negative weights),
- different `Iteration` methods, or
- different `delta`/other parameter starting values.

If you can get a configuration that makes the non-contraction converge, then all should be good. Hope this helps.

On our end, we should probably make error messages a bit more helpful here, since all you see is long-running code and a few division-by-zero messages. (And hopefully a nonzero "Clipped Shares" value in your optimization output, but that's admittedly nebulous without this explanation.)
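For concreteness, those options map onto `Problem.solve` arguments roughly like this (a sketch; the fixed-point method and tolerance are illustrative choices, and `problem` is as built earlier):

```python
import numpy as np
import pyblp

results = problem.solve(
    sigma=np.ones((4, 4)),
    fp_type='nonlinear',  # iterates on exp(delta), so no log of clipped shares
    iteration=pyblp.Iteration('squarem', {'atol': 1e-14}),  # alternative fixed-point method
    # delta=my_starting_values,  # hypothetical: custom starting mean utilities
    optimization=pyblp.Optimization('l-bfgs-b'),
)
```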
I'm going to close this for now, but feel free to re-open / keep commenting.