JMSLab / xtevent

Stata package -xtevent-
MIT License
43 stars 12 forks source link

Check smoothest line solution #136

Closed jorpppp closed 1 year ago

jorpppp commented 1 year ago

Per discussion by @SimonFreyaldenhoven and @santiagohermo:

There was a small bug at the end of the optimization. After substituting in the constraints, we have the following:

image

Let's ignore the sign indeterminacy for now and assume we always have a plus in (22). We want to minimize the absolute value of v_2, so in practice we actually minimize f(v_b)=v^2_2. The solution to that optimization yields (v_b^ and f(v_b^)) as argmin and min respectively.

Before we then set v_2^ = sqrt(f(v_b^)). But note that this is not unique. There is another sign indeterminacy here since the square root can be both negative and positive. We always chose the positive one, so I guess we had a 50% chance of picking the correct sign. For example, in the case of a quadratic polynomial, by always using the positive coefficient on v_2 we forced the curvature to be positive. In the examples in the markdown that happened to be the wrong direction, which is why we never found a solution.

Anyways, the fix was simple. After we obtain the solution (v_b^ and f(v_b^ )) as argmin and min respectively, we then just compute v_2^ by plugging v_b^ into (22).

I'll check the Stata implementation, and, if the indeterminacy issue is present, I'll fix it using the steps above.

jorpppp commented 1 year ago

@SimonFreyaldenhoven @santiagohermo I think that Stata was already doing this correctly. See the corresponding lines in xteventplot.ado:

https://github.com/JMSLab/xtevent/blob/39031a9d96add40b49a57fb08f0fc499abbb70da/xtevent/xteventplot.ado#L820-L835

@santiagohermo produced a helpful markdown checking the smoothest line functionality for many cases. Let me produce a similar markdown here to see how Stata is behaving.

jorpppp commented 1 year ago

@santiagohermo @SimonFreyaldenhoven I ran some tests and I think Stata's performance for finding the smoothest line is good. At least, I am much less concerned now compared to when I started the issue.

In the same setting as the one @santiagohermo tested, Stata found a "success" (e.g. optimal Wald is in the boundary) in all the cases, and failed to converge only in one of them. That said, I had to change the default optimizer to get this performance. I am now using Newton-Rhapson - BFGS (I was using DFP before).

I think Stata was giving too many warning messages. It was giving a convergence/flat region warning when choosing the "wrong" root, which I think shouldn't be the case. It converges in 71/72 cases when the "right" root is chosen. I changed that behavior in https://github.com/JMSLab/xtevent/commit/5fc7be42e6b814d90f8f6a890ed843c40432cb42

Here is a log file with the results. I'll upload the markdown with the plots later, I don't have access to Git LFS right now.

SimonFreyaldenhoven commented 1 year ago

@jorpppp Thanks for looking into it, looks like the optimization works well!

jorpppp commented 1 year ago

@santiagohermo @SimonFreyaldenhoven I uploaded the data and the markdown. Everything looks good as far as I can tell. The only problematic case is pre=4 post=7 for the jump outcome, where the solver did not converge. Even then the path doesn't look too bad, but it is different from the one in EventStudyR.

If I remember correctly R is using Nelder-Mead to optimize. I played with it yesterday but couldn't get it quite right in Stata. I propose we leave implementantion of Nelder-Mead as a future enhancement.

Let me know if you have more comments before we wrap this up.

SimonFreyaldenhoven commented 1 year ago

@jorpppp thanks, looks great.

Just one small question: When the optimization fails - what do we do right now? (e.g. print the path anyways, do not include path and print a message, something else?)

jorpppp commented 1 year ago

@SimonFreyaldenhoven We print the path anyway and display a warning message. Here's how it looks like:

. xtevent y_jump_m x_r, window(-4 7) policy(z)
(Output deleted)
. xteventplot, smpath(line)
Note: Smoothest line drawn for system confidence level = 95
Warning: Smoothest path optimization returned an error code. Results for the smoothest path are approximate. Try changing the optimizatio
> n options
Error code = 4. See mf_optimize##r_error to see what that means.
Error code = 4. See mf_optimize##r_error to see what that means.
SimonFreyaldenhoven commented 1 year ago

@jorpppp If the optimization results in an error I have a slight preference for not returning the path and stating we can't find one. Without knowing more, it seems more likely than not that the path that we currently print does not correspond to the object we want. That is unless we're somehow confident that in those cases we're "very close" to the optimal value and it's just a numerical issue to get it exactly right.

jorpppp commented 1 year ago

@SimonFreyaldenhoven Got it, makes sense. I'll change the error messages and the behavior when there are optimization errors.

jorpppp commented 1 year ago

@SimonFreyaldenhoven In https://github.com/JMSLab/xtevent/commit/b4e1d5c6149ef93107ad48a1cbbc13ad847cf4e3 I changed the behavior so the smoothest path is not displayed if the optimization fails. The error message looks like this now:

Results for y_jump_m, pre = 4, post = 7

Note: Smoothest line drawn for system confidence level = 95%

The optimization to calculate the smoothest path returned an error code.
Smoothest path won't be displayed.
Try changing the optimization options.

Error code = 4. See mf_optimize##r_error to see what that means.
Error code = 4. See mf_optimize##r_error to see what that means.

I think this is almost done so I'll start a PR.

jorpppp commented 1 year ago

Summary: In this issue we checked the smoothest path optimization in Stata. We changed the error messages and modified the command so the smoothest path is not displayed if optimization fails.

Thread continues in #142

SimonFreyaldenhoven commented 1 year ago

@jorpppp thanks! looks great!